Op Ed: The Many Faces of Sharding for Blockchain Scalability
Any programmer who has ever sat down to build a DApp at one point has had to think about the limits of current public blockchains, the most important and obvious one being their limited throughput, i.e., the number of transactions processed per second. In order to run a DApp that can handle real-world throughput requirements, blockchains must become scalable.
One answer to blockchain scaling is sharding. Sharding promises to increase the throughput by changing the way blocks get validated by the network. The key feature of sharding that makes it unique among all (on-chain) scaling solutions is horizontal scaling, i.e., the throughput increases as the mining network expands. This particular characteristic of sharding may make it the ideal fuel to spur rapid adoption of blockchain technology.
This article will briefly discuss the scaling issues with existing blockchain platforms — briefly only, because most readers must already be familiar with it. It will then discuss how sharding and its different forms can be a promising solution to the scaling problem. It will also touch upon some of the theoretical and practical challenges to implementing sharding and how some of these challenges can be overcome.
Scalability Issues With Existing Blockchains
One of the biggest problems that public blockchain platforms face today is scalability. All popular platforms are struggling to handle a larger number of transactions per second. In fact, today the public Ethereum and Bitcoin networks can handle 7-10 transactions per second on average. These figures are far inferior to those of centralized payment processors like Visa, which processes roughly 8,000 transactions per second on average.
Slow transaction processing creates a major problem because they choke up the networks, making it difficult to use the blockchain for applications such as real-time payments. The longer a payment takes to be processed, the more inconvenient it becomes for the end user; this is one of the main reasons why payment methods like PayPal and credit cards like Visa are still much more attractive. As more complex DApps start to rely on the same network, the problems caused by slower transaction speed will only compound.
From a more technical standpoint, all blockchain consensus protocols have a challenging limitation: Every fully participating node in the network must validate every transaction and must seek agreement from other nodes on it, and this is the component of blockchain technology that creates distributed ledgers and makes it secure.
In most chains like Bitcoin and Ethereum, nodes are run by the public. While the decentralized consensus mechanism provides some vital advantages such as fault tolerance, security, political neutrality and authenticity, this method to verify chains comes at the cost of scalability. It will take more and more processing power to verify these public blockchains as they get larger, and this may create bottlenecks in these networks and slow down the creation of new applications.
Sharding: Divide and Conquer
Sharding is a scaling technique that was inspired by a traditional concept of database sharding, whereby a database is partitioned into several pieces and placed on different servers. In the context of a public blockchain, the transaction load on the network would be divided into different shards comprising different nodes on the network. As a consequence, each node would process only a fraction of incoming transactions, and it would do so in parallel with other nodes on the network. Breaking the network into shards would result in more transactions being processed and verified simultaneously. As a result, it becomes possible to process more and more transactions as the network grows. This property is also referred to as horizontal scaling.
We could imagine that existing blockchains operate like a busy highway with one toll station operating on only one toll booth. The result would be a traffic jam as people wait in long lines to pass the toll station. Implementing a sharding-based blockchain is like adding 15 or 20 toll booths to the highway. It would dramatically improve the rate at which traffic can progress through the stations. Sharding would make a tremendous amount of difference and dramatically improve transaction speed.
The implementation of sharding-based blockchains could have various benefits for public blockchains. First, thousands of transactions or even more could be processed every single second, changing the way people feel about the efficiency of cryptocurrencies as payment methods. Improving transaction throughput will bring more and more users and applications to decentralized systems, and this will, in turn, advocate further adoption of blockchains, making mining more profitable and attract more nodes to public networks, creating a virtuous cycle.
Furthermore, sharding could help bring down transaction fees since less processing will be needed to validate a single transaction; nodes can charge smaller fees and still be profitable to run. Coupling low fees with high transaction processing capability, public chains will become increasingly attractive to real-world use cases. The more these positive trends continue, the more mainstream adoption we’ll see of cryptocurrencies and blockchain applications in general.
This is the basic concept, but there are more granular ways to implement sharding strategies like network and transaction sharding, and state sharding. With network and transaction sharding, the network of blockchain nodes is split into different shards, with each shard formed to process and reach consensus on a different subset of transactions. This way, unconnected subsets of transactions can be processed in parallel, significantly boosting the transaction throughput by orders of magnitude.
On the other hand, on today’s mainstream public blockchains, the burden of storing transactions, smart contracts and various states is borne by all public nodes, which could make it prohibitively expensive in terms of required storage space to maintain ongoing operations on the blockchain.
One potential approach, called state sharding, has been proposed to resolve this issue. The crux is to divide the entire storage into pieces and let different shards store different parts; thus every node is only responsible for hosting its own shard’s data instead of the complete blockchain state.
Complexities Underlying Sharding
While all the different forms of sharding may be very intuitive, unspooling the technical details can reveal the complexity of the approaches and the underlying challenges. Some of these challenges are easy to overcome, while others not quite so. Generally speaking, network and transaction sharding are easier to accomplish while state sharding is much more complicated. Below, for the different sharding mechanisms, we categorically discuss some of these challenges and how feasible are they to be overcome.
The first and foremost challenge in sharding is the creation of shards. A mechanism will need to be developed to determine which nodes reside in which shard in a secure way in order to avoid possible attacks from someone who gains a lot of control over a particular shard.
The best approach to beat an adversary (at least in most of the cases) is through randomness. By leveraging randomness, it should become possible for the network to randomly sample nodes to form a shard. Random sampling prevents malicious nodes from overpopulating a single shard.
But, where should the randomness come from? The most readily available source of public randomness is in blocks, for instance, the Merkle tree root of transactions. The randomness available in blocks is publicly verifiable and (close to) uniform random bits can be extracted from it through randomness extractors.
However, simply having a randomized mechanism to assign nodes to a shard is not sufficient. One must also ensure that the network agrees on the members in a shard. This can be achieved through a consensus protocol like proof of work, for example.
Transaction sharding isn’t as simple as it may sound. Consider introducing transaction sharding in a Bitcoin-like system (without smart contracts), where the state of the system is defined using UTXOs. Let us suppose that the network is already composed of shards and a user sends out a transaction. The transaction has two inputs and one output. Now, how should this transaction be assigned to a shard?
The most intuitive approach would be to decide on the shard based on the last few bits of the transaction hash. For instance, if the last bit of the hash is 0, then the transaction is assigned to the first shard, else it is assigned to the second shard (assuming we have only two shards). This allows the transaction to be validated within a single shard. However, if the user is malicious, he may create another transaction with the same two inputs but a different output — yes, a double spend. The second transaction will have a different hash and, hence, the two transactions may end up in different shards. Each shard will then separately validate the received transaction while being oblivious of the double-spend transaction being validated in the other shard.
In order to prevent the double spend, the shards will have to communicate with each other while the validation is in progress. In fact, since the double-spend transaction may land in any shard, a given shard receiving a transaction will have to communicate with every other shard. The communication overhead may, in fact, defeat the entire purpose of transaction sharding.
On the other hand, the problem is much simpler to solve when we have an account-based system (without smart contracts). Each transaction then will have a sender’s address and can then be assigned to a shard based on the sender’s address. This ensures that two double-spend transactions will get validated in the same shard and hence can be easily detected without any cross-shard communication.
With the promises of state sharding come a new set of challenges. As a matter of fact, state sharding is the most challenging of all sharding proposals so far.
Continuing with our account-based model (let us not bring in smart contracts for the moment), in a state-sharded blockchain, a specific shard will only maintain a portion of the state. For instance, if we have two shards and only two user accounts, say for Alice and Bob, respectively, then each shard will keep the balance of one single user.
Imagine that Alice creates a transaction to pay Bob. The transaction will be handled by the first shard. Once the transaction is validated, the information about Bob’s new balance must be shared with his shard. If two popular accounts are handled by different shards, then this may entail frequent cross-shard communication and state exchange. Ensuring that cross-shard communication will not outweigh the performance gains from state sharding is still an open research problem.
One possible way to reduce the cross-shard communication overhead is to restrict users from making cross-shard transactions. With our example, this would mean that Alice would not be allowed to transact directly with Bob. If ever Alice has to transact with Bob, she will have to hold an account in that shard. While this does eliminate any cross-shard communication, it may limit the usability of the platform somewhat.
The second challenge with state sharding is data availability. Consider a scenario where, for some reason, a given shard is attacked and goes offline. Since the state of the system is not replicated across all shards, the network can no longer validate transactions that have dependency on the offline shard. As a result, the blockchain may become largely unusable. A solution to this problem is to maintain archival or backup nodes that can help the network troubleshoot and recover from data unavailability. However, those nodes will then have to store the entire state of the system and hence may introduce centralization risks.
Another point to consider in any sharding mechanism (certainly not specific to state sharding) is to ensure that shards are not static for resilience against attacks and failures; the network must accept new nodes and assign them in a random manner to different shards. In other words, the network must get reshuffled once in a while.
However, reshuffling in the case of state sharding is tricky. Since each shard only maintains a portion of the state, reshuffling the network in one go may render the entire system unavailable until some synchronization is completed. To prevent outage, the network must be reshuffled gradually to ensure that every shard has enough old nodes before a node is evicted.
Similarly, once a new node joins a shard, one has to ensure that the node is given ample time to sync with the state of the shard; otherwise the incoming node will reject outright every single transaction.
In conclusion, sharding is definitely an exciting and promising direction for blockchains to pursue in order to solve scalability problems without compromising decentralization and transparency. However, there is no doubt that sharding, particularly state sharding, is notoriously difficult to do right both at the design level and at the implementation level.
Sharding should be handled with care. Also, more research needs to be done to establish the viability of state sharding as it may not be the silver bullet to storage problems. Researchers and developers are actively seeking alternate solutions at this moment. And perhaps, the answer is just right around the corner.
This is a guest post by Dr. Yaoqi Jia, head of technology at Zilliqa. Views expressed are his own and do not necessarily reflect those of BTC Media or Bitcoin Magazine.