Ensuring Network Scalability: How to Fight Blockchain Bloat
With more and more users turning to Bitcoin and Chief Scientist Gavin Andresen having proposed a hard fork of the blockchain, the issue of network scalability has once again risen to the surface. The problem is called “blockchain bloat”: when more transactions are made, the blockchain has more data to record, and if it grows too large, it becomes difficult to download or store. As a result, blocks are currently limited in size, which limits the maximum number of transactions we can make to about 7 per second, far less than the volume handled by Visa or MasterCard.
This has become a major point of criticism against Bitcoin, especially with the arrival of “Bitcoin 2.0” applications unrelated to cryptocurrency that want to use the blockchain. Despite all the FUD, however, there are a number of solutions in sight that make it a trivial concern. In this article, I intend to put the issue to rest.
What Gavin Andresen has basically proposed is that we increase the block size, and thus the maximum number of transactions per second. You might interject that this is impossible due to the aforementioned technical difficulties, but the plan is to do this gradually by 50% per year. Computer and Internet technology grows at an exponential rate, so scientists and engineers should be able to keep pace.
Although the blockchain is now over 20 GB in size and growing at an increasing rate, the block size is capped at 1 MB and a block is produced approximately every ten minutes. With 525,600 minutes in a year (discounting leap years), that is at most 525,600 / 10 = 52,560 blocks, so the blockchain can grow by at most 52,560 MB or ~ 52.5 GB per year. If blocks increase in size by 50% per year, then after ten such increases the blockchain could be growing by up to 52.5 * 1.5^10 ~ 3 TB per year, having added roughly 6 TB in total over the preceding ten years.
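The growth arithmetic can be checked with a short sketch. It assumes the figures used in the text (a 1 MB cap, one block every ten minutes, a 365-day year) and that the 50% increase is applied once per year; the function names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600, ignoring leap years
BLOCK_INTERVAL_MIN = 10            # one block roughly every ten minutes
BLOCK_SIZE_MB = 1.0                # current block size cap
GROWTH = 1.5                       # proposed 50% increase per year


def max_growth_mb(years_from_now: int) -> float:
    """Upper bound on blockchain growth (MB) during the year that starts
    after `years_from_now` yearly increases of the cap."""
    blocks_per_year = MINUTES_PER_YEAR / BLOCK_INTERVAL_MIN
    return blocks_per_year * BLOCK_SIZE_MB * GROWTH ** years_from_now


def cumulative_growth_mb(years: int) -> float:
    """Upper bound on total growth (MB) over the first `years` years."""
    return sum(max_growth_mb(y) for y in range(years))


if __name__ == "__main__":
    print(f"this year:          {max_growth_mb(0) / 1e3:.1f} GB")
    print(f"after 10 increases: {max_growth_mb(10) / 1e6:.2f} TB per year")
    print(f"first 10 years:     {cumulative_growth_mb(10) / 1e6:.2f} TB in total")
```

These are upper bounds: they assume every block is completely full, which is the worst case for storage.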
By comparison, hard disk drives in excess of 1 TB are now relatively abundant, and by the time Bitcoin has enough users and 2.0 applications to reach maximum growth, hard drives measuring in the dozens of terabytes will be commonplace. And with the advent of creative blockchain-based cloud storage applications like Storj, we can expand those limits by allowing everyone to store pieces of the blockchain and access the others when needed through the network.
The question now is how to move all of that data. Hard disk drives have a maximum data transfer speed limited by the rotational speed of the disk, but solid-state drives are based on flash memory or can even use DRAM, which is about as fast as the RAM your desktop uses to power its applications. The last possible bottleneck lies in your Internet connection, which these days is limited both in bandwidth and total data downloaded per month.
Thankfully, as those who’ve read my previous article on How to Decentralize the Internet already know, the ‘net is continuously advancing. Although pesky telecommunications monopolies hinder the growth of our Internet connection speeds, Wi-Fi routers will soon be able to connect directly to each other. The next wireless standard, known as 802.11ac, uses a long range, directed 5 GHz frequency, and can carry around 40 MB/s. Over the 600 seconds between blocks, this translates to 40 * 60 * 10 = 24,000 MB ~ 24 GB every 10 minutes, which is far more than any Bitcoin block on the blockchain.
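To put the 24 GB figure next to the proposed block size schedule, here is a rough sketch. It assumes the 40 MB/s link is fully available for block data, a 1 MB starting block size, and a 50% increase applied once per year; the function names are made up for illustration.

```python
LINK_MB_PER_S = 40           # throughput figure quoted in the text
BLOCK_INTERVAL_S = 10 * 60   # one block roughly every ten minutes


def capacity_per_block_mb(link_mb_per_s: float = LINK_MB_PER_S) -> float:
    """Data (MB) the link can carry during one block interval."""
    return link_mb_per_s * BLOCK_INTERVAL_S


def increases_until_saturated(block_mb: float = 1.0, growth: float = 1.5) -> int:
    """Number of yearly increases before one block exceeds the link capacity."""
    years = 0
    while block_mb * growth ** years <= capacity_per_block_mb():
        years += 1
    return years


if __name__ == "__main__":
    print(capacity_per_block_mb(), "MB per block interval")
    print(increases_until_saturated(), "yearly increases before a block no longer fits")
```

Under these assumptions the link stays ahead of the block size for a couple of decades, even if wireless throughput never improved.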
Emerging networking technology is obviously sufficient to sustain the Bitcoin network, but the Honey Badger doesn’t settle for sufficiency. Bitcoin’s competitors accuse it of wasting energy and other resources, but unlike most banks, it is always growing more efficient. With clever use of cryptography, we can limit the size of the blockchain without losing any of the information it contains.
That’s what Factom does in a nutshell. It takes “blockchain pollution” (a pejorative term used by some in the Bitcoin community to denote non-currency or other 2.0 data inserted into transactions) and arranges the various files into a column. These entries are then hashed and inserted into entry blocks, where they become the “leaves” of what are called Merkle trees. The leaves are hashed together 2 at a time in stages (like a bracket tournament) until they coalesce at a single hash called the Merkle root.
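The bracket-tournament hashing can be sketched in a few lines. This is a minimal illustration, not Factom's actual code: it assumes SHA-256 as the hash function and, when a stage has an odd number of hashes, repeats the last one so that everything pairs up. Both choices are assumptions made for the example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries: list[bytes]) -> bytes:
    """Hash each entry into a leaf, then pair hashes up stage by stage."""
    if not entries:
        raise ValueError("need at least one entry")
    level = [sha256(e) for e in entries]           # the leaves
    while len(level) > 1:
        if len(level) % 2:                         # odd count: repeat the last hash
            level.append(level[-1])                # (assumed convention)
        level = [sha256(level[i] + level[i + 1])   # hash neighbours two at a time
                 for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    entries = [b"entry A", b"entry B", b"entry C"]
    print(merkle_root(entries).hex())
```

However many entries go in, the root is always a single 32-byte hash, and changing any entry changes the root.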
The data for every decentralized application using the Factom protocol is grouped into its own entry block, which produces its own Merkle root, and all of these are grouped together into what’s called a Factom block. The Merkle root of each decentralized application becomes a leaf in a new tree, and these leaves are again hashed together in stages to produce a final Merkle root. Real applications need to move faster than Bitcoin’s 10-minute block time can allow, so these roots build up as they are calculated and are then inserted into a Bitcoin block all at once.
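The two-level structure described here can be sketched as a tree of trees. The names `entry_block_root` and `factom_block_root` are hypothetical, and SHA-256, the odd-count padding rule, and ordering applications by name are all assumptions made for the example rather than details of the Factom protocol.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Pair up leaf hashes stage by stage until one hash remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:             # odd count: repeat the last hash (assumed)
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def entry_block_root(entries: list[bytes]) -> bytes:
    """Merkle root over one application's hashed entries."""
    return merkle_root([sha256(e) for e in entries])


def factom_block_root(apps: dict[str, list[bytes]]) -> bytes:
    """Each application's root becomes a leaf of a second tree."""
    app_roots = [entry_block_root(apps[name])
                 for name in sorted(apps)]          # ordering is an assumption
    return merkle_root(app_roots)


if __name__ == "__main__":
    apps = {"app-one": [b"e1", b"e2"], "app-two": [b"e3"]}
    print(factom_block_root(apps).hex())
```

Only the final 32-byte root would need to go into a Bitcoin transaction, regardless of how many applications or entries sit underneath it.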
Each Factom block provides a meta-snapshot of all decentralized applications and organizations in time, capable of verifying who owned or owed what, and any other information people might have motivation to tamper with or disagree over. If we acquire and arrange all of the entry blocks corresponding to one application into what is called a Factom chain, we can step backwards in time through the hashed data of that application at intervals of approximately 1 minute. If someone forged a fake entry block to fool you out of your smart property, it would fail to hash to the correct Factom root recorded by Bitcoin miners, which could be mathematically proven with ease.
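The check against the recorded root can be illustrated with a standard Merkle proof: the verifier needs only the item in question, the sibling hashes on its path, and the root. This is a generic sketch assuming SHA-256 and last-hash padding for odd stages; the function names are illustrative and are not taken from Factom.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """All stages of the tree, from the leaves up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = list(levels[-1])
        if len(cur) % 2:                   # odd count: repeat the last hash (assumed)
            cur.append(cur[-1])
        levels.append([sha256(cur[i] + cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each flagged if it sits on the left."""
    proof = []
    for level in build_levels(leaves)[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1                # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root and compare with the recorded one."""
    h = leaf
    for sibling, sibling_is_left in proof:
        h = sha256(sibling + h) if sibling_is_left else sha256(h + sibling)
    return h == root


if __name__ == "__main__":
    leaves = [sha256(b) for b in (b"block 1", b"block 2", b"block 3")]
    root = build_levels(leaves)[-1][0]
    proof = merkle_proof(leaves, 2)
    print(verify(leaves[2], proof, root))             # genuine entry block
    print(verify(sha256(b"forged"), proof, root))     # forged entry block
```

A forged item fails because no one can feasibly find different data whose hash path leads to the same recorded root.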
So, who creates and stores these entry blocks, and how can we access them? Although the Bitcoin blockchain can be used to verify the validity of smart contracts and the owners of smart property, the process of compressing application data to eliminate bloat causes us to lose specific information about the users and apps: the who and the what.
As one should hope, the solution to that is decentralized. Factom is essentially a P2P network running on a system of federated servers using a BitTorrent-like protocol. These servers all audit one another continuously to make sure they’re following Factom’s rules, and all of them are compensated in cryptocurrency for their services. In the future, they will use networks like Storj and SAFE to communicate via the meshnet.
Once that’s established, anyone with the proper client program on their computer or other device can get and verify the information they need to run a decentralized application. You just need to be connected to the Internet, and as it continues to decentralize, you’ll soon be able to do everything you need without worrying about silly things like “blockchain bloat.”