The Long History and Disputed Desirability of Alternative Bitcoin Implementations
Coinciding with the opening of the Ethereum Foundation’s Devcon2 conference in Shanghai this week, significant parts of the Ethereum network were crashing. In what seemed like an attack using an intentionally difficult-to-execute smart contract, the much-used Ethereum implementation Geth stopped functioning altogether, while another implementation, Parity, was also having issues. Upon discovering the attack, Ethereum developers wrote a hotfix for Geth within a couple of hours that seemed to solve the problem, at least temporarily.*
Interestingly, while the fix was being prepared, alternative Ethereum implementations (other than Geth and Parity) were doing fine. Save for a drop in hash rate due to miners running Geth, the Ethereum network kept running smoothly for most users of these different clients. This lack of interruption was taken by some as a vindication of Ethereum’s use of multiple interoperable software implementations, rather than a single client.
The “Reference Implementation”
The question of whether alternative software implementations for Bitcoin are desirable has been discussed for years. These implementations, or clients, are essentially computer programs that connect to, and therefore become part of, the network. The debates surrounding their role date from the early days of Bitcoin’s history, back when the community mostly consisted of tinkering techies.
The first Bitcoin implementation was of course Satoshi Nakamoto’s version of Bitcoin, written in the programming language C++. This client later became known as Bitcoin-Qt, and now Bitcoin Core; it is sometimes also referred to as Bitcoin’s “reference client” or the “Satoshi client.” For a while, this was the only Bitcoin implementation, although over time Satoshi released updates: slightly different versions of the same client.
Satoshi Nakamoto himself believed it was best to stick to only one Bitcoin implementation. He thought alternative implementations might process data differently, thus posing a significant risk that they would run out of sync with each other. Satoshi warned this would undermine a key property of Bitcoin: the ability of all users to reach consensus over the state of the blockchain ledger.
In 2010, debating Gavin Andresen on Bitcointalk, Satoshi argued,
“I don't believe a second, compatible implementation of Bitcoin will ever be a good idea. So much of the design depends on all nodes getting exactly identical results in lockstep that a second implementation would be a menace to the network.”
Gavin Andresen, who would later succeed Satoshi Nakamoto as lead developer of Bitcoin’s reference client, replied that the desirability of alternative implementations was irrelevant. Andresen believed that these would be inevitable — whether Satoshi liked it or not.
“Good idea or not, SOMEBODY will try to mess up the network (or co-opt it for their own use) sooner or later. They'll either hack the existing code or write their own version, and will be a menace to the network.”
The First Alternatives
Whether or not Satoshi Nakamoto liked alternative Bitcoin implementations indeed proved to be irrelevant. The first alt-clients sprang up shortly after Bitcoin’s inventor disappeared from the community.
This trend started with libbitcoin, first announced in 2011. A project led by Amir Taaki, libbitcoin was designed to provide an alternative to the Satoshi client in order to decentralize control over Bitcoin and increase the network’s robustness when faced with attacks.
Taaki explained his motivation in The Libbitcoin Manifesto, writing:
“A diversified Bitcoin of many wallets and implementations is a strong and pure Bitcoin. To protect the integrity of the network, we need to eliminate single points of failure. An inbred Bitcoin with the same software code everywhere shares the same weaknesses, and is susceptible to the same attacks. A single pathogen can wipe out a genetically homogenous population. And centralized software is vulnerable to the dictates of whoever controls development of that software code, and any dictates pressured onto them.”
Not much later, more alternative implementations saw the light of day. Mike Hearn’s bitcoinj was the first Bitcoin client written in a different coding language (Java), while Jeff Garzik followed with picocoin and Tamás Blummer launched Bits of Proof.
In 2013, Btcd, a Bitcoin implementation written in Google’s programming language Go, was introduced. Btcd’s launch was covered by Bitcoin Magazine in an article by Vitalik Buterin, then a writer for the magazine.
Echoing Taaki’s plea for a more diverse ecosystem, Buterin wrote:
“[T]he deeper into the protocol one goes, the more it becomes a monoculture; but monocultures are dangerous. If there is only one implementation being widely used, then unforeseen bugs appearing (or even disappearing) in upgrades can cause the entire Bitcoin blockchain to essentially fork into two as the two versions of the protocol disagree on which transactions and blocks are valid and which are not.”
By this time, Andresen had taken the reins of Bitcoin Core as its lead developer. Even more so than when debating Satoshi Nakamoto three years prior, Andresen had come to believe that multiple implementations would strengthen the Bitcoin ecosystem.
In a blog post for the Bitcoin Foundation, he wrote:
“Diversity is a good thing. Diverse, interoperating implementations of the Bitcoin protocol make the network more robust against software bugs, denial-of-service attacks and vulnerabilities.”
Since then, an increasing number of Bitcoin implementations have connected to the Bitcoin network, most written in their own programming language.
The increasing number of Bitcoin implementations may seem like a success of the multiple-client approach. However, much of the Bitcoin network is still dominated by the “reference client” — Bitcoin Core and versions thereof. To date, complete re-implementations seem to have gained relatively little traction among users, companies, and especially miners.
In 2015, addressing the libbitcoin development team on the Bitcoin development mailing list, Bitcoin Core developer Peter Todd explained why he believes this to be the case. Simply put, users, companies and miners require software that follows the Bitcoin protocol. And the Bitcoin protocol, Todd argued, is defined by (some of) the code as implemented in the Satoshi client. Any other code may — unintentionally — follow a different protocol, even if that’s not noticeable yet.
“The consensus critical Satoshi-derived sourcecode is a protocol *specification* that happens to also be machine readable and executable,” Todd wrote. “By reimplementing consensus code — rewriting the protocol spec — you drop out of the political process that is Bitcoin development. You're not decentralizing Bitcoin at all — you're contributing to its centralization by not participating, leaving behind a smaller and more centralized development process. Fact is, what you've implemented in libbitcoin just isn't the Bitcoin protocol and isn't going to get adopted by miners nor used by serious merchants and exchanges — the sources of real political power.”
Writing to a different development mailing list, Todd took this logic to mean that even bugs in the Satoshi-derived source code should be considered part of the protocol — meaning any “bug free” alternative software implementation is, in that case, not running the same protocol. For alternative implementations to really run the Bitcoin protocol, they must be “bug-for-bug compatible.”
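The idea of “bug-for-bug compatibility” can be illustrated with a toy sketch. Everything in it is hypothetical: the limit `MAX_TXS`, the block layout, and both validator functions are invented for illustration and are not Bitcoin's actual consensus rules. The sketch only shows how a single boundary quirk in a “reference” validator, once “corrected” in a reimplementation, leaves two nodes with different views of the same chain.

```python
# Toy illustration only: these rules are hypothetical, not Bitcoin's real consensus rules.

MAX_TXS = 3  # hypothetical per-block transaction limit


def reference_is_valid(block):
    """'Reference' validator with a quirk: a block with exactly MAX_TXS is accepted."""
    return len(block["txs"]) <= MAX_TXS


def clean_is_valid(block):
    """Reimplementation written from a prose rule 'fewer than MAX_TXS transactions'."""
    return len(block["txs"]) < MAX_TXS


def accepted_chain(is_valid, blocks):
    """Return the ids of the blocks a node accepts, stopping at the first rejection."""
    chain = []
    for block in blocks:
        if not is_valid(block):
            break  # later blocks build on a block this node rejected
        chain.append(block["id"])
    return chain


blocks = [
    {"id": "a", "txs": ["t1"]},
    {"id": "b", "txs": ["t2", "t3", "t4"]},  # exactly MAX_TXS: the edge case
    {"id": "c", "txs": ["t5"]},
]

reference_view = accepted_chain(reference_is_valid, blocks)
clean_view = accepted_chain(clean_is_valid, blocks)

print("reference node follows:", reference_view)
print("reimplemented node follows:", clean_view)
```

In this sketch the two nodes agree on every ordinary block and part ways only at the edge case, which is why such differences can go unnoticed until a block happens to hit them.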
Instead of fully reimplementing a code base, Todd therefore argued developers should simply fork Bitcoin Core, and tweak that code base to fit their needs. Todd himself did exactly that for his Replace-by-fee fork, while Bitcoin Core developers BtcDrak and Luke Dashjr similarly maintain the Bitcoin Core forks Bitcoin Addrindex and Bitcoin Knots. (And over the past year, a comparable trend emerged as developers wanting to increase Bitcoin’s block size limit launched Bitcoin XT, Bitcoin Classic and Bitcoin Unlimited — though these forks were actually designed to split off to a new protocol under certain conditions.)
Of course, Todd’s argument was itself criticized.
For example, while acknowledging that alternative implementations do risk forking off to their own network, Btcd developer Dave Collins pointed out that the Bitcoin network already consists of many different software versions, including the many different versions of the Satoshi client. Importantly, these different versions of the same client can fork off to different networks just as well, and indeed have done so in the past.
As such, Collins argued, there is no fundamental distinction between different versions of the same client and alternative clients. From the Conformal blog:
“There is currently no way to guarantee that any two versions of Bitcoin software, whether they are two different versions of Bitcoin Core, two different versions of alternative implementations, a version of Bitcoin Core versus a version of an alternative implementation, or even two copies of the same version of Bitcoin Core built with different compiler versions, are in exact consensus agreement. Doing so is incredibly difficult and borders on impossible. The issue is implementation independent.”
Libbitcoin is today led by Eric Voskuil. Unsurprisingly, Voskuil agrees with Collins. And while Voskuil also acknowledges Todd’s position that bugs are part of the consensus encoding of an implementation, he argues this means there should not be one particular implementation to define the Bitcoin protocol.
“All code that impacts consensus is part of consensus,” Voskuil told Bitcoin Magazine. “But when part of this code stops the network or does something not nice, it's called a bug needing a fix, but that fix is a change to consensus. Since bugs are consensus, fixes are forks. As such, a single implementation gives far too much power to its developers. Shutting down the network while some star chamber works out a new consensus is downright authoritarian.”
This week’s failure of Ethereum’s Geth nodes perhaps presents the first clear real-world example of one set of software implementations crashing while alternatives, and therefore the network itself, were able to keep running.
Of course, this diversity within Ethereum’s ecosystem resulted in large part from Ethereum founder Vitalik Buterin’s vision. This is the same Buterin who, as a writer for Bitcoin Magazine, argued in favor of a more diverse Bitcoin ecosystem. From the start, Ethereum launched with several different clients, rather than one specific reference implementation.
Not everyone agrees it was desirable for the Ethereum network to keep running while Geth nodes were crashing, however. Peter Todd, indeed, maintains it would have been better for all nodes on the network to have behaved identically — even if that means they’d have all crashed.
Speaking to Bitcoin Magazine, Todd explained:
“Basically the trade-offs are very simple. Having multiple implementations prioritizes availability of the network, but using the network while the issue was being fixed was a pretty dangerous thing to do. The Parity nodes weren't propagating blocks at normal speeds, increasing orphan rates and making it more likely to have a false confirmation. The lower hashing power than normal made 51 percent attacks more of a risk. The Geth fix could have been botched. And in general, during the event whether or not any of that happened was unknown. Safest is for everything to shut down if something goes wrong, which in this case would have only been a few hours of downtime — not a big deal.”
And Todd believes the situation could have been worse if Geth nodes hadn’t shut down, but had instead confirmed or rejected different transactions and blocks.
“Geth could have easily split off to another chain, in which case the problem would have been much worse. In that case, it's not clear which one is actually the right chain,” he said.
Of course, this is where libbitcoin’s Eric Voskuil disagrees. Speaking to Bitcoin Magazine, Voskuil said he believes that Todd is approaching the problem from the wrong perspective. Rather than a software implementation defining the protocol, Voskuil says that those who actually engage in trade should define it.
“There is no ‘right chain,’ just those that people choose to use,” Voskuil said. “If the One True Implementation defines consensus, and it fails, what is the consensus? The fact that people on the Ethereum network kept using other implementations meant that developers writing the ‘fix’ to Geth couldn’t redefine consensus, but needed to conform to the actual consensus.”
Moving forward, there are several projects in the works that may have the potential to help the Bitcoin ecosystem become even more diversified — perhaps even without risking blockchain-splits. At least, that’s what some believe.
Peter Todd pointed out that formal proofs could be of help in the future. Explaining the concept to Bitcoin Magazine, he said:
“Basically, formal proofs have math to prove that code does what you think it does. Or at least, that code has a certain property. This can be used to verify that different implementations will really follow the same protocol. This is not a huge stretch; formal proofs are already used in Bitcoin to prove that parts of the libsecp256k1 library are correct.”
Another promising project may be libconsensus, a software library derived from the Bitcoin Core code base. An effort by Bitcoin Core developers that started in 2014, libconsensus should enable alternative implementations to easily adopt the code required to remain in consensus with the rest of the network.
Bitcoin Core and Blockstream developer Jorge Timón has been one of the main advocates of and most active contributors to libconsensus. Speaking to Bitcoin Magazine, Timón explained that since Bitcoin Core is the implementation currently used in practice, the notion that “the implementation is the specification” is actually problematic.
“That is unfair to other implementations, in [a] certain sense,” said Timón. “They are warned against reimplementing consensus validation, but no solution is given to them besides ‘run your things behind a Bitcoin Core node.’ So we’re separating enough code from Bitcoin Core to fully verify a block, and nothing else. This can be used by alternative implementations, to work from there.”
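The approach Timón describes, a shared piece of validation code that otherwise independent implementations call into, can be sketched roughly as follows. This is not the libconsensus API: `verify_block`, `block_hash`, and both node classes are hypothetical stand-ins, and the “validation” is a simple hash check chosen only to keep the example short. The point is that the two nodes differ in how they store data but cannot disagree on validity, because both delegate that decision to the same function.

```python
import hashlib

GENESIS = "0" * 64  # hypothetical starting hash


def block_hash(prev_hash, payload):
    """Hash linking a payload to its predecessor (toy rule, not Bitcoin's)."""
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def verify_block(prev_hash, payload, claimed_hash):
    """Hypothetical shared consensus check used by every implementation below."""
    return claimed_hash == block_hash(prev_hash, payload)


class ListNode:
    """One implementation: keeps accepted block hashes in a list."""

    def __init__(self):
        self.chain = []

    def receive(self, prev_hash, payload, claimed_hash):
        ok = verify_block(prev_hash, payload, claimed_hash)
        if ok:
            self.chain.append(claimed_hash)
        return ok


class DictNode:
    """Another implementation: indexes accepted payloads by block hash."""

    def __init__(self):
        self.by_hash = {}

    def receive(self, prev_hash, payload, claimed_hash):
        ok = verify_block(prev_hash, payload, claimed_hash)
        if ok:
            self.by_hash[claimed_hash] = payload
        return ok


good = (GENESIS, "alice pays bob", block_hash(GENESIS, "alice pays bob"))
bad = (GENESIS, "alice pays bob", "f" * 64)  # claimed hash does not match

node_a, node_b = ListNode(), DictNode()
verdicts_a = [node_a.receive(*good), node_a.receive(*bad)]
verdicts_b = [node_b.receive(*good), node_b.receive(*bad)]
print(verdicts_a, verdicts_b)
```

The trade-off is that anything outside the shared function, such as how blocks are stored or relayed, remains implementation-specific and is not covered by the shared check.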
Libbitcoin’s Voskuil, however, remains skeptical that libconsensus is really needed to diversify Bitcoin’s ecosystem.
“Libconsensus is an honest attempt to help create a more diverse community, and libbitcoin supports it as an option,” he told Bitcoin Magazine. “But it will not survive as a long-term solution. It's unnecessary, complicates development, and does not currently cover anything but script validation. If it were to expand to cover everything that might result in a fork, it would be most of the implementation of a node. We could dump our own script code in favor of libconsensus, but as libconsensus expands to include all impacts on consensus, what would we be left with? It's a camel's nose under the tent.”
“In the end, all this is really a moot point. Other implementations exist and are running on the network. This will not stop, it will only increase. The idea that consensus rules cannot be implemented as reliably in multiple implementations [as] across multiple versions of one implementation is not only absurd, it's irrelevant.”
*This story was written before new attacks troubled the Ethereum network on Thursday. That story is still developing.