The Battle For P2SH: The Untold Story Of The First Bitcoin War
“Push the date back two months. OP_EVAL just is not ready yet.”
It was the verdict Gavin Andresen had worked so long to avoid. With a single rebuke sent from Russell O’Connor’s keyboard, a months-long effort to upgrade Bitcoin — the first in the wake of founder Satoshi Nakamoto’s exit — was abruptly stalled ahead of implementation.
As revealed by O’Connor, the proposed command — heralded by Andresen as the “fastest path” to more secure Bitcoin wallets — could be exploited to create transactions that would send the software into an infinite computational loop in an attempt to validate them.
In short, OP_EVAL could be abused to crash Bitcoin nodes, and thus the Bitcoin network.
“It took me all of 70 minutes of looking to find this bug,” O’Connor wrote, condemning a process that had merged — and nearly pushed — bad code into the live software. “You guys need to stop what you are doing and really understand Bitcoin.”
It was the first serious setback for Andresen, the project’s new lead, who was quick to protest. In his view, abandoning OP_EVAL wouldn’t just waste months of coding and review, it would leave users without tools to protect against the trojans and viruses then plundering their digital wallets.
This was at the heart of OP_EVAL’s appeal — easy multisignature wallets would allow users to recover bitcoin, even when backups were lost; services might be built to send bank-like alerts, deterring fraud and theft; and better still, this could all be achieved in transactions that would look and behave like those users knew and understood.
But O’Connor’s words of warning were enough for those who had seen their concerns about the escalating pace of development validated.
“I would like to remind everyone that we are messing with a $20+ million dollar thing,” developer Alan Reiner would write. “There's more than just a piece of software at stake — whatever goes in needs to be as hard as diamond.”
The failure of OP_EVAL would yet have bigger implications. It was true Nakamoto had launched the world’s first decentralized digital currency, but its promise was far from fulfilled. Few in late 2011 understood its code, and fewer still possessed the skill and familiarity to safeguard it.
How should these developers organize? What responsibilities did they have to users? And how would they enact change when it wasn’t clear who — if anyone — should have the final say?
Such questions would soon be thrust to the fore in the first great battle over the Bitcoin software.
An Unorthodox Succession
Free and open-source projects are most often led by founders, who in turn must align efforts with contributors on whom their work depends. Still, where disputes of direction arise, founders are imbued with a natural authority to act as decision-makers for their creations.
Bitcoin, early on, was no exception. For the first two years of its existence, Nakamoto played the role of lead developer and benevolent dictator. As Bitcoin’s undisputed leader, he enacted as many as eight protocol changes without much resembling a wider discourse. That is, until he gradually stepped away from the project.
By the end of 2010, Nakamoto would erase his pseudonym from the Bitcoin.org website, leaving veteran 3-D graphics developer Gavin Andresen to claim the mantle as the project's “de-facto lead.”
Andresen's preferred choice of words was appropriate, as the circumstances surrounding this transition were unusual, amounting to a brief public message, a private passing of duties and the exchange of a key allowing the user to send a system-wide alert message.
Still, at the time, this posed few difficulties for Bitcoin’s small but growing group of coders. Most were concerned about critical fixes, and Andresen, the spouse of a tenured professor, had the time and enthusiasm to lead the work.
Indeed, there were many pressing needs — faster syncing, better testing — but the "increased reports of stolen wallets" and the “bad PR” the thefts caused quickly emerged as a top concern.
For a time, it was a goal on which Bitcoin’s new band of contributors all seemed to agree.
Luckily, the blueprint of a solution had been provided by Nakamoto. As Andresen would learn, Bitcoin’s code already enabled users to create secure transactions that could only be spent when signed with multiple private keys.
With multisignature, or multisig for short, private keys could be stored on multiple devices, on opposite ends of the world, or shared between a user and a wallet service, meaning hackers would have to compromise multiple targets to steal coins.
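As a rough illustration of the m-of-n rule behind multisig, the Python sketch below treats keys as plain labels and skips signature verification entirely. The function name and the key labels are hypothetical, chosen only for this example.

```python
def multisig_can_spend(required: int, authorized_keys: set[str], signing_keys: set[str]) -> bool:
    """Return True when at least `required` of the authorized keys have signed.

    Illustration only: real Bitcoin verifies cryptographic signatures,
    not key labels.
    """
    valid_signers = authorized_keys & signing_keys
    return len(valid_signers) >= required


wallet_keys = {"laptop", "phone", "wallet_service"}

# A thief who compromises only the laptop cannot satisfy a 2-of-3 rule.
print(multisig_can_spend(2, wallet_keys, {"laptop"}))           # False
print(multisig_can_spend(2, wallet_keys, {"laptop", "phone"}))  # True
```

Keys that are not on the authorized list do not count toward the total, which is why a hacker would need to compromise several separate targets.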
Enamored with the idea, Andresen would become its first champion, penning an impassioned plea on the mailing list to inspire contributors to action.
"My biggest worry is we'll say, ‘Sure, it'll only take a couple days to agree on how to do it right,’ and six months from now there is still no consensus,” he wrote. “And people's wallets [will] continue to get lost or stolen.”
The worries weren’t without weight — as implemented by Nakamoto, multisig had significant drawbacks. The most pressing of these was that the transactions were incompatible with Bitcoin’s standard address format, and instead, required much longer addresses.
Because of this, transactions funding multisig wallets were bigger and required higher fees. What’s more, these fees had to be paid not by the person receiving bitcoin with the multisig wallet, but by the person sending bitcoin to them.
Due to these suboptimal properties, multisig transactions were designated as “non-standard” in the software, meaning they wouldn't necessarily propagate to nodes on the network. If a node did receive a multisig transaction, it would simply ignore it. Similarly, there was no guarantee miners would include these transactions in blocks.
If they were included, nodes would accept them (multisig transactions were ultimately valid). But in practice, the designation made it all but impossible to get these transactions confirmed.
To unlock the potential he saw, Andresen would go on to champion a new “op-code,” a type of command that nodes could use to decide if and when new types of transactions should be valid.
Designed to accommodate more advanced transactions like multisig, OP_EVAL leaned heavily on hashes, the cryptographic trick that scrambles and compresses data deterministically but irreversibly into a short, effectively unique string of numbers.
First proposed by the pseudonymous developer ByteCoin, the basic idea was that users could hash instructions detailing the conditions under which bitcoin could later be spent (including to and from multisig wallets) and include this hash in a transaction. Coins would essentially be sent “to” a hash.
The conditions required to later spend the bitcoin would only be revealed when the coins were spent “from” the hash. A multisig user would pay for the added transaction size when she spent the coins, while the extra data required posed a smaller burden on the network.
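The commit-then-reveal idea can be sketched in a few lines of Python. This is a minimal illustration, not Bitcoin's actual script machinery: SHA-256 stands in for the hash function, the "instructions" are an arbitrary byte string, and the function names are hypothetical.

```python
import hashlib


def commit(instructions: bytes) -> bytes:
    """Hash the spending instructions; coins are sent 'to' this value."""
    # Assumption: SHA-256 is used purely for illustration.
    return hashlib.sha256(instructions).digest()


def reveal_matches(committed_hash: bytes, revealed_instructions: bytes) -> bool:
    """At spend time, check that the revealed instructions hash to the commitment."""
    return hashlib.sha256(revealed_instructions).digest() == committed_hash


instructions = b"2-of-3 multisig: keyA keyB keyC"
paid_to = commit(instructions)

print(len(paid_to))                                     # 32 bytes, whatever the input size
print(reveal_matches(paid_to, instructions))            # True
print(reveal_matches(paid_to, b"1-of-1: someone else")) # False
```

The sender only ever handles the fixed-size hash, so the larger instructions, and the fee for carrying them, appear only when the recipient spends.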
As the proposal received positive feedback, Andresen didn’t waste any time, preferring to get OP_EVAL deployed sooner rather than later.
"Security is really high on the priority list; I'd like to see secured Bitcoin addresses in people's forum signatures within a year,” he wrote.
Not everyone shared Andresen’s sense of urgency, however. OP_EVAL would be a big upgrade on a live system already carrying millions of dollars in value. Across the ocean from Andresen, a young Amir Taaki suggested developers take time to review the proposal.
“It seems good at first glance,” Taaki wrote. “But fast-tracking this into the blockchain is probably not a wise idea… Bitcoin is not exploding tomorrow, so there's no big loss from holding off on momentous changes like these.”
Further complicating matters, developers assumed adding OP_EVAL to the protocol would pose a significant coordination challenge. In essence, enacting it would require risking that the blockchain, the definitive record of all Bitcoin transactions, enforced by the shared consensus on its software rules, might split into incompatible networks.
This meant that as soon as OP_EVAL went live, every user would have to switch over to a new version of the software, and a new blockchain, in what was called a “hard fork” upgrade.
Fail to upgrade in unison, and miners might unknowingly produce “invalid” blocks. Even worse, users might unknowingly accept “invalid” transactions.
A New Kind Of Soft Fork
Soon enough, however, Andresen realized it was possible to assuage his detractors.
As a nifty trick, he uncovered that OP_EVAL could be deployed by redefining one of several inactive op-codes originally included by Nakamoto as placeholders for future commands.
To the surprise of everyone, including Andresen, this would also be compatible with nodes that didn’t upgrade to accept OP_EVAL. These nodes would check that the hash matched the new instructions, but wouldn’t enforce them, instead accepting the transactions by default.
As long as a majority of miners enforced the new rules, this meant the new blockchain would be considered valid by both upgraded and non-upgraded nodes. Upgraded nodes would accept the blockchain because the new rules were being enforced, while nodes that failed to upgrade would accept the blockchain because they didn’t care about the new rules either way.
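One way to see why this works is to model the two rule sets directly. In the hypothetical Python sketch below, old nodes check only that the hash matches, while upgraded nodes also require the revealed instructions to be satisfied. The class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Spend:
    hash_matches: bool      # revealed instructions hash to the committed value
    instructions_met: bool  # the revealed instructions are actually satisfied


def old_node_accepts(spend: Spend) -> bool:
    # Non-upgraded nodes check the hash but do not enforce the instructions.
    return spend.hash_matches


def new_node_accepts(spend: Spend) -> bool:
    # Upgraded nodes enforce both conditions.
    return spend.hash_matches and spend.instructions_met


all_spends = [Spend(h, i) for h, i in product([True, False], repeat=2)]

# Everything the new rules allow is also allowed under the old rules.
accepted_by_new = [s for s in all_spends if new_node_accepts(s)]
print(all(old_node_accepts(s) for s in accepted_by_new))  # True

# The reverse does not hold: old nodes accept a spend the new rules reject.
risky = Spend(hash_matches=True, instructions_met=False)
print(old_node_accepts(risky), new_node_accepts(risky))   # True False
```

The second case is the reason a majority of miners must enforce the new rules: if they keep such spends out of blocks, old and new nodes never see a block they disagree about.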
Such backwards-compatible upgrades, or “soft forks,” had already been deployed by Nakamoto, but as the network had grown in size, developers had begun to worry about the sheer number of people who would need to be involved in any upgrade.
Unsurprisingly, Andresen’s realization that this could be avoided was welcomed by other established contributors, with whom he quickly shared the news.
“Wow. Gavin’s point that [OP_EVAL] can be done without a split blew my mind,” Gregory Maxwell remarked, reacting to the discovery in real time. “Bring out the champaign [sic].”
With this, developers went on to devise an even more secure method for activating soft forks. They theorized they could conduct something like a poll to determine when a feature had broad enough support from miners, which they could then use to ensure a safe upgrade.
Miners would be asked to include a bit of data in the blocks they mined to signal that they would enforce the new rules. When a majority were ready, the change could be activated.
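The poll described above amounts to counting how many recent blocks carry the signal. The Python sketch below is a simplified model under stated assumptions: the marker string, the function name and the sample data are all hypothetical, and the 55 percent threshold is the figure the article gives for the later P2SH poll.

```python
def signaling_threshold_reached(block_data: list[bytes], marker: bytes, threshold_percent: int) -> bool:
    """Return True when enough recent blocks contain the signaling marker."""
    if not block_data:
        return False
    signaling = sum(1 for data in block_data if marker in data)
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return signaling * 100 >= threshold_percent * len(block_data)


MARKER = b"/NEWRULE/"  # hypothetical marker, for illustration only

# Ten recent blocks: six signal readiness, four do not.
recent_blocks = [b"pool-a /NEWRULE/"] * 6 + [b"pool-b"] * 4

print(signaling_threshold_reached(recent_blocks, MARKER, 55))  # True
print(signaling_threshold_reached(recent_blocks, MARKER, 75))  # False
```

Because blocks are found roughly in proportion to hash power, the share of signaling blocks serves as an estimate of the share of hash power prepared to enforce the new rules.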
The Fatal Flaw
But all this work was undone by O’Connor’s findings.
The result was a split into factions, with some holding that OP_EVAL was being unnecessarily delayed and others arguing the quick fixes proposed would impair certain desired properties of Bitcoin's essential scripting language.
Developers including Luke Dashjr, Pieter Wuille and Maxwell suggested alternatives which, like OP_EVAL, utilized the concept of sending coins “to” a hash. But the challenge was still to get this logic, which they started referring to as “pay to script hash” or “P2SH,” into Bitcoin as a soft fork and avoid a blockchain split.
Existing op-codes could only go so far: non-upgraded nodes would need to accept transactions that spent coins from hashes, without understanding the new rules.
It was Andresen who found a path forward, and his specific P2SH solution wouldn’t require a new op-code at all. Rather, Andresen's idea was that Bitcoin could be programmed to recognize a certain format of transactions, and then interpret this format in an unconventional way to validate it using new instructions.
Any node that didn't upgrade would interpret the unconventional format using conventional logic. Like with OP_EVAL, the transaction would always be considered valid by non-upgraded nodes. This meant that P2SH could be deployed as a soft fork: so long as a majority of hash power enforced the new rules, both old and new nodes would agree on the same blockchain.
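The article does not spell out the format itself. As a sketch, the Python below assumes the 23-byte output template later documented in BIP 16: an OP_HASH160 op-code, a 20-byte push, and an OP_EQUAL op-code. The function name is hypothetical.

```python
OP_HASH160 = 0xA9     # hash the top stack item
PUSH_20_BYTES = 0x14  # push the next 20 bytes (the script hash)
OP_EQUAL = 0x87       # compare the top two stack items


def is_p2sh_output(script_pubkey: bytes) -> bool:
    """Return True when an output script matches the assumed P2SH template."""
    return (
        len(script_pubkey) == 23
        and script_pubkey[0] == OP_HASH160
        and script_pubkey[1] == PUSH_20_BYTES
        and script_pubkey[22] == OP_EQUAL
    )


# A placeholder 20-byte hash of zeros, for illustration only.
p2sh_like = bytes([OP_HASH160, PUSH_20_BYTES]) + bytes(20) + bytes([OP_EQUAL])
# A 25-byte script in the ordinary pay-to-public-key-hash shape.
ordinary = bytes([0x76, 0xA9, 0x14]) + bytes(20) + bytes([0x88, 0xAC])

print(is_p2sh_output(p2sh_like))  # True
print(is_p2sh_output(ordinary))   # False
```

An upgraded node that sees a match would go on to run the revealed instructions as a script, while a non-upgraded node evaluates the same three operations as an ordinary hash comparison and stops there.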
Andresen’s proposal appeared satisfactory to most. “Seems ... acceptable from first glance," O’Connor responded. Taaki, referring to the code’s unconventional approach, said: "The idea is a hack.... but I like it."
At a subsequent developer meeting the sentiment held, and attendees agreed to implement Andresen's P2SH proposal. Miners would be polled in the week leading up to February 1, and if a majority of hash power (55 percent) signalled support, a client would be released to activate the soft fork just two weeks later.
The peace would last all of a few days.
Why Not Use USD?
Breaking the consensus would be Dashjr, who had had to leave the meeting early and only later learned Andresen's version of P2SH had been the accepted compromise.
The unconventional nature of Andresen’s solution irked Dashjr, who believed it complicated the protocol and brought uncertain consequences down the line. He raised the issue with Andresen, but the latter was unconvinced his concerns merited a change of plans.
His suggestions spurned, Dashjr would erupt on the public BitcoinTalk forum in mid-January, denouncing P2SH and charging Andresen was “on his own” in supporting the change.
“Gavin is forcing everyone using the latest Bitcoin code to vote for [P2SH],” he wrote. “If you want to oppose this insane protocol change, you will need to modify your BitcoinD source code or you will be voting IN FAVOUR OF IT by default.”
Due to the nuance of his objections, the brash tenor in which they were delivered and his accusations about Andresen, responses to the post were less than positive. Instead of limiting the technical debate to developers, some perceived Dashjr as trying to incite a popular mob.
It didn’t help that Dashjr was one of the project’s more quixotic contributors, known for his long arguments in defense of alternative number systems and strong Christian faith. One forum user said Dashjr’s comments made him look "mentally unstable." Another said he didn't want to bother with the specifics at all; he simply trusted Andresen.
In response, Dashjr launched a sustained objection to the P2SH proposal on philosophical grounds, disputing not just its technical merits but its implications for governance.
"If you want a monarchial currency, why not just use the Fed's USD?" Dashjr asked his detractors, only to be hounded by others claiming it was he who was vying for power.
Not backing down, Dashjr would code an alternative version of P2SH, called CheckHashVerify (CHV). CHV was essentially a different P2SH implementation — but it didn’t require an unconventional interpretation of transaction outputs. Instead, CHV added a new op-code that, like OP_EVAL, could be “disguised” as a placeholder op-code.
But for Andresen, it was too late for more debate. Fuming over the public outburst, he responded with his own, writing:
"Luke, you try my patience. I'm going to step away from the code for a few days to calm down before I do something stupid."
Genjix Goes Public
As Andresen's P2SH design (now referred to simply as P2SH) was largely seen as a good-enough solution preferred by the project's lead developer, Dashjr found himself with few defenders.
It would fall on Taaki to be the minority voice to take fringe concerns seriously — but not because he opposed Andresen’s solution or necessarily agreed with Dashjr’s.
The developer, then in his early 20s, was already one of Bitcoin's most outspoken contributors, and while he had yet to become the headline-grabbing anarchist who hacked from squats and travelled with 3D-printed gun-runners, his vision for the software as an anti-establishment movement had already pushed him out of the project’s inner circle.
This, in turn, had made Taaki distrustful of the project’s accelerating development process. He preferred it if the decision-making process took time and involved the broader user base.
In his view, Bitcoin wasn’t served well by a small cabal of developers calling the shots. Taaki strongly felt that anyone with an interest in the project should be aware of the trade-offs, and insofar as possible, participate in decision making.
“I'd rather people have a say in the matter even if it makes life tougher for developers to explain their decisions,” he told other developers. “I feel a bit apprehensive about telling our users this is how it will be, you have no say and then giving them the finger.”
Even if Taaki agreed that the difference between Andresen's P2SH and Dashjr's CHV proposals was small, he insisted that getting users involved in the development process was an important exercise.
“[M]y worry is someday Bitcoin becomes corrupted. See this extra scrutiny as an opportunity to build a culture of openness,” he argued.
To this effect, Taaki wrote a blog post in which he laid out the P2SH and CHV upgrades and the differences between the two.
Users had a choice, was Taaki’s message, and: “Voting is based on mining power.”
A F*cked-Up Situation
With his choice of words, Taaki had outed an elephant in the room. It was true, Nakamoto had enacted soft forks, but by late 2011, the network no longer operated as it did in those early days.
When Nakamoto published the white paper in 2008, he assumed proof-of-work would be supplied by users contributing computations via personal computers. “Proof-of-work is essentially one-CPU-one-vote,” Nakamoto had written.
Under this design, any user could be a miner and secure the network by proposing blocks, validating transactions sent by peers and enforcing the code authored by developers.
But in the years since the software’s launch, this model had been obsoleted by entrepreneurs. Since Laszlo Hanyecz (of Bitcoin pizza fame) had figured out how to generate bitcoin with more powerful graphics processing units, specialists had been busy turning mining from a hobby into a small enterprise.
Around the same time, Marek “Slush” Palatinus introduced a method to allow miners to pool the hash power needed to propose blocks and share the profits. This effectively made mining less of a lottery, and more of a stable source of income.
By late 2011, just three pools — DeepBit, Slush Pool and BTC Guild — controlled well over half of global hash power. Instead of one-CPU-one-vote, most of the “votes” were now concentrated in just a few mining pool operators, as if they were representatives for their cyber-constituents.
To some, it was proof that something was wrong on the Bitcoin network. “I see [a mining pool] deciding a change in the network as a farce of a vote,” early miner Midnightmagic argued.
To others, mining centralization was an unfortunate crutch, a way to make a soft fork upgrade more manageable, and therefore less risky. (After all, a safe rollout now required the participation of just a handful of mining pool operators.)
Maxwell, for example, was more resigned to an unsatisfactory reality at hand.
“If there was non-trivial pushback both the devs and pools would back off, but no one seems much opposed to it now in any case,” he replied. “[I]ts a good mechanism to use for the future… when hopefully we won't have this fucked up situation where Bitcoin is no longer decentralized.”
To Vote Or Not To Vote
That Andresen and Dashjr’s warring proposals would come to embody opposing views on Bitcoin governance would only complicate matters.
Up until then, developers had always spoken about the upcoming soft fork upgrade as a kind of vote: miners could enforce the new rules outlined by P2SH (or OP_EVAL) with a hash power majority, so a vote was meant to gauge the likelihood of this outcome.
But while the terminology had become part of the lexicon, this omitted some technical nuance. In conducting a poll, developers weren’t exactly asking miners what they thought of the new rules. Rather, they saw this as a way to see if miners were ready to ensure a safe upgrade.
From that perspective, it made sense to developers that only one proposal would be added to the software users and miners would run to enforce the network rules.
“The Bitcoin system is _NOT_ up for a majority election. Not a majority of hashpower, not a majority of people, not a majority of money,” Maxwell argued, annoyed by Taaki’s framing of the decision as a vote.
Maxwell felt strongly miner “votes” should be limited as they were in the software itself, to enforcing the order of transactions — not the rules of the entire network.
“What happens if a super-majority – even 100% – of the current miners decide that the subsidy should be 50 BTC forever? NOTHING. Miners who change that rule in their software simply stop existing from the perspective of the Bitcoin network,” he wrote.
Dashjr didn’t disagree with Maxwell, but in practice it was hard for him to see how Bitcoin would remain secure should developers push changes without miner support.
“Miners can simply refuse to mine P2SH transactions to be immune to the ‘development team's changes,’” he responded. “If the ‘developers’ lock out all the miners, guess what happens? Easy 50% attacks, the network is left unsecured!”
Seen in this light, it’s easier to understand why Dashjr believed Andresen was abusing his role as lead developer by attempting to push P2SH alone. If a miner used the standard software to mine a block, it would cast a “vote” in favor of P2SH automatically.
In response, Dashjr wrote patches that would enter his preferred proposal into the hash power “election,” introducing the option for miners to vote both for and against P2SH and CHV.
Although few miners used the code, Dashjr’s opposition had an effect. Tycho, the operator of DeepBit, then the network’s biggest mining pool, began to grow uncomfortable with his role in evaluating the competing code.
Arguing it was clear no consensus among developers had yet been reached, he wrote: “I don't want to become the single entity to decide on this.”
In rejecting the idea a mining pool could, even as a convenience, be used to sway an upgrade decision, Tycho added another twist to the debate at hand. Without his support, amounting to over 30 percent of all hash power, P2SH would have a difficult time being activated.
By late January, the first P2SH voting round was drawing to a close, and it didn’t look like it was going to meet its required threshold. The upgrade would have to be delayed, a reality that frustrated not just Andresen, but other developers as well.
On IRC, Maxwell publicly lamented that there appeared no end in sight to the deadlock.
“This 'hurry' meme is bullshit, Gavin started on the [pay-to-script-hash] route in, what, October?” he wrote. “As far as I can tell, unless someone draws a deadline this process will never converge because there will always be some NEXT guy who's great idea was left out.”
Andresen would lay the blame for the delay not on the advent of mining pools, but on DeepBit’s operator Tycho personally. “Right now, it looks like one person has enough hashing power to veto any change,” he wrote.
This bothered Andresen, who saw Tycho’s stance as unethical. "I think it is wrong of you to use your position as the biggest pool operator to go against the general consensus," he wrote.
Indeed, even when Andresen went so far as to apply public pressure, pushing users to ask their mining pools to upgrade – and offering to reimburse all of DeepBit’s funds in the event P2SH led to any financial loss – Tycho was unwilling to “vote” for the proposal.
Faced with the delay, Andresen made an attempt to marshal the public to the cause, persisting in his conviction the choice between P2SH and CHV would have little impact on users.
“All of the [P2SH/CHV] stuff is mostly engineers arguing over whether it is better to use a nail, a screw or glue to put two pieces of wood together. Any of the solutions would work, and ordinary users wouldn't notice any difference.”
Judging by the responses in the thread, Bitcoin users accepted Andresen’s frame, blaming Tycho for holding back the fork and pressuring him to activate.
Tycho, in turn, fiercely objected to Andresen’s assertion. Even with 30 percent of hash power, he knew the remaining miners could overrule him, and he didn’t want to be the deciding factor.
With P2SH having failed so far to accumulate sufficient hash power support, Andresen would be increasingly forced to discuss strategy for his proposal in the open, and he notably began accepting CHV as a potential alternative to break the deadlock.
Still, responses drew a dividing line between those who believed the choice between P2SH and CHV was for miners to make, and those who favored a more meritocratic decision-making process.
“Ultimately, miners are the ONLY people who have any say over issues like this,” BitcoinTalk user dooglus argued. “They're the only ones who decide which transactions get into blocks.”
The forum’s administrator, Theymos, rejected this idea outright. “Non-miners can reject blocks. If enough clients do this, the coins miners mine will become worthless.”
Instead, Theymos proposed that a certain inner circle of experts should engage in a two-week discussion and issue a vote at the end. Either because of the suggestion or happenstance, Dashjr soon created a Wiki where a roster of respected developers could voice their preference.
Over the next few days, Maxwell, Thomas and Wuille all indicated they’d be happy to accept either P2SH or CHV, though they made clear they preferred P2SH. O’Connor and Dashjr agreed that P2SH was acceptable, but voiced a preference for CHV.
Perhaps unsurprisingly, Andresen made sure to sway the ballot in favor of P2SH, registering a resounding "no" against the CHV proposal.
More importantly, perhaps, very few miners were supporting CHV. By mid-February, P2SH was supported by 30 percent of hash power, while Dashjr’s alternative was stuck around 2 percent.
During a meeting on IRC, Dashjr said he was considering whether to withdraw CHV altogether, begrudgingly accepting P2SH’s dominance. At that same meeting, attendees agreed to set a second voting deadline for March 1.
As the new deadline approached, more miners gathered behind P2SH, bringing hash power support close to the 55 percent threshold. Soon, both Tycho and Dashjr were left with no other choice but to accept their peers’ preferences.
With that, Andresen announced that the soft fork would be deployed and activated within 10 days, and by April 1, 2012, the new rules were enforced.
P2SH, the first protocol upgrade since Satoshi’s departure, had been enacted.
Tempest In A Teapot
The difficult political process that had led to the passage of P2SH would continue to have a lasting impact outside the software itself.
In the end, Andresen had been able to deploy the solution he both designed and favored. If it can be said that his leadership was questioned amid the crisis, by the end, it was firmly cemented.
Public opinion, unconcerned with specifics, largely coalesced against the actions of Dashjr, and to a lesser extent Taaki, deeming them unnecessary and inflammatory. Andresen went so far as to ask Dashjr to stop contributing to Bitcoin entirely, though it appears he either backed down from that threat or else Dashjr simply ignored it.
Meanwhile, Maxwell became one of Bitcoin's “core developers,” sharing commit access to the project with Andresen and contributors Wladimir van der Laan and Jeff Garzik.
The tone had been set: when it came to Bitcoin development, a supportive, pragmatic attitude was rewarded and contrarian contributors were dismissed. While ideological differences had surfaced, they remained unresolved, and were arguably only entrenched by the proceedings.
With more users flocking to Bitcoin by the day, P2SH shortly passed into lore, though it would notably continue to serve as a flash point in disagreements among developers.
Recalling the events a year later in response to another emerging crisis, Andresen would boast in ways that suggest he believed P2SH validated his leadership and vision for the project.
“The block size will be raised,” he wrote, in response to a video produced by developer Peter Todd advocating against the limit increase in early 2013. “Your video will just make a lot of people worried about nothing, in exactly the same way Luke-Jr's [CHV] proposal last year did nothing but cause a tempest in a teapot.”
How should decisions be made for the first decentralized digital currency? If the question had finally been asked, it would take a wider war, still years in the future, to resolve it...