Segregated Witness, Part 1: How a Clever Hack Could Significantly Increase Bitcoin's Potential
If one proposal excited attendees of the recent Scaling Bitcoin workshop in Hong Kong, along with Bitcoin Core and Blockstream, it was developer Dr. Pieter Wuille's Segregated Witness. Praised by many within the technical community, Segregated Witness is expected to improve Bitcoin's performance in a number of ways, while some even hope it might be the scaling solution that helps bring some peace back to the Bitcoin community.
This first part of Bitcoin Magazine’s three-part series on Segregated Witness explains how it works.
What are Bitcoin transactions again?
In order to understand Segregated Witness, it is helpful to understand what Bitcoin transactions are on a more technical level. (Feel free to skip to the last section of this article if this is a familiar subject.)
For starters, it's important to realize that the Bitcoin protocol, at its core, consists of transactions. Nodes on the peer-to-peer network don't send each other bitcoin; they send each other packages that contain transaction data.
These Bitcoin transactions, in a way, are really sets of “locks.” More specifically, each transaction contains two main components. One half effectively unlocks bitcoin that were locked up in previous transactions, using pieces of data called inputs. Inputs include scripts called scriptSigs: instructions on how to unlock the bitcoin. The other half consists of one or several new locks called outputs, which lock the same amount of bitcoin, or less, up again. Outputs include scripts called scriptPubKeys. As such, bitcoin effectively move from inputs to outputs within a single transaction, and jump from transaction to transaction at the same time.
There is one main exception to this rule. A coinbase transaction (not to be confused with the wallet company and exchange of the same name) is the transaction created by a miner when he finds a new block, and contains the block reward: currently 25 bitcoin. Additionally, a miner can increase the coinbase reward by any amount of bitcoin that was unlocked in transactions, but not locked up again: the difference between the inputs and the outputs. These are the transaction fees.
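To make the bookkeeping concrete, here is a minimal Python sketch of inputs, outputs and fees. It is a toy model, not Bitcoin's actual data format: the class names, the string "scripts" and the helper functions are all assumptions made for illustration, and a real input refers to a previous output by transaction ID and index rather than holding it directly.

```python
from dataclasses import dataclass
from typing import List

SATOSHI_PER_BITCOIN = 100_000_000
BLOCK_REWARD = 25 * SATOSHI_PER_BITCOIN  # the "currently 25 bitcoin" from the text


@dataclass
class TxOutput:
    value: int          # amount locked up, in satoshi
    script_pubkey: str  # the "lock" (toy string, not real Script)


@dataclass
class TxInput:
    prev_output: TxOutput  # the earlier output being unlocked (simplified)
    script_sig: str        # data that satisfies the earlier lock (toy string)


@dataclass
class Transaction:
    inputs: List[TxInput]
    outputs: List[TxOutput]


def fee(tx: Transaction) -> int:
    """Bitcoin unlocked by the inputs but not locked up again by the outputs."""
    total_in = sum(i.prev_output.value for i in tx.inputs)
    total_out = sum(o.value for o in tx.outputs)
    if total_out > total_in:
        raise ValueError("outputs may not lock up more than the inputs unlock")
    return total_in - total_out


def max_coinbase_value(txs: List[Transaction]) -> int:
    """Block reward plus the fees of every transaction in the block."""
    return BLOCK_REWARD + sum(fee(tx) for tx in txs)


# Usage: unlock 0.5 bitcoin, pay 0.3, send 0.1999 back as change.
old_lock = TxOutput(50_000_000, "prove you own key A")
payment = Transaction(
    inputs=[TxInput(old_lock, "signature made with key A")],
    outputs=[
        TxOutput(30_000_000, "prove you own key B"),
        TxOutput(19_990_000, "prove you own key A"),
    ],
)
print(fee(payment))                   # satoshi left over for the miner
print(max_coinbase_value([payment]))  # most the coinbase may claim
```

The leftover 10,000 satoshi are not assigned to anyone in the transaction itself; they only become the miner's once he claims them in the coinbase transaction.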
All of this “unlocking” and “locking up” of bitcoin is done by senders of transactions, and subsequently transmitted over the Bitcoin network as packages of information. All nodes on the network, then, check whether this process of unlocking and locking up was done correctly. If everything checks out, they forward the transaction to other nodes. And if a node is also a miner, it might include the transaction in a block. Whether he actually does so is up to the miner, however; that's why it makes sense to include a fee.
It is of vital importance that the rules used by all nodes to verify transactions are compatible with the rules used by (almost) all miners. If some miners included transactions in blocks that other nodes reject, those nodes would consider the whole block invalid. If such a node is also a miner, this could lead to double spends and network forks.
These consensus rules – the rules all nodes agree on – allow transactions to lock up (and unlock) bitcoin in several different ways at once. But outputs that lock up bitcoin typically include, at the very least, a scriptPubKey along the lines of: “Prove that you own (or: know) the private key that corresponds to the public key that corresponds to this Bitcoin address.”
(It's easy to reproduce a public key from a private key, but it's basically impossible to reproduce a private key from a public key. Similarly, it's easy to reproduce a Bitcoin address from a public key, but it's impossible to reproduce a public key from a Bitcoin address. As such, it's also easy to reproduce a Bitcoin address from a private key, but impossible to reproduce a private key from a Bitcoin address. It's a “one way street.”)
The Bitcoin address used to lock bitcoin up in the scriptPubKey, of course, is the address provided by the receiving end of the transaction. Since the receiver created that address using a private key only he knows, he is the only one who can create a valid scriptSig, and therefore the only one who can create a new transaction and spend the locked up bitcoin.
Where do signatures come in?
To prove ownership of the private key that corresponds to the public key that corresponds to a Bitcoin address, one could theoretically include the private key in the scriptSig of a transaction. But, of course, that's not safe at all. Most importantly, anyone who sees the transaction could take the private key, and create a new transaction (or change the original transaction) to attribute himself as many bitcoin as are locked up before the original transaction is included in a block. In fact, if a miner did this, it would become a piece of cake to steal bitcoin, since he's the one picking which transactions to confirm.
Instead, therefore, the scriptPubKeys typically require the scriptSig to include one or more signatures in order to unlock bitcoin.
A signature is a cryptographic trick that uses a private key in combination with any other data to calculate a unique string of numbers. And, using the magic of cryptography, the corresponding public key can be used to verify that the signature was created using that private key. As such, signatures prove both ownership of a private key and approval of a specific piece of data by the owner of that private key, all without actually needing to reveal the private key.
In Bitcoin's case, private keys are typically used to sign the transaction data minus the inputs. (That is: the scriptPubKeys, the locked amounts, and some additional details.) Subsequently, this signature and the public key from which bitcoin are spent are added to the input field of the transaction. This proves that the owner of the private key really intended to create the transaction and makes sure it cannot be tampered with.
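For readers who want to see the trick rather than take it on faith, below is a bare-bones Python sketch of ECDSA on secp256k1, the signature scheme and curve Bitcoin uses. It is meant for illustration only: the function names are made up for this example, the nonce is simply drawn at random, nothing is hardened against side channels, and real wallets rely on audited libraries and Bitcoin's own encoding rules, none of which are shown here.

```python
import hashlib
import secrets

# secp256k1 parameters: field prime, group order, generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def point_mul(k, point):
    """Double-and-add: easy in this direction, infeasible to reverse."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(data: bytes) -> int:
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")


def sign(private_key: int, data: bytes):
    z = message_hash(data)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret nonce per signature
        r = point_mul(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return (r, s)


def verify(public_key, data: bytes, signature) -> bool:
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    z = message_hash(data)
    w = pow(s, -1, N)
    point = point_add(point_mul(z * w % N, G), point_mul(r * w % N, public_key))
    return point is not None and point[0] % N == r


# Usage: the public key verifies the signature, the private key stays hidden.
private_key = secrets.randbelow(N - 1) + 1
public_key = point_mul(private_key, G)
signature = sign(private_key, b"pay 1 bitcoin to Bob")
print(verify(public_key, b"pay 1 bitcoin to Bob", signature))      # True
print(verify(public_key, b"pay 1 bitcoin to Mallory", signature))  # False
```

Note how `point_mul` also illustrates the “one way street”: deriving the public key from the private key is a single function call, while no comparable shortcut is known for going back.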
Then, all of this transaction data – including the inputs this time – is hashed together, which creates the transaction ID, identifying the specific transaction. If a transaction is subsequently included in a block, the miner hashes the transaction ID together with another transaction ID to produce a new hash. And this hash is hashed again, this time along with the hash of two other transaction IDs. This process continues until there is only one hash left. This structure of hashes is called a Merkle Tree, and the remaining hash the Merkle Root. This Merkle Root is combined with additional block data to form the block header, which is used to identify the specific block. A hash of this block header, finally, must be included in the next block’s header, chaining blocks together.
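The pairwise hashing described above can be sketched in a few lines of Python. The byte strings standing in for transactions are placeholders, and the function names are chosen for this example. Two details go beyond what the text spells out: Bitcoin hashes with SHA-256 applied twice, and when a level of the tree has an odd number of hashes, the last one is paired with a copy of itself.

```python
import hashlib
from typing import List


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(hashes: List[bytes]) -> bytes:
    """Hash neighbours together, level by level, until one hash remains."""
    if not hashes:
        raise ValueError("a block contains at least one transaction")
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Usage with placeholder "transactions" (real ones are serialized structures).
raw_transactions = [b"coinbase", b"alice pays bob", b"bob pays carol"]
txids = [double_sha256(raw) for raw in raw_transactions]
print(merkle_root(txids).hex())
```

Because every level depends on the one below it, changing a single byte in any transaction changes its transaction ID and, with it, the Merkle Root.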
Bitcoin is considered immutable because changing any part of any transaction retroactively would alter the transaction ID, in turn altering the block header. But this altered block header would no longer meet the proof of work requirement. And since the block header influences the make-up of subsequent block headers, neither would any of those.
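A toy chain shows why tampering does not go unnoticed. This is a sketch under heavy simplifications: each block holds a single placeholder transaction (so its Merkle Root is just that transaction's ID), the header contains only three fields, and the difficulty target is a made-up value chosen so the example runs in a fraction of a second. All names are invented for the example.

```python
import hashlib

TOY_TARGET = 2**244          # assumption: far easier than any real target
GENESIS_PREV = b"\x00" * 32  # placeholder "previous hash" for the first block


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def block_hash(block: dict) -> bytes:
    """Hash a simplified header: previous hash, Merkle Root, nonce."""
    root = double_sha256(block["tx"])  # single transaction: root equals its ID
    header = block["prev"] + root + block["nonce"].to_bytes(8, "big")
    return double_sha256(header)


def meets_target(h: bytes) -> bool:
    return int.from_bytes(h, "big") < TOY_TARGET


def mine(prev_hash: bytes, raw_tx: bytes) -> dict:
    """Try nonces until the header hash satisfies the proof of work."""
    block = {"prev": prev_hash, "tx": raw_tx, "nonce": 0}
    while not meets_target(block_hash(block)):
        block["nonce"] += 1
    return block


def chain_is_valid(chain: list) -> bool:
    prev = GENESIS_PREV
    for block in chain:
        h = block_hash(block)
        if block["prev"] != prev or not meets_target(h):
            return False
        prev = h
    return True


# Build three linked blocks.
chain, prev = [], GENESIS_PREV
for raw_tx in [b"alice pays bob", b"bob pays carol", b"carol pays dave"]:
    block = mine(prev, raw_tx)
    chain.append(block)
    prev = block_hash(block)
print(chain_is_valid(chain))  # True

# Rewrite history in the first block without redoing any work.
tampered = [dict(block) for block in chain]
tampered[0]["tx"] = b"alice pays mallory"
print(chain_is_valid(tampered))  # False
```

Even if the altered first block happened to satisfy the proof of work by luck, its new hash would no longer match the reference stored in the second block, so the check fails either way. Repairing the chain means redoing the work for that block and every block after it.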
What is a Segregated Witness?
The Segregated Witness proposal as presented by Wuille in Hong Kong is based on a concept used in Blockstream's sidechain Elements, and a complementing idea by Bitcoin Core developer Luke Dashjr. It was conceptualized over the past couple of months in cooperation with Bitcoin Core developers Gregory Maxwell and Eric Lombrozo, and might be rolled out over the next year.
Under the proposal, from the perspective of Bitcoin nodes that don't use Segregated Witness (let's call them “old nodes”), some newly created outputs might soon use a strange type of scriptPubKey. Strange, because these scriptPubKeys can hardly be considered a lock at all. Commonly referred to as “anyone can spend,” these scriptPubKeys basically proclaim they don't require a signature. Additionally, they will include some meaningless text.
Old nodes will consider these transactions crazy. They will think that anyone can create a new scriptSig, unlocking these outputs, meaning they're highly insecure. But at the same time, old nodes won't mind either. After all, it's not their bitcoin that's being messed around with, and other people are free to do with their bitcoin as they please. The meaningless text will be considered weird, but fine too. So they'll accept the transactions as valid, and forward them to other nodes.
However, Segregated Witness-enabled nodes (let's call them “new nodes”) will notice something else. They will see the otherwise meaningless text in the scriptPubKey, but not consider it meaningless at all. Instead, new nodes will recognize this piece of text as another – very special – type of output.
Much like typical outputs, this new type of output will require one or several signatures to unlock the bitcoin. But unlike typical outputs, this new type of output will not require the signature to be included in the scriptSig of a subsequent transaction. Instead, it will require the signature to be included in a completely new part of the transaction: the Segregated Witness.
This Segregated Witness is basically an “add-on” that carries signatures and some additional data. Importantly, Segregated Witnesses are completely ignored by old nodes, but recognized by new nodes. Moreover, the data they carry is not hashed along with the other parts of a transaction into the transaction ID.
As such, both old nodes and new nodes will consider transactions containing signatures in the Segregated Witness valid. Old nodes validate them because from their perspective these transactions don't require a signature at all (and they don't see one), and new nodes validate them because the required signature is in the Segregated Witness. And since both old and new nodes hash the transaction data into the same transaction ID, everyone agrees on the makeup of blocks, and, as such, on the structure of the entire blockchain.
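The key property here, that the witness travels with a transaction but does not feed into its transaction ID, can be shown with a toy model. The serialization below (strings joined with separators) is invented purely for illustration and bears no resemblance to the real wire format; `full_hash` is likewise a hypothetical helper that exists only to show the contrast.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass
class Transaction:
    inputs: List[str]
    outputs: List[str]
    witness: List[str] = field(default_factory=list)  # ignored by old nodes

    def base_bytes(self) -> bytes:
        """Toy serialization of everything except the witness."""
        return ("|".join(self.inputs) + "#" + "|".join(self.outputs)).encode()

    def txid(self) -> str:
        """What old and new nodes alike hash into the transaction ID."""
        return double_sha256(self.base_bytes()).hex()

    def full_hash(self) -> str:
        """Hypothetical hash that also covers the witness, for comparison."""
        extra = "|".join(self.witness).encode()
        return double_sha256(self.base_bytes() + b"#" + extra).hex()


# The same spend as an old node sees it and as a new node sees it.
seen_by_old_node = Transaction(
    inputs=["previous output 0, empty scriptSig"],
    outputs=["0.1 bitcoin to Bob"],
)
seen_by_new_node = Transaction(
    inputs=["previous output 0, empty scriptSig"],
    outputs=["0.1 bitcoin to Bob"],
    witness=["signature by Alice"],
)

print(seen_by_old_node.txid() == seen_by_new_node.txid())            # True
print(seen_by_old_node.full_hash() == seen_by_new_node.full_hash())  # False
```

Both kinds of node therefore place the same transaction ID in the Merkle Tree, even though only new nodes ever look at the signature.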
(Note that either all miners, or at least a very large majority, should use Segregated Witness, or none of them should, in order to prevent double spends and chain forks. If all miners do use Segregated Witness, old nodes on the network might wonder why some transactions aren't included in blocks, but since it was always up to miners to decide which transactions to include, and since these are not their transactions, they won't mind, either.)
But there is one problem: If signatures have no effect on the makeup of the blockchain, the blockchain no longer serves as proof that the correct signatures were included in transactions.
To make sure that signatures are embedded in the blockchain regardless, a Segregated Witness-enabled miner adds a trick, too. Rather than creating only a Merkle Tree out of all of the transactions, it also creates a Merkle Tree out of all Segregated Witnesses, to mirror the transaction tree. The Segregated Witness Merkle Root, then, is included in the input field of the coinbase transaction. As such, the Segregated Witness Merkle Root changes the transaction data of the coinbase transaction and hence its transaction ID, and therefore influences the block header and, ultimately, the makeup of the blockchain.
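The chain of dependencies in this trick can be traced in a short Python sketch. Everything about the layout is a simplifying assumption: transactions are pairs of placeholder byte strings (base data and witness), the witness tree covers only the non-coinbase transactions, and the commitment is simply appended to the coinbase data. How and where the root is actually encoded is up to the final specification.

```python
import hashlib
from typing import List, Tuple


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(hashes: List[bytes]) -> bytes:
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def block_merkle_root(coinbase_data: bytes,
                      txs: List[Tuple[bytes, bytes]]) -> bytes:
    """Commit to the witnesses via the coinbase, then build the normal tree."""
    # 1. Second tree: one leaf per Segregated Witness.
    witness_root = merkle_root([double_sha256(witness) for _, witness in txs])
    # 2. Place that root in the coinbase transaction (toy: append it).
    coinbase_txid = double_sha256(coinbase_data + witness_root)
    # 3. Ordinary tree over transaction IDs, which exclude the witnesses.
    txids = [coinbase_txid] + [double_sha256(base) for base, _ in txs]
    return merkle_root(txids)


honest = [(b"alice pays bob", b"alice's signature"),
          (b"bob pays carol", b"bob's signature")]
forged = [(b"alice pays bob", b"garbage"),
          (b"bob pays carol", b"bob's signature")]

# Swapping a signature leaves that transaction's own ID untouched...
print(double_sha256(honest[0][0]) == double_sha256(forged[0][0]))  # True
# ...but still changes the Merkle Root that ends up in the block header.
print(block_merkle_root(b"coinbase", honest) ==
      block_merkle_root(b"coinbase", forged))                      # False
```

In this sketch the only route from a signature to the block header runs through the coinbase transaction, which is exactly the role the text assigns to it.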
Wuille’s Segregated Witness proposal allows signatures to be removed from Bitcoin transactions, while maintaining Bitcoin's immutability, and without breaking any of the existing consensus rules.
The second part of this three-part story will explore why Segregated Witness is actually useful.