With representatives from the Bitcoin Foundation currently meeting high-level officials from a number of US regulatory agencies, the topic of Bitcoin anonymity has once again taken center stage. In part as a deliberate effort to downplay Bitcoin’s privacy aspects to regulators, in part as a result of recent revelations from Edward Snowden about the scope of the NSA’s digital surveillance initiatives, and finally in part due to new research regarding the Bitcoin transaction graph itself, the current mood is that Bitcoin may be far less private than we thought. On August 14, researchers from George Mason University released a regulatory primer on Bitcoin heavily downplaying its anonymity. On August 26, Vice released an article “describing researchers’ success in de-anonymizing some Bitcoin transactions”, and a similar story ran on Businessweek the next day. However, there are many who see this lack of anonymity as a problem – and today, two Bitcoin developers in Spain have come up with a solution.
First, the problem. As commonly understood, Bitcoin’s privacy model is as follows. All Bitcoin transactions are public by necessity; otherwise it would be possible to release many separate transactions sending the same money multiple times, and the fraud would not be discovered until much later. Rather, the privacy of the Bitcoin system comes from the fact that, although transactions are public, the identities behind the transactions are not. Anyone looking at the blockchain can see “1Mcqmmnx sent 1.2321 BTC to 1V1tAL”, but what they do not see is exactly who sent the bitcoins to whom. It could be grandma sending bitcoins to buy her grandson a new computer from the BitcoinStore, it could be a vendor cashing out of Silk Road, or it could simply be MtGox sending bitcoins to itself.
However, for the past several years researchers have been showing that this privacy model is not quite so perfect as it seems. In 2011, researchers Fergal Reid and Martin Harrigan released an analysis that, among other things, attempted to trace a 25,000 BTC theft from June 2011. The paper produced no results useful to law enforcement, but was able to follow the money a considerable distance in some places. More recently, researchers in Zurich expanded on Bitcoin de-anonymization techniques, and most recently in August a group of researchers from George Mason University and the University of California, San Diego released a paper entitled “A Fistful of Bitcoins: Characterizing Payments Among Men with No Names”, which appears to successfully follow the money from some Bitcoin thefts all the way to a Bitcoin exchange, which in theory can be asked by law enforcement to reveal the thief’s identity.
Bitcoin de-anonymization essentially relies on one fundamental insight. Given just a set of information about people and a set of information about Bitcoin transactions, with no information connecting the two, it is indeed very hard to figure out which addresses belong to which person. However, as soon as you have even one anchor, some Bitcoin transaction or address tied to a particular real person or event, it becomes possible to “follow the money” from there and gather up a load of other information as well. One might discover the Bitcoin addresses of that user’s employer, favorite businesses, customers, and much more; with more anchors, these relationships can be figured out even more quickly. Aside from simply “following the money”, there are also two more advanced tools that Bitcoin sleuths can use.
First, there is a concept called the closure. The closure of an address is defined recursively as follows:
- An address is in its own closure.
- If address A is in the closure, and there exists a transaction using coins from address A and address B as inputs, then address B is also in the closure.
To calculate the closure of an address, simply repeatedly apply the definition starting from that address until you stop adding new addresses. The power of this concept is the following: all addresses in a closure are almost certainly owned by the same user. If a transaction has multiple inputs, the reason is almost always that a user needed to send some amount of money somewhere but did not have the entire amount in one address, and so the wallet software had to select inputs from two or more different addresses to make the payment. With this tool, if you can show that even one address belongs to a particular person (e.g. if you are a merchant accepting a payment from them), then you can potentially uncover most of their wallet.
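The recursive definition above can be sketched in a few lines of Python. This is only an illustration: the `closure` function and the simplified transaction format (each transaction reduced to a list of its input addresses) are assumptions made for the example, not the format used by any real blockchain parser.

```python
from collections import defaultdict

def closure(start, transactions):
    """Return the set of addresses linked to `start` by shared transaction inputs.

    `transactions` is an iterable of input-address lists, one per transaction
    (a hypothetical, simplified format used only for this sketch).
    """
    # Map each address to every address it has appeared alongside as an input.
    neighbours = defaultdict(set)
    for inputs in transactions:
        for addr in inputs:
            neighbours[addr].update(inputs)

    found = {start}      # rule 1: an address is in its own closure
    frontier = [start]
    while frontier:      # rule 2: keep applying until nothing new is added
        addr = frontier.pop()
        for other in neighbours[addr]:
            if other not in found:
                found.add(other)
                frontier.append(other)
    return found

txs = [["A", "B"], ["B", "C"], ["D", "E"], ["F"]]
print(sorted(closure("A", txs)))  # ['A', 'B', 'C']
```

Here A and C never appear in the same transaction, yet they land in the same closure because both were spent together with B.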
Second, the paper also introduces a few heuristic algorithms for detecting what are known as “change addresses”. Change addresses are where Bitcoin wallets send the leftover money when spending transaction outputs; for example, if you receive 50 BTC and then proceed to spend 1 BTC, the remaining 49 BTC goes to a change address freshly generated by your wallet. Wallets do this, rather than sending the 49 BTC back to the original address, in order to increase privacy, and much of the paper deals with the difficult question of how one can automatically distinguish between change addresses and the intended output of a transaction. By doing this, the researchers were able to go much further than with closures alone, often following long chains of transactions hundreds of steps, and ultimately their research was successful: they were able to trace some stolen funds all the way to a Bitcoin exchange.
The main solution to the problem so far has been mixing services, such as the one at blockchain.info and the one integrated into Silk Road. A mixing service works as follows. A user provides the mixing service with a destination address, and is given an input address to send their bitcoins to. Thousands of users from around the world send their bitcoins into the mixer, the mixer internally shuffles them, and then sends to each user’s destination address the same quantity of bitcoins (but not the same bitcoins) that they sent in, minus a small fee. The link between the input address and the destination exists nowhere in the blockchain, and in theory the mixing service itself destroys this information as soon as mixing is complete. However, the anonymization comes at the cost of trust. Users need to trust the mixer not to reveal the link between the input and destination, and they also need to trust the mixer not to steal the bitcoins outright. Even worse, if the mixer did steal some bitcoins, there would be no way to prove that it did.
Now, Bitcoin developers Amir Taaki and Pablo Martin have come up with a new solution to Bitcoin anonymity: a semi-decentralized, trust-free mixing system. The underlying idea of a decentralized mixer is not new, and all schemes proposed so far, including this one, work similarly. Some number of people, all wanting to mix some specific quantity of bitcoins (say 0.01 BTC), come together over some communications channel and construct a single transaction, with each person contributing 0.01 BTC as an input and receiving 0.01 BTC as an output. The order of the inputs and outputs is shuffled, so there is no information about which input corresponds to each output in the blockchain. The challenge is, however, making sure that the link between each participant’s input and output is not known to the other participants in the mix as well. One solution was presented by Oliver Coutu at the Bitcoin conference in May 2013, using secure multiparty computation to construct the transaction without anyone being able to see exactly which input or output any other person contributed. However, the underlying mathematics is complex (although not nearly so complex as Zerocoin), and so far no easily usable implementation has been created.
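To give a feel for what such a heuristic might look like, here is a deliberately simplified sketch. It is not the paper’s algorithm: the rule used (treat an output as change only if it is the single output address never seen before), the function name `guess_change_output` and the data format are all assumptions made for illustration.

```python
def guess_change_output(outputs, seen_addresses):
    """Guess which output is the change address, or return None if unsure.

    outputs: list of (address, amount) pairs for one transaction.
    seen_addresses: addresses that already appeared on the blockchain earlier.
    Both formats are hypothetical simplifications for this sketch.
    """
    fresh = [addr for addr, _ in outputs if addr not in seen_addresses]
    # Only commit to a guess when exactly one output is a brand-new address.
    return fresh[0] if len(fresh) == 1 else None

# Spending 1 BTC out of 50 BTC: the merchant's address is already known,
# while the wallet generated the 49 BTC change address just now.
seen = {"1Merchant"}
outputs = [("1Merchant", 1.0), ("1FreshChange", 49.0)]
print(guess_change_output(outputs, seen))  # 1FreshChange
```

When both outputs are new, or neither is, the sketch declines to guess, which hints at why the paper treats this as a difficult question rather than a solved one.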
The solution that Taaki and Martin implemented is much simpler. The protocol is fully described on the project’s web page, and roughly works as follows:
- N people get together and agree to mix X bitcoins, and one of them sends the values N and X and a “room ID” to a central facilitator.
- Everyone sends a message containing the room ID and their destination address to the facilitator, using an anonymizing network like Tor to make the communication.
- The facilitator sends everyone an acknowledgement once all N people sent in their destination addresses.
- Everyone anonymously sends a message containing the room ID and their input address to the facilitator.
- The facilitator waits for everyone to send X BTC to their input address, and then constructs a transaction using these inputs and sending X BTC to each of the destination addresses. The facilitator then sends the transaction for everyone to sign.
- Everyone checks that the transaction sends the right amount of bitcoins to their destination address and, if the transaction checks out, anonymously sends their signature to the facilitator.
- The facilitator broadcasts the signed transaction.
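The core of these steps, the facilitator assembling the transaction and each participant checking it before signing, can be sketched as follows. This is a toy model under stated assumptions: networking, Tor and real signatures are left out entirely, and the function names, the dictionary layout and the placeholder addresses are hypothetical rather than taken from Taaki and Martin’s code.

```python
import random

def build_mix_transaction(input_addresses, destination_addresses, amount):
    """Facilitator side: assemble an unsigned transaction with shuffled sides."""
    if len(input_addresses) != len(destination_addresses):
        raise ValueError("each participant must supply one input and one destination")
    inputs = list(input_addresses)
    outputs = list(destination_addresses)
    random.shuffle(inputs)   # ordering must not reveal who paired with whom
    random.shuffle(outputs)
    return {
        "inputs": [(addr, amount) for addr in inputs],
        "outputs": [(addr, amount) for addr in outputs],
    }

def participant_approves(tx, my_destination, amount):
    """Participant side: agree to sign only if my destination gets the agreed amount."""
    return (my_destination, amount) in tx["outputs"]

X = 1_000_000  # 0.01 BTC expressed in satoshis
ins = ["in_alice", "in_bob", "in_carol"]
outs = ["out_alice", "out_bob", "out_carol"]

tx = build_mix_transaction(ins, outs, X)
print(all(participant_approves(tx, d, X) for d in outs))  # True

# A facilitator that swaps in its own address loses one signature.
tx["outputs"][0] = ("out_thief", X)
print(all(participant_approves(tx, d, X) for d in outs))  # False
```

Because the transaction is only valid once every input is signed, a single refusal is enough to stop it from ever being broadcast.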
As mentioned above, there is nothing particularly original about the protocol; in fact, Taaki and Martin first discovered the idea from a forum thread created by Bitcoin developer Gregory Maxwell describing the concept under the name of “CoinJoin”. Rather, the magic lies in the implementation. The pair were able to use the Bitcoin toolkit SX, developed by Taaki himself, to quickly implement the transaction handling side of the application within hours – a process which would have taken many times longer had they tried to write their own code or reuse code from a Bitcoin client like Armory or Electrum.
Where the implementation outshines previous attempts at accomplishing the same thing is ease of use. Taaki and Martin specifically created a simple graphical user interface; a user need only enter their input address, facilitator URL and output address, and the system handles everything automatically. “We’ve delivered usable software, simple for grandma (money goes in, money goes out), requires no blockchain or bitcoind, easy to install and trustless,” Taaki writes. For those interested, he has also produced a video depicting the entire process from start to finish; it only takes one minute to go through all the steps. “It’s experimental software,” Taaki says, “but it’s usable right now.” Anyone interested in running the mixer can simply download the source code and read the accompanying instructions. Semi-decentralized, trustless Bitcoin anonymity has just been democratized.
Taaki and Martin’s implementation of CoinJoin has been criticized for being “centralized”, but in fact the level of centralization is trivial. The central party does not learn which input corresponds to which output (as people send their messages at different times during different phases), and does not have the opportunity to steal the bitcoins. If it tried replacing one of the destination addresses with its own, whoever got left out would notice and refuse to sign the transaction, causing the protocol to fail. The only possible failure mode simply results in the transaction not taking place. Furthermore, anyone can become a facilitator at essentially no startup cost; in fact, a completely decentralized setup might have one of the N people becoming the facilitator on the fly.
The mechanism also, perhaps unintentionally, accomplishes another objective: if widely used, it potentially defeats the utility of the “closure” concept. Closures rely on the idea that all inputs to a transaction are signed by the same person; here, the separate inputs to the mixing transaction are signed by complete strangers. Of course, the protocol as currently implemented can easily be accounted for – closure algorithms can deliberately avoid transactions with inputs of the same size. However, anonymizers themselves can fight back. Theoretically, one person can participate in the same mix multiple times, sending and receiving the bitcoins in different denominations (e.g. sending in a single input of 0.03 BTC and getting three outputs of 0.01 BTC). From the other side, ordinary wallets can try to deliberately make their transactions look like anonymizing transactions; for example, a wallet provider might make their wallet always provide change in 0.01 BTC chunks, so every transaction will naturally appear to be a mix. There will be ways of detecting many of these things, but even so the closure’s status will drop from being a surefire way of discovering someone’s secret Bitcoin stash to just another heuristic tool.
How will regulators feel about this? If this mixer reaches a high level of popularity, for example by being integrated into existing Bitcoin wallets, it will certainly complicate efforts to make Bitcoin traceable by regulating the exchanges. However, for those concerned about large corporations, dictatorships and multibillion dollar drug cartels, there is one saving grace: the more money you want to mix, the harder it will be. “While [a thief] might attempt to use a mix service to hide the source of the money,” the George Mason paper reads, “we again argue that these services do not currently have the volume to launder thousands of bitcoins.” If a billionaire tried to mix their funds with blockchain.info, the mix would become 99% them – making the mixer essentially useless for the billionaire. Furthermore, at sufficiently high volumes, transaction flows become much more distinguishable all on their own; there may be millions of people moving around 0.01 BTC, but there are only a few people with over 100,000 BTC, and the Bitcoin de-anonymization papers seen thus far have already had considerable success uncovering their stashes of bitcoins on the blockchain. In every case, the story is similar: more money, less privacy. Mixing $50 to hide your medical purchases from Target or your marijuana habit from the government, fine. $1.3 billion suddenly disappearing off the face of the earth? Bitcoin may well actually make that harder.
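The counter-measure mentioned above, having closure algorithms skip transactions whose inputs are all the same size, might look something like the following. This is a sketch under assumptions: the functions `looks_like_mix` and `linkable_input_sets` and the (address, amount) input format are invented for the example and do not come from any published analysis tool.

```python
def looks_like_mix(inputs):
    """Flag transactions with several inputs that all carry the same value.

    inputs: list of (address, amount_in_satoshis) pairs, a hypothetical format.
    """
    amounts = {amount for _, amount in inputs}
    return len(inputs) > 1 and len(amounts) == 1

def linkable_input_sets(transactions):
    """Yield input-address lists only for transactions that do not look like mixes."""
    for inputs in transactions:
        if not looks_like_mix(inputs):
            yield [addr for addr, _ in inputs]

txs = [
    [("A", 700_000), ("B", 450_000)],                        # ordinary payment
    [("C", 1_000_000), ("D", 1_000_000), ("E", 1_000_000)],  # equal-sized inputs
]
print(list(linkable_input_sets(txs)))  # [['A', 'B']]
```

A mix whose inputs were, say, 0.03, 0.01 and 0.01 BTC would slip straight past this filter, which is exactly the kind of fighting back that mixed denominations make possible.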