Bitcoin Magazine


Cryptographic Code Obfuscation: Decentralized Autonomous Organizations Are About to Take a Huge Leap Forward

Disclosure: the author of this article is involved with the Ethereum project.

There have been a number of very interesting developments in cryptography in the past few years. Satoshi’s blockchain notwithstanding, perhaps the first major breakthrough after blinding and zero-knowledge proofs is fully homomorphic encryption, a technology which allows you to upload your data onto a server in an encrypted form so that the server can then perform calculations on it and send you back the results, all without having any idea what the data is. In 2013, we saw the beginnings of succinct computational integrity and privacy (SCIP), a toolkit pioneered by Eli Ben-Sasson in Israel that lets you cryptographically prove that you carried out some computation and got a certain output. On the more mundane side, we now have sponge functions, an innovation that substantially simplifies the previous mess of hash functions, stream ciphers and pseudorandom number generators into a beautiful, single construction. Most recently of all, however, there has been another major development in the cryptographic scene, and one whose applications are potentially very far-reaching both in the cryptocurrency space and for software as a whole: obfuscation.

The idea behind obfuscation is an old one, and cryptographers have been trying to crack the problem for years. The problem behind obfuscation is this: is it possible to somehow encrypt a program to produce another program that does the same thing, but which is completely opaque so there is no way to understand what is going on inside? The most obvious use case is proprietary software: if you have a program that incorporates advanced algorithms, and want to let users run the program on specific inputs without being able to reverse-engineer the algorithm, the only way to do such a thing is to obfuscate the code. Proprietary software is for obvious reasons unpopular among the tech community, so the idea has not seen a lot of enthusiasm, a problem compounded by the fact that each and every time a company tried to put an obfuscation scheme into practice it quickly got broken. In 2001, researchers put what might perhaps seem to be a final nail in the coffin: a mathematical proof, using arguments vaguely similar to those used to show the undecidability of the halting problem, that a general purpose obfuscator that converts any program into a “black box” is impossible.

At the same time, however, the cryptography community began to follow a different path. Understanding that the “black box” ideal of perfect obfuscation will never be achieved, researchers set out to instead aim for a weaker target: indistinguishability obfuscation. The definition of an indistinguishability obfuscator is this: given two programs A and B that compute the same function, if an effective indistinguishability obfuscator O computes two new programs X=O(A) and Y=O(B), then given X and Y there is no (computationally feasible) way to determine which of X and Y came from A and which came from B. In theory, this is the best that anyone can do; if there is a better obfuscator, P, then if you put A and P(A) through the indistinguishability obfuscator O, there would be no way to tell O(A) and O(P(A)) apart, meaning that the extra step of adding P could not hide any information about the inner workings of the program that O does not already hide. Creating such an obfuscator is the problem which many cryptographers have occupied themselves with for years. And in 2013, UCLA cryptographer Amit Sahai, homomorphic encryption pioneer Craig Gentry and several other researchers figured out how to do it.
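To make the definition concrete, here is a toy sketch in Python. It is not the 2013 construction: it "obfuscates" a program over a tiny input domain by replacing it with its full lookup table, which is a canonical form and therefore trivially indistinguishable, but whose size grows exponentially with the input length. The names `program_a`, `program_b`, `toy_obfuscate` and `run` are made up for this illustration.

```python
def program_a(x: int) -> int:
    """Doubles x modulo 8 using multiplication."""
    return (x * 2) % 8


def program_b(x: int) -> int:
    """Computes the same function with addition and a bit mask."""
    return (x + x) & 7


DOMAIN = range(8)  # assumption: inputs are 3-bit integers


def toy_obfuscate(program) -> tuple:
    """Reduce a program to its lookup table over DOMAIN (a canonical form)."""
    return tuple(program(x) for x in DOMAIN)


def run(obfuscated: tuple, x: int) -> int:
    """Evaluate an 'obfuscated' program on input x."""
    return obfuscated[x]


X = toy_obfuscate(program_a)
Y = toy_obfuscate(program_b)
```

Because A and B compute the same function, X and Y come out identical, so nothing about how either program was written survives. The hard part, and the subject of the research described above, is getting the same guarantee without the exponential blowup.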

Does the indistinguishability obfuscator actually hide private data inside the program? To see what the answer is, consider the following. Suppose your secret password is bobalot_13048, and the SHA256 of the password starts with 00b9bbe6345de82f. Now, construct two programs. A just outputs 00b9bbe6345de82f, whereas B actually stores bobalot_13048 inside, and when you run it, it computes SHA256(bobalot_13048) and returns the first 16 hex digits of the output. Both programs compute the same function, so according to the indistinguishability property, O(A) and O(B) are indistinguishable. If there were some way to extract bobalot_13048 from O(B), it would therefore be possible to extract bobalot_13048 from O(A), which essentially implies that you can break SHA256 (or by extension any hash function for that matter). By standard assumptions, this is impossible, so the obfuscator must also make it impossible to uncover bobalot_13048 from O(B). Thus, we can be pretty sure that Sahai’s obfuscator does actually obfuscate.
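The two programs in this argument can be sketched in Python as follows. This only shows A and B before obfuscation; the factory names `make_program_a` and `make_program_b` are illustrative, and A's constant is computed here rather than copied by hand.

```python
import hashlib


def make_program_a(constant: str):
    """A: knows only the 16-hex-digit constant and returns it."""
    def program_a() -> str:
        return constant
    return program_a


def make_program_b(password: bytes):
    """B: stores the password and recomputes the hash prefix on every run."""
    def program_b() -> str:
        return hashlib.sha256(password).hexdigest()[:16]
    return program_b


password = b"bobalot_13048"  # the example password from the text
program_b = make_program_b(password)
# A's constant is derived once here; A itself never sees the password.
program_a = make_program_a(hashlib.sha256(password).hexdigest()[:16])
```

From the outside the two are the same function: no input, one fixed 16-character output. Only B has the password in its source, which is exactly what the obfuscator has to hide.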

So What’s The Point?

In many ways, code obfuscation is one of the holy grails of cryptography. To understand why, consider just how easily nearly every other primitive can be implemented with it. Want public key encryption? Take any symmetric-key encryption scheme, and construct an encryptor with your secret key built in. Obfuscate it, and publish that on the web. You now have a public key: anyone can encrypt messages to you, but only you can decrypt them. Want a signature scheme? Public key encryption provides that for you as an easy corollary. Want fully homomorphic encryption? Construct a program which takes two encrypted numbers as input, decrypts them, adds them, and encrypts the sum, and obfuscate the program. Do the same for multiplication, send both programs to the server, and the server will swap your adder and multiplier into its code and perform your computation.
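A rough Python sketch of these constructions follows, with loud caveats. The `obfuscate` function is a hypothetical placeholder that returns its argument unchanged and hides nothing; the cipher is a toy (HMAC-SHA256 used as a keystream, no authentication) chosen only so the example runs with the standard library. The published program here is the encrypting side, so that anyone can encrypt while only the key holder can decrypt.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Symmetric encryption: fresh random nonce, then XOR with the keystream."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Inverse of encrypt: split off the nonce and XOR again."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def obfuscate(program):
    """Hypothetical stand-in for a real obfuscator; returns the program as-is."""
    return program


def make_public_key(secret_key: bytes):
    """Public key = obfuscated encryptor with the secret key baked in."""
    def encryptor(plaintext: bytes) -> bytes:
        return encrypt(secret_key, plaintext)
    return obfuscate(encryptor)


def make_adder(secret_key: bytes):
    """Homomorphic-style adder: decrypt two numbers, add, re-encrypt the sum."""
    def adder(ct_x: bytes, ct_y: bytes) -> bytes:
        x = int.from_bytes(decrypt(secret_key, ct_x), "big")
        y = int.from_bytes(decrypt(secret_key, ct_y), "big")
        # Assumption: the sum fits in 8 bytes.
        return encrypt(secret_key, (x + y).to_bytes(8, "big"))
    return obfuscate(adder)


secret_key = os.urandom(32)
public_key = make_public_key(secret_key)
adder = make_adder(secret_key)
```

For example, `adder(public_key((20).to_bytes(8, "big")), public_key((22).to_bytes(8, "big")))` yields a ciphertext that decrypts, under `secret_key`, to the 8-byte encoding of 42. With a real obfuscator in place of the placeholder, the server running `adder` would learn neither the key nor the numbers.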

However, aside from that, obfuscation is powerful in another key way, and one which has profound consequences particularly in the field of cryptocurrencies and decentralized autonomous organizations: publicly running contracts can now contain private data. On top of second-generation blockchains like Ethereum, it will be possible to run so-called “autonomous agents” (or, when the agents primarily serve as a voting system between human actors, “decentralized autonomous organizations”) whose code gets executed entirely on the blockchain, and which have the power to maintain a currency balance and send transactions inside the Ethereum system. For example, one might have a contract for a non-profit organization that contains a currency balance, with a rule that the funds can be withdrawn or spent if 67% of the organization’s members agree on the amount and destination to send.

Unlike Bitcoin’s vaguely similar multisig functionality, the rules can be extremely flexible, for example allowing a maximum of 1% per day to be withdrawn with only 33% consent, or making the organization a for-profit company whose shares are tradable and whose shareholders automatically receive dividends. Up until now it has been thought that such contracts are fundamentally limited – they can only have an effect inside the Ethereum network, and perhaps other systems which deliberately set themselves up to listen to the Ethereum network. With obfuscation, however, there are new possibilities.
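As a rough illustration of such rules, here is a small Python model of an organization's treasury. It is not Ethereum contract code, and the class name `OrgContract`, its methods and the way a "day" is tracked are all assumptions made for the sketch. It implements the two example rules from the text: any withdrawal with 67% consent, or up to 1% of the balance per day with 33% consent.

```python
class OrgContract:
    """Toy model of a member-controlled treasury (illustration only)."""

    def __init__(self, members, balance: int):
        self.members = set(members)
        self.balance = balance
        self.day_start_balance = balance  # basis for the 1% daily cap
        self.spent_today = 0              # spent so far under the 33% rule

    def new_day(self) -> None:
        """Reset the daily allowance; a real contract would use block time."""
        self.day_start_balance = self.balance
        self.spent_today = 0

    def _consent(self, approvals, percent: int) -> bool:
        """True if at least `percent` of members approved (integer math)."""
        valid = self.members & set(approvals)  # ignore non-member votes
        return len(valid) * 100 >= percent * len(self.members)

    def withdraw(self, amount: int, approvals) -> bool:
        """Apply the withdrawal if one of the two rules allows it."""
        if amount <= 0 or amount > self.balance:
            return False
        daily_cap = self.day_start_balance // 100
        within_cap = self.spent_today + amount <= daily_cap
        if self._consent(approvals, 67):
            pass  # supermajority: any amount up to the balance
        elif within_cap and self._consent(approvals, 33):
            self.spent_today += amount
        else:
            return False
        self.balance -= amount
        return True
```

With six members and a balance of 10000, two approvals are enough to withdraw 100 (the daily 1%), but a withdrawal of 5000 needs five approvals, since four out of six falls just short of 67%.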

Consider the simplest case: an obfuscated Ethereum contract can contain a private key to an address inside the Bitcoin network, and use that private key to sign Bitcoin transactions when the contract’s conditions are met. Thus, as long as the Ethereum blockchain exists, one can effectively use Ethereum as a sort of controller for money that exists inside of Bitcoin. From there, however, things only get more interesting. Suppose now that you want a decentralized organization to have control of a bank account. With an obfuscated contract, you can have the contract hold the login details to the website of a bank account, and have the contract carry out an entire HTTPS session with the bank, logging in and then authorizing certain transfers. You would need some user to act as an intermediary sending packets between the bank and the contract, but this would be a completely trust-free role, like an internet service provider, and anyone could trivially do it and even receive a reward for the task. Autonomous agents can now also have social networking accounts, accounts to virtual private servers to carry out more heavy-duty computations than what can be done on a blockchain, and pretty much anything that a normal human or proprietary server can.
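The simplest case can be sketched in Python as a program with a key baked in that signs only when its condition holds. Everything here is a stand-in: `obfuscate` is a hypothetical placeholder that hides nothing, HMAC-SHA256 takes the place of a real Bitcoin (ECDSA) signature, and the "condition" is reduced to a simple approval count.

```python
import hashlib
import hmac


def obfuscate(program):
    """Hypothetical stand-in for a real obfuscator; returns the program as-is."""
    return program


def make_signing_contract(private_key: bytes, required_approvals: int):
    """Build a contract that signs a transaction only if enough approvals exist."""
    def contract(transaction: bytes, approvals: int):
        if approvals < required_approvals:
            return None  # conditions not met: refuse to sign
        # Stand-in for an ECDSA signature over the transaction.
        return hmac.new(private_key, transaction, hashlib.sha256).hexdigest()
    return obfuscate(contract)


contract = make_signing_contract(b"example-private-key", required_approvals=2)
signature = contract(b"send 1 BTC to some address", approvals=2)
refused = contract(b"send 1 BTC to some address", approvals=1)
```

In this sketch `signature` is a hex string and `refused` is `None`. The security of the real scheme rests entirely on the obfuscator preventing anyone who holds the published program from reading `private_key` out of it.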

Looking Forward

Thus, we can see that in the next few years decentralized autonomous organizations are potentially going to become much more powerful than they are today. But what are the consequences going to be? In the developed world, the hope is that there will be a massive reduction in the cost of setting up a new business, organization or partnership, and a tool for creating organizations that are much more difficult to corrupt. Much of the time, organizations are bound by rules which are really little more than gentlemen’s agreements in practice, and once some of the organization’s members gain a certain measure of power they gain the ability to twist every interpretation in their favor.

Up until now, the only partial solution was codifying certain rules into contracts and laws, a solution which has its strengths, but which also has its weaknesses, as laws are numerous and very complicated to navigate without the help of an (often very expensive) professional. With DAOs, there is now also another alternative: making an organization whose organizational bylaws are 100% crystal clear, embedded in mathematical code. Of course, there are many things with definitions that are simply too fuzzy to be mathematically defined; in those cases, we will still need some arbitrators, but their role will be reduced to a limited commodity-like function circumscribed by the contract, rather than having potentially full control over everything.

In the developing world, however, things will be much more drastic. The developed world has access to a legal system that is at times semi-corrupt, but whose main problems are otherwise simply that it’s too biased toward lawyers and too outdated, bureaucratic and inefficient. The developing world, on the other hand, is plagued by legal systems that are fully corrupt at best, and actively conspiring to pillage their subjects at worst. There, nearly all businesses are gentlemen’s agreements, and opportunities for people to betray each other exist at every step. The mathematically encoded organizational bylaws that DAOs can have are not just an alternative; they may potentially be the first legal system that people have that is actually there to help them. Arbitrators can build up their reputations online, as can organizations themselves. Ultimately, perhaps on-blockchain voting, like that being pioneered by BitCongress, may even form a basis for new experimental governments. If Africa can leapfrog straight from word of mouth communications to mobile phones, why not go from tribal legal systems with the interference of local governments straight to DAOs?

Many will of course be concerned that having uncontrollable entities moving money around is dangerous, as there are considerable possibilities for criminal activity with these kinds of powers. To that, however, one can make two simple rebuttals. First, although these decentralized autonomous organizations will be impossible to shut down, they will certainly be very easy to monitor and track every step of the way. It will be possible to detect when one of these entities makes a transaction, it will be easy to see what its balance and relationships are, and it will be possible to glean a lot of information about its organizational structure if voting is done on the blockchain. Much like Bitcoin, DAOs are likely far too transparent to be practical for much of the underworld; as FinCEN director Jennifer Shasky Calvery has recently said, “cash is probably still the best medium for laundering money”. Second, ultimately DAOs cannot do anything normal organizations cannot do; all they are is a set of voting rules for a group of humans or other human-controlled agents to manage ownership of digital assets. Even if a DAO cannot be shut down, its members certainly can be, just as if they were running a plain old normal organization offline.

Whatever the dominant applications of this new technology turn out to be, one thing is looking more and more certain: cryptography and distributed consensus are about to make the world a whole lot more interesting.

BTC: 1FxkfJQLJTXpW6QmxGT6oF43ZH959ns8Cq

LTC: LaBhvWiAP7msku6w8QSQ5G7omVWMF3uxJC

By Vitalik Buterin

Vitalik Buterin is a co-founder of Bitcoin Magazine who has been involved in the Bitcoin community since 2011, and has contributed to Bitcoin both as a writer and the developer of a fork of bitcoinjs-lib, pybitcointools and multisig.info, as well as one of the developers behind Egora. Now, Vitalik's primary job is as the main developer of Ethereum, a project which intends to create a next-generation smart contract and decentralized application platform that allows people to create any kind of decentralized application on top of a blockchain that can be imagined.


  • Kay0r

    Wonderful insight of this mind-boggling, milestone new technology
    You’re the man!

  • behindtext

    Hello Vitalik!

    This is an interesting article but there is a basic problem that I cannot get past:

    You cite the simplest example of a conditional contract that is fundamental to the operation of a DAO: “Consider the simplest case: an obfuscated Ethereum contract can contain a private key to an address inside the Bitcoin network, and use that private key to sign Bitcoin transactions when the contract’s conditions are met.”. The security implications of embedding a private key and having it make a signature in a public program, i.e. the contract, are rather severe: it could lead to leakage of information that could allow an attacker to infer that private key and take over a given address.

    How does ETH prevent this leakage of information about private keys in contracts? When executing a public contract, an attacker has full visibility into all aspects of the contract execution, e.g. they can watch the memory as it executes. I would be very interested to see an example of how ETH handles this along with some information about contract size and execution time needed to provide sufficient protection against such an attack.

    Regards,
    Jake

    • Vitalik Buterin

      That’s the exact problem that this obfuscation breakthrough has pushed us a long way toward solving.

    • gubatron

      that’s the beauty of the breakthrough in obfuscation technology.
      in the past obfuscation was like a bump in the road, it was only a matter of time until you’d have a readable version of the obfuscated code. Now it’s (supposedly) pretty much impossible, so you can put all sorts of sensitive data in there safely, nobody should be able to figure out what’s in there.

  • http://www.gwern.net/ gwern

    Will obfuscation techniques ever be practical on Ethereum? IIRC, the current obfuscation constructions involve a big blowup in code size and presumably also entail large runtime costs, which is going to be painful on a system which charges programs by the step.

  • Justin

    I for one welcome our new decentralised autonomous overlords.

  • Eric

    What if the function you wanted to compute actually required “bobalot_13048”? For example, what if it returned Sha256(“bobalot_13048” + input)?

    Certainly, all other programs that compute the same thing would contain some form of the string “bobalot_13048”. And therefore the secret phrase may well be retrievable from the obfuscated program in a computationally efficient way.

    So, in principle, we’re not really any closer to secretly storing a private key in a public program in any useful way.

    But what it does seem to do is “best possible obfuscation”. Basically, take A that computes something. Take B that is the best possible obfuscation of A and computes exactly the same thing. Now P(A) and P(B) are indistinguishable (through feasible computation). Which means, you can’t really tell anything about P(A) that you couldn’t tell about P(B) and presumably, you can’t tell anything about P(B) that you couldn’t tell about B. Therefore you can’t tell anything about P(A) that you couldn’t tell about the best possible obfuscation of A.

    So it’s not the holy grail, but it looks like it is the next best thing.

  • oillio

    If this were to actually work, why would you want to run it on Ethereum? If you have an obfuscated program that only signs a transaction when the input data matches certain criteria, what advantage does running it on Ethereum vs running it on a single computer provide?
    An obfuscated program like the one you describe is necessarily a pure function. For any given set of inputs, it will produce a single given output. I don’t see any way you can prove to such a program that it is actually running on the real Ethereum blockchain and not on an attacker’s private fork?

    • https://shiresilver.com/ Ron Helwig

      Imagine you want to have a website, say something like Amnesty International or the Silk Road or the Pirate Bay, that you want to prevent governments from shutting down or otherwise interfering with. You want it to be able to scale up as needed. Currently the best you can get is to use a “cloud server” in a data center in a country you hope will defend you against aggressor nations. But as the case of the Silk Road shows, that is not enough.

      Technology like this will allow you to spread out the processing in a manner similar to how bit-torrent spreads out files, allowing you to have a web server that isn’t hosted on a machine that can be shut down and will scale up naturally as the number of site visitors increases.

      Not only that but they should also be more protected against hackers, since there will be no server to invade, no idiot users installing trojans, no lazy sysadmins forgetting to upgrade security packages, etc.

  • Lani

    “The developing world, on the other hand, is plagues by legal systems that are fully corrupt at best, and actively conspiring to pillage their subjects at worst.” I think that should be “plagued” instead of “plagues”

  • liberal

    The comparison between ‘third world’ legal systems and ‘first world’ is completely clueless.

  • loganspappy

    There’s something more important he does not mention: the algorithms (contracts) on the blockchain, not just the data or identities are encrypted even as they are public agreed-upon, but the algorithms themselves that the blockchain servers are running and verifying can be completely hidden from public view even as the results are enforced on the blockchain.