Why blockchain is a terrible idea for applications

Here we go again

Much like when the .com bubble burst, the cryptocurrency/blockchain market is headed towards an extinction event. There’s a reckoning to come, and even companies with good ideas will discover issues during project execution. It turns out it’s a lot easier to describe in a whitepaper your plan to replace AWS, Airbnb or Uber with your blockchain project than to actually pull it off. These are difficult markets to attack, and showing up to the fight with inferior UX, a fraction of the user base, and the one shining “advantage” of decentralization won’t be enough.

Why blockchains aren’t suitable for almost all projects

Blockchains are unsuitable for almost all projects: they’re slower, more complicated to develop applications for, and prone to security issues. In addition, decentralized applications running on a blockchain are out of your control: updates or changes require consensus.

All of this results in a minimum viable product (MVP) that’s a lot harder to build or much less secure. The upshot is that the blockchain version of anything will either take significantly more effort to build, or the team will have to cut corners that should not be cut to get to an MVP faster.

Using a blockchain, but not decentralized

Some projects address blockchain shortcomings by using databases to store data (e.g., CryptoKitties, Bee Token). So, for example, a small amount of data goes onto a blockchain, but the majority is stored in a database (stored off-chain) with references back to the blockchain.

This hybrid approach may solve some problems, but others appear. What happens if the server goes offline or you lose the database? Also, anytime databases are leveraged, the core problems of centralization return: databases can be attacked, and the data stored in them can be changed or compromised.

If 90% of data is stored off-chain in a database what have you really built and why have you built it on the blockchain?

The only thing you’ve done is tie your data to a public ledger and verify that a link from the blockchain to the database existed at a certain point. But what’s the benefit besides creating a digital notary that records when something was added and provides evidence that it hasn’t been tampered with since? The on-chain record itself tells us nothing about what the data meant at that point in time, as it’s usually just a one-way hash of data stored elsewhere.
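
The "digital notary" pattern described above can be sketched in a few lines. This is a minimal illustration, not any particular project's design: the dict standing in for the off-chain database, the list standing in for the chain, and the names `fingerprint`, `store`, and `is_untampered` are all hypothetical.

```python
import hashlib
import json

def fingerprint(record: dict) -> str:
    """Return a SHA-256 hex digest of a record serialized deterministically."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical stand-ins: a dict for the off-chain database, a list for the chain.
off_chain_db = {}
on_chain_log = []

def store(record_id: str, record: dict) -> None:
    """Keep the full record off-chain and anchor only its digest on-chain."""
    off_chain_db[record_id] = record
    on_chain_log.append({"id": record_id, "digest": fingerprint(record)})

def is_untampered(record_id: str) -> bool:
    """Check that the off-chain record still matches its anchored digest."""
    anchored = next(e for e in on_chain_log if e["id"] == record_id)
    return fingerprint(off_chain_db[record_id]) == anchored["digest"]

store("item-1", {"owner": "alice", "color": "blue"})
ok_before = is_untampered("item-1")

# Someone edits the database directly, bypassing the chain.
off_chain_db["item-1"]["owner"] = "mallory"
ok_after = is_untampered("item-1")
```

The check detects the edit, but the digest alone cannot restore the original record or say what it contained. If the database is lost, all that remains on-chain is an opaque hash.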

Similarly, many projects use IPFS with a cluster of distributed nodes to host the data rather than keep it on the blockchain. On the face of it this feels decentralized, but there’s often no thought put into the economics or incentives for keeping these nodes online. If the data is only “distributed” among a handful of participants, it’s really not a significant improvement over just using a centralized database.

Blockchain burdened by extra work

If transaction speed is critical to your application’s success then you should think databases, not blockchains.

Some blockchains claim they can perform up to 500,000 tx/sec, and while this may be fast compared to other blockchains, it’s slow compared to a database running on just a single machine (not a cluster). For example, a big SQL server is going to eat those numbers for lunch and will crush the capacity of Bitcoin, Ethereum or any other blockchain network.

Even if future blockchains do increase in speed, it is highly unlikely they will ever be faster than a database, because fundamentally they need to do extra work. Most ‘classic’ blockchains combine two distinct activities: mining and transaction processing.

Bitcoin is the best example; once you have enough transactions you start mining that block, and then need to solve the associated challenge to prove you’ve expended the requisite effort to “secure” the block. Requiring this proof of work ensures that the blockchain is proof against a minority party (in terms of compute power) derailing the consensus rules. Regardless of the blockchain, when mining is coupled with transaction processing the system can only go as fast as mining permits.
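
To make the cost of that "challenge" concrete, here is a toy proof-of-work loop. It is a simplified sketch, not Bitcoin's actual scheme: real Bitcoin hashes an 80-byte block header with double SHA-256 and adjusts difficulty dynamically, whereas the function names, the single SHA-256, and the `difficulty_bits` parameter here are illustrative assumptions.

```python
import hashlib

def meets_target(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """True if SHA-256(block_data + nonce) has at least difficulty_bits leading zero bits."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Brute-force nonces until one meets the target; this is the 'extra work'."""
    nonce = 0
    while not meets_target(block_data, nonce, difficulty_bits):
        nonce += 1
    return nonce

block = b"alice pays bob 5; bob pays carol 2"
nonce = mine(block, difficulty_bits=12)            # about 2**12 hashes on average
verified = meets_target(block, nonce, difficulty_bits=12)  # a single hash to check
```

Each extra difficulty bit doubles the expected number of hashes needed to find a nonce, while checking a found nonce always costs one hash. None of that search contributes to processing the transactions themselves.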

Other blockchain architectures elect or designate block producers, decoupling the Sybil attack countermeasures from the act of producing a block. This leads to faster block production rates, but there’s still a lot of work every consumer of that block has to do: signature validation, Merkle tree checking, verifying hashes, etc. If you skip any of these steps for efficiency’s sake you’re open to being fed false or altered information, which will corrupt your view of the distributed ledger.
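
As one example of that per-block work, the sketch below builds a Merkle root over a non-empty list of transactions and verifies an inclusion proof. It assumes single SHA-256 and duplicates the last node on odd-sized levels; real chains differ in hash function and tree layout, and the function names are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def next_level(level):
    """Hash adjacent pairs, duplicating the last node if the level is odd-sized."""
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = next_level(level)
    return level[0]

def merkle_proof(leaves, index):
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 == 1 else level
        sibling = index ^ 1
        proof.append((padded[sibling], sibling < index))
        level = next_level(level)
        index //= 2
    return proof

def verify_proof(leaf, proof, root):
    """Recompute the path to the root; every consumer of a block repeats this."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

txs = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root = merkle_root(txs)
proof = merkle_proof(txs, 4)
included = verify_proof(b"tx4", proof, root)
forged = verify_proof(b"tx4-altered", proof, root)
```

A centralized database simply trusts its own storage and skips all of this. A blockchain node that skips it has no way to tell `tx4` from `tx4-altered`.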

With 40 years of database research, compared to approximately ten for blockchains, databases will be faster for the foreseeable future.

To successfully design a blockchain-powered system that can actually be called decentralized it’s crucial to very carefully consider what you store on-chain and what you can avoid storing on-chain. At Helium we really want to let the device owners store their own information however they want and just use the blockchain as a routing and settlement layer for tracking gateway identity. Devices simply indicate their identity and we use prefix routing on their identifier to route the packet to the right place. We don’t store information about the actual device on-chain. The fastest blockchain transactions are the ones you don’t make.
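
Prefix routing on an identifier can be sketched as a longest-prefix match against a routing table. This is only an illustration of the general idea, not Helium's implementation: the table contents, the hex identifiers, the `.example` hostnames, and the `route` function are all hypothetical.

```python
from typing import Optional

# Hypothetical routing table: identifier prefix -> destination endpoint.
ROUTES = {
    "a1": "router-one.example",
    "a1b2": "router-two.example",
    "ff": "router-three.example",
}

def route(device_id: str, routes: dict = ROUTES) -> Optional[str]:
    """Return the destination registered for the longest prefix of device_id."""
    for length in range(len(device_id), 0, -1):
        destination = routes.get(device_id[:length])
        if destination is not None:
            return destination
    return None  # no owner has claimed any prefix of this identifier

specific = route("a1b2c3")   # matches both "a1" and "a1b2"; the longer prefix wins
general = route("a1ff00")    # only "a1" matches
unknown = route("0042")      # nothing matches
```

In a design like this, only the small prefix table would need to be agreed on by everyone. The packets and any per-device data never touch the chain.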

Loss of control

Fundamentally, when you put something on the blockchain you’re effectively releasing control. Are you willing to give up control to become truly decentralized? If so, you need to prepare for a number of questionable uses of the system. You don’t get to say “I’m banning that user” or “I’m deploying a new version of the software to fix a bug.” If you’re not willing, then why choose to use a blockchain?

If you are truly committed to a decentralized system, then depending on the type of governance, taking action typically requires buy-in from the majority of participants. This can take time and be cumbersome. As an example, we’ve seen contention over the Bitcoin 1 MB block size. The issue turned into a huge, very polarized debate and ended with Bitcoin Cash forking off. Technical decisions get weaponized as political ones.

In Bitcoin, miners vote by putting a small amount of information in an unused part of a block. Lots of chains recognize this as an issue and look to improve it with better on-chain governance. However, now you have to solve a distributed voting problem with weak identities. If you choose to let only the miners vote then you disenfranchise the users and the network becomes run for and by miners.

Finally, another governance model is “whatever the development team says”. Indeed, this is often the early model used, somewhat by necessity (blockchains are software; they don’t launch bug-free). However, if you don’t have a plan or the desire to move beyond this, eventually you’ll hit a point where you’re disenfranchising your users and they’ll push back.

When users don’t feel their needs are being met, there are several ways they can push back: they can refuse to update their software, they can fork the codebase or they can jump ship to something else that meets their needs better. None of these are desirable from the project creator’s standpoint so you should try to forestall them by making sure the users feel like their concerns and needs are being considered.

Bitcoin has over 70 chain forks and probably several hundred codebase forks. This reflects not only Bitcoin’s popularity but also the fact that its governance is fairly miner- and developer-centric. An even more extreme example is Monero. Monero is a fork of Bytecoin, but it has completely obliterated Bytecoin in terms of market share because the community fork focused more on features that users wanted to see, and because the community did not trust the original developers of Bytecoin.

At Helium, this loss of control was actually something we wanted. Nobody wants to trust their information to a third party or have hardware they paid for stop working if a company goes out of business or goes on an “incredible journey”. We very intentionally designed a system that flipped the normal IoT ownership rules on their head to put the customer back in charge of their information and devices while still building a seamless global network. To that end, we needed a network that enabled devices that don’t trust each other to still collaborate safely. This is the killer feature of blockchains: they allow consensus to be built among parties that don’t trust each other, and they let systems stay resilient if any members of the community depart. We want to enable other companies and users to build their own gateways, implement their own end-nodes, their own wallet, their own consensus implementation, etc. We think the value of an “always on, everywhere” IoT network far outweighs the advantages of a single-vendor, proprietary network. After enabling the creation of that network, a profitable business model can be derived by providing services around that open ecosystem.

Increased complexity

What makes building blockchain applications complicated is you’re forced to use a less expressive way to write business logic for applications that need to run in an untrusted context.

Fuzzy business logic

To write blockchain applications you often need to work with something like a smart contract, and while the smart contract language will give you certain primitives, they may not be ones you’re familiar with. Solidity has all kinds of gotchas, trips, and traps. Mistype one character and your smart contract is vulnerable. There are hundreds of millions of dollars’ worth of ETH locked up in smart contracts because of various programming errors.

In general, it’s not clear the industry is very good at writing smart contracts, and the support and tooling for writing them well is still a very fast moving area. With traditional legal contracts, contract disputes have ways to be resolved. A smart contract, almost by definition, cannot be wrong because “code is law”. However, if there’s an error in a smart contract, unlike with a regular contract, you can’t take a dispute to a court to decide intent vs implementation. If we can’t write regular software without bugs, we should be cautious about delegating final decisions to immutable code that we can’t fix or revert if an error is found later.

Implementing your business logic with these constraints can be very tricky. Smart contracts also usually cost tokens or “gas” to run and to store data on the chain, so you have to be extremely careful about how often you execute your code and how much storage it uses. At one point it was estimated that storing 1 GB on the Ethereum blockchain would cost about 1 million dollars.
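
The shape of such an estimate is simple arithmetic. The sketch below assumes the roughly 20,000 gas that Ethereum has long charged to write a fresh 32-byte storage slot; the gas price and ETH price passed in are purely illustrative inputs, not the figures behind the estimate quoted above.

```python
WORD_BYTES = 32            # size of one Ethereum storage slot
GAS_PER_NEW_WORD = 20_000  # assumed cost of writing a previously empty slot
WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18

def storage_cost_usd(num_bytes: int, gas_price_gwei: int, eth_price_usd: float) -> float:
    """Estimate the cost of writing num_bytes into contract storage."""
    words = -(-num_bytes // WORD_BYTES)   # ceiling division
    gas = words * GAS_PER_NEW_WORD
    cost_eth = gas * gas_price_gwei * WEI_PER_GWEI / WEI_PER_ETH
    return cost_eth * eth_price_usd

# Illustrative inputs only: 1 GB (10**9 bytes), 1 gwei gas price, $1,000 per ETH.
one_gb = storage_cost_usd(10**9, gas_price_gwei=1, eth_price_usd=1000.0)
```

With those inputs the calculation comes to 31,250,000 slots, 625 billion gas, 625 ETH, and about $625,000. The result scales linearly with both gas price and exchange rate, so the real figure swings widely over time, but it stays many orders of magnitude above the cost of a gigabyte in a conventional database.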

Even if you’re able to implement your business logic compactly and safely in a smart contract, now you have the problem of upgrading it. Often the solution is to have a shim contract that calls into the latest version of your “real” code. In fact, whole smart contract “operating systems” are emerging to give you access to smart contracts implementing common tasks. While this sounds great, we’ve seen several cases where a failure in a fundamental contract like this has been catastrophic.
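
The shim pattern can be modeled outside any contract language. The Python sketch below is an analogy rather than deployable contract code: the `Shim` class, the two counter versions, and the owner check are hypothetical, and a real on-chain proxy also has to deal with storage layout and delegated calls.

```python
class CounterV1:
    """First version of the 'real' logic; it operates on state owned by the shim."""
    @staticmethod
    def increment(state: dict) -> int:
        state["count"] = state.get("count", 0) + 1
        return state["count"]

class CounterV2:
    """Upgraded logic: same interface, different behavior."""
    @staticmethod
    def increment(state: dict) -> int:
        state["count"] = state.get("count", 0) + 2
        return state["count"]

class Shim:
    """Stable entry point that forwards calls to whichever version is current."""
    def __init__(self, owner: str, implementation):
        self.owner = owner
        self.implementation = implementation
        self.state = {}   # state lives in the shim so it survives upgrades

    def upgrade(self, caller: str, new_implementation) -> None:
        if caller != self.owner:
            raise PermissionError("only the owner may upgrade")
        self.implementation = new_implementation

    def call(self, method: str, *args):
        return getattr(self.implementation, method)(self.state, *args)

shim = Shim(owner="alice", implementation=CounterV1)
first = shim.call("increment")       # runs V1 logic
shim.upgrade("alice", CounterV2)
second = shim.call("increment")      # same address, V2 logic, same state
```

The trade-off is visible in `upgrade`: whoever holds the owner role can swap the logic under every user, and a bug in the shim itself affects every version behind it.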

At Helium we’ve chosen to avoid implementing smart contracts for now. Happily, our blockchain doesn’t require general purpose compute to be useful. We plan to implement some of the more useful smart contract patterns (Hash Time Locked Commitments, Multisig transactions, etc). If smart contract languages, runtimes, and tooling improve we’ll definitely revisit that decision, but for now we feel that it’s safer to provide a set of well-tested, predefined transaction types that deliver the features we anticipate needing.

Running without trust

One of the biggest challenges of running in an untrusted context is generating a random number in a smart contract. It’s actually very hard, because if that random number is at all important to the execution of the smart contract, the miner is going to see it before anybody else. With that prior knowledge, they can take unfair advantage of the system, including by front-running the transaction.

How do you build these systems when effectively all your business logic is out there for everyone to see? Imagine building an application and anyone can look at the source code and state of the system at any time. Let’s say you’re running a public lottery on the blockchain; everybody pays in, whoever draws the random number gets paid. If you run the lottery server, and you’re honest (or regulated by the state gaming commission) it’s easy. As soon as everyone is running the lottery server how do you make that work without being exploited or front-run?
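
A toy model shows why the naive approach fails. Suppose the lottery derives its "random" winner from public block data; the formula, the participant names, and the `grind_for_win` helper below are hypothetical, and the model ignores what it costs a real block producer to discard candidate blocks.

```python
import hashlib

def pick_winner(block_hash: bytes, participants: list) -> str:
    """Naive on-chain lottery: the winner index is derived from public block data."""
    seed = int.from_bytes(hashlib.sha256(block_hash).digest(), "big")
    return participants[seed % len(participants)]

def grind_for_win(participants: list, me: str, max_tries: int = 10_000):
    """A block producer tries candidate blocks until the public formula picks them."""
    for attempt in range(max_tries):
        candidate = hashlib.sha256(b"candidate-block" + attempt.to_bytes(4, "big")).digest()
        if pick_winner(candidate, participants) == me:
            return candidate
    return None

players = ["alice", "bob", "carol", "miner"]
rigged_block = grind_for_win(players, "miner")
winner = pick_winner(rigged_block, players) if rigged_block else None
```

Because the formula and its inputs are visible to everyone, and the block producer chooses one of those inputs, the draw is only as random as the producer wants it to be.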

If you’re running code in a smart contract, which is effectively where you put your business logic for a decentralized application (dApp), you have to figure out how to keep sensitive inputs secret from the people running the smart contract as part of the mining process so they don’t have privileged information. You need to consider what happens if a miner is looking for these kinds of transactions or smart contract executions. There are already examples of bad actors with privileged positions using knowledge of incoming transactions to take advantage of the system.

At Helium we are very interested in preventing miners from censoring transactions they don’t like or front-running them when it’s profitable. To that end, the consensus protocol we’ve chosen to implement, HoneyBadgerBFT, is a censorship-resistant consensus protocol. It makes use of threshold encryption to ensure that miners have a harder time ignoring valid transactions they don’t want to include (because the transaction fee is too low, or because it might dilute their power or position), and we’re investigating ways to strengthen those properties further.

Security

Once you put information on a permissionless blockchain, everyone can see it. Some users may have some special local knowledge, e.g., a decryption key, that allows them to access private information in a public blockchain but, in general, what one user of the blockchain sees, everyone sees. This can make some things that are simple in the centralized model much harder when done in a decentralized way on a public, permissionless blockchain. If a miner can use transaction information to predict the outcome of a smart contract, they can front-run that transaction by delaying the user’s transaction and putting a transaction they crafted to exploit their advance knowledge ahead of it in the blockchain. Imagine a miner seeing a large buy/sell order transaction come in on a decentralized exchange where they can use that advance knowledge to their advantage by buying/dumping right before the big trade is processed.
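
The exchange scenario can be simulated with a toy market. The sketch assumes a fee-free constant-product pool, which is one common decentralized exchange design but not the only one; the `Pool` class, the reserves, and the trade sizes are made up for illustration.

```python
class Pool:
    """Toy fee-free pool where token reserve * cash reserve stays constant."""
    def __init__(self, token_reserve: float, cash_reserve: float):
        self.token = token_reserve
        self.cash = cash_reserve

    def buy(self, cash_in: float) -> float:
        """Pay cash into the pool and receive tokens; the price rises as you buy."""
        k = self.token * self.cash
        self.cash += cash_in
        tokens_out = self.token - k / self.cash
        self.token -= tokens_out
        return tokens_out

    def sell(self, tokens_in: float) -> float:
        """Return tokens to the pool and receive cash."""
        k = self.token * self.cash
        self.token += tokens_in
        cash_out = self.cash - k / self.token
        self.cash -= cash_out
        return cash_out

# Honest ordering: the user's large buy is processed on its own.
fair = Pool(1000.0, 1000.0)
victim_fair = fair.buy(1000.0)

# Miner ordering: buy just before the user, then sell just after.
rigged = Pool(1000.0, 1000.0)
miner_tokens = rigged.buy(100.0)
victim_rigged = rigged.buy(1000.0)
miner_profit = rigged.sell(miner_tokens) - 100.0
```

In this model the user ends up with fewer tokens for the same payment and the miner walks away with a profit, purely by controlling transaction order. A centralized exchange operator could do the same thing, but there it would at least be a matter for regulators and courts.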

In general, the stakes for putting information into a blockchain are much higher; once you put it in, you can’t take it back out. This means that even if you think what you’re doing today is secure, you also have to think about what that information could mean in the future if a private key is lost or your security scheme is broken. The immutability of the ledger can be a double-edged sword, and you need to carefully consider what you want to store and how you want to store it because there’s no undo button.

And still we chose a blockchain

At Helium we knew a blockchain was a critical component for building a wireless network that was truly decentralized. To learn why we chose to build our own blockchain despite knowing the formidable challenges, read my blog post here.