Merkle Trees

The adventitious prop roots of an Indian Banyan Tree.

Merkle trees are a fundamental part of blockchain technology. A merkle tree is a structure that allows for efficient and secure verification of content in a large body of data. This structure helps verify the consistency and content of the data. Merkle trees are used by both Bitcoin and Ethereum.

How do Merkle trees work?

A Merkle tree summarizes all the transactions in a block by producing a digital fingerprint of the entire set of transactions, thereby enabling a user to verify whether or not a transaction is included in a block.

Merkle trees are created by repeatedly hashing pairs of nodes until there is only one hash left (this hash is called the Root Hash, or the Merkle Root). They are constructed from the bottom up, from hashes of individual transactions (known as Transaction IDs).

Each leaf node is a hash of transactional data, and each non-leaf node is a hash of its previous hashes. Merkle trees are binary and therefore require an even number of leaf nodes. If the number of transactions is odd, the last hash will be duplicated once to create an even number of leaf nodes.

The Merkle Tree of transactions A, B, C & D.

Let’s look at an example of four transactions in a block: A, B, C, and D. Each of these is hashed, and the hash stored in each leaf node, resulting in Hash A, B, C, and D. Consecutive pairs of leaf nodes are then summarized in a parent node by hashing Hash A and Hash B, resulting in Hash AB, and separately hashing Hash C and Hash D, resulting in Hash CD. The two hashes (Hash AB and Hash CD) are then hashed again to produce the Root Hash (the Merkle Root).

This process can be conducted on larger data sets, too: consecutive blocks can be hashed until there is only one node at the top. Hashing is usually conducted using the SHA-2 cryptographic hash function, though other functions can also be used.

The Merkle Root summarizes all of the data in the related transactions, and is stored in the block header. It maintains the integrity of the data. If a single detail in any of the transactions or the order of the transactions changes, so does the Merkle Root. Using a Merkle tree allows for a quick and simple test of whether a specific transaction is included in the set or not.

The entire dataset doesn’t need to be downloaded to verify the integrity of Transaction 5.

A Merkle tree differs from a hash-list in that with a Merkle tree, one branch can be downloaded at a time and the integrity of each branch can be immediately verified, even if the rest of the tree is not yet available. This is advantageous because files can be split up into very small data blocks, such that only small blocks need to be downloaded again if the original version is damaged.

Uses

Using a Merkle tree can significantly reduce the amount of data that a trusted authority has to maintain for verification purposes. It separates the validation of the data from the data itself. A Merkle tree can reside locally, or on a distributed system.

Merkle trees have three major benefits:

1. They provide a means to prove the integrity and validity of data

2. They require little memory or disk space as the proofs are computationally easy and fast

3. Their proofs and management only require tiny amounts of information to be transmitted across networks

The ability to prove that a log is complete and consistent is essential to blockchain technology and the general ledger concept. Merkle trees help verify that later versions of a log include everything from an earlier version and that all data is recorded and presented in chronological order. Proving that a log is consistent requires showing that no previous records have been added, altered or tampered with, and that the log has never been branched or forked.

Merkle trees benefit miners and users on the blockchain. A miner can calculate hashes progressively, as the miner receives transactions from peers. A user can verify parts of blocks individually and can check individual transactions using hashes of other branches of the tree.

Simplified Payment Verification (SPV)

Simplified Payment Verification (SPV) is a method of verifying if particular transactions are included in a block without downloading the entire block. Merkle trees are used extensively by SPV nodes.

SPV nodes do not have data from all transactions in a block. They only download block headers. Merkle trees enable SPV nodes on the blockchain to check if miners have verified the transactions in a block without downloading all the transactions in a block. This method is currently used by some lightweight Bitcoin clients.

Ethereum

Ethereum uses three different Merkle Roots in each block:

1. The first root is of the transactions in the block

2. The second root represents the state

3. The third root is for transaction receipts

Ethereum uses a special type of hash tree called the ‘Merkle Patricia Tree’ (read more here:

https://github.com/ethereum/wiki/wiki/Patricia-Tree)

An Indispensable Tool on the Blockchain

Merkle trees are powerful and indispensable tools for miners and users on the blockchain. They are extremely robust and are at the heart of several peer-to-peer networks such as BitTorrent, Git, Bitcoin, and Ethereum.