Sparse merkle trees
Introduction
Sparse Merkle Trees (SMTs) are a cryptographic data structure used primarily in blockchain systems and other decentralized applications to efficiently prove the inclusion or absence of key-value pairs in a large dataset. They are particularly useful in situations where the dataset is too large to store in its entirety, but where proofs of inclusion or absence are required.
Structure of a Sparse Merkle Tree
A Sparse Merkle Tree is a binary tree where each leaf node represents a key-value pair, and each non-leaf node represents the hash of its child nodes. Unlike traditional Merkle Trees where every leaf node contains a key-value pair, in a Sparse Merkle Tree, leaf nodes only contain key-value pairs that are present in the dataset.
Nodes
At the bottom level, every node (also called leaf for the lowest level) is just the hash of its data , where is the data that is put into the leaf. In every level above, each node is the hash of the previous two nodes concatenated: . This is the repeated in every level up to the root.
Notice that the root therefore implicitly is a result of hashing all the nodes and leafs below the root. That means that if you change the value of any node or leaf in the tree, the root will change.
Keys
Notice the labels that each connection between any node has in the diagram above. It can either be 0 or 1 and every connection has one of these two labels assigned to it. If you walk the path to a leaf and write down each label you encounter one-by-one, you arrive at the key in bitwise form.
For example, the leaf from the diagram will have the key , which would be as a decimal number. Therefore, this would represent the key-value pair of encoded in a merkle tree.
Notice that as you go down the tree, every 0 you encounter corresponds to the left path, while every 1 does so for the right path.
To recap: Every leaf (remember, node at the very bottom) has a key (also called index), which is used to determine the path where this leaf is located inside the tree.
Merkle proofs
In a lot of applications, we want to proof to a verifier that the value with key is inside a merkle tree that has the root . Notice that the verifier only has the root , and nothing else. In this case, we can create a so-called βmerkle proofβ, which will proof that our value is indeed inside that tree.
For that, we send the verifier:
- the value
- the key
- The siblings , and
The verifier can now recursively use the siblings we provided to reconstruct the root and check that against the known root :
- He computes
- He re-computes the node by using the sibling and hashing them together
- He does the same in the next level with the sibling :
- Now he can do it again at the root-level
Now the verifier can check that the calculated equals the known root and therefore has verified that is indeed in that tree.