📝Implementation Details
Merkle Mountain Range implementation details for contract reviewers.
Last updated
Merkle Mountain Range implementation details for contract reviewers.
Last updated
The type of merkle tree implemented in the Vea inbox is sometimes referred to as a merkle mountain range (MMR). Without additional context, merkle trees are usually understood to be perfectly balanced binary trees with a fixed height. MMRs on the other hand grow in height as more leafs are inserted, and can be represented by a set of merkle subtrees, referred to as merkle "mountain ranges" due to their shape, see diagram below.
All leafs are double hashed to avoid second preimage attacks.
Merkle mountain ranges efficiently represent the state of all messages with a limited amount of data made available in the vea inbox contract stored in the inbox variable.
The inbox represents the 'mountain ranges' or subtrees of varying height. eg inbox[4] = root of a merkle tree of size 2^4 = 16 leaves.
The inbox state for the above merkle tree example with 7 messages is given by the table below.
Count | Inbox[2] | Inbox[1] | Inbox[0] |
---|---|---|---|
0b101 | H(1,4) | H(5,6) | H(7) |
where we use the notation H as a short-hand for:
H(1,4) = root of merkle tree representing messages m1, m2, m3, m4
H(5,6) = root of merkle tree representing messages m5, m6
H(7) = root of merkle tree representing message m7
H(n) represents the n-th leaf which is the double hash of the n-th message m_n.
H(n,m) represents an interior node of the tree which is the merkle root representing the leaves H(n), H(n+1), ..., H(m).
The parent of a pair of nodes is calculated by sorting & concatenating the nodes, and hashing the result.
for example
Note that we should sort hash pairs before hashing. eg. if H(2) < H(1), then we would concatenate in the opposite order.
Above we neglect the sorting notation for brevity.
Another example, showing the inbox state step by step for 7 insertions.
Count | Inbox[2] | Inbox[1] | Inbox[0] |
---|---|---|---|
0b001 | --- | --- | H(1) |
0b010 | --- | H(1,2) | "H(1)" |
0b011 | --- | H(1,2) | H(3) |
0b100 | H(1,4) | "H(1,2)" | "H(3)" |
0b101 | H(1,4) | "H(1,2)" | H(5) |
0b110 | H(1,4) | H(5,6) | "H(5)" |
0b111 | H(1,4) | H(5,6) | H(7) |
Note some properties about the on bits ("1s") in count:
The fist set bit of count corresponds to the modified inbox index.
The on bits of count indicate the minimal data to represent the tree.
For example, when count = 0b010, inbox[0] is set to H(1). However inbox[1] = H(1,2) which implicitly includes H(1), so inbox[0] is not needed to encode the tree. From the perspective of data availability, we can forget about H(1) in the slot represented by inbox[0]. For that reason it is represented in quotation marks and can be overwritten in future steps, reusing the dirty inbox slot for efficiency.
To calculate the root, we hash together the data in each inbox slot corresponding to an on bit in count, from the lowest index to the highest.
Count | Inbox[2] | Inbox[1] | Inbox[0] | root |
---|---|---|---|---|
0b001 | --- | --- | H(1) | H(1) |
0b010 | --- | H(1,2) | "H(1)" | H(1,2) |
0b011 | --- | H(1,2) | H(3) | H( H(3) , H(1,2) ) |
0b100 | H(1,4) | "H(1,2)" | "H(3)" | H(1,4) |
0b101 | H(1,4) | "H(1,2)" | H(5) | H( H(1,4) , H(5) ) |
0b110 | H(1,4) | H(5,6) | "H(5)" | H( H(1,4) , H(5,6) ) |
0b111 | H(1,4) | H(5,6) | H(7) | H(H(1,4),H(H(5,6),H(7))) |
Here are some useful resources to better understand merkle mountain ranges. The notation and indices used in the below resources differs from the notation used in this document. Resources are meant to be illustrative and supplemental.