A note on implementing a binary Merkle Tree structure from a list of transactions a la Bitcoin white paper.
I’ve been interested in the blockchain technology, but only recently had I gone back into more serious reading. I started with the Bitcoin official whitepaper, which turned out to be one of (if not) the best reading material on Bitcoin and blockchain. I’ve been slowly working on an implementation of its Merkle tree on my free time. This serves as a study / note to self. Thus, I won’t attempt to cover theories and the basics (also because I’m learning). I’ve written code in Ocaml, but the focus is with what I have learned.
Normally, this is how one would go about creating a binary tree from a linked list (at least this is likely what one would encounter as a trivial BST problem):
list: A->B->C->D->E
bst:
D
/ \
B E
/ \
A C
In this, a binary tree is being “lifted” up from the center of the list to form the best balanced structure. The point being all the elements are known.
A classic Merkle Tree, albeit being a binary tree, works a bit differently. It is formed by recursively hashing the concatenation of the data of two nodes to form a parent until a single top-level node, known as the Merkle root, is reached.
In other words, the immediate parent node is always carrying a hashed result of its two immediate children’s hashes.
list: A->B->C->D
bst: (ABCD)
/ \
AB CD
/ \ / \
A B C D
Pretty ugly ASCII attempt, but hopefully enough to illustrate the point. Normally A is a hash string H(A), AB is a hash string H(H(A)+H(B)) and so on, but I won’t prematurely implement that because hashing is pretty trivial but makes playing with outputs unreadable.
Two interesting points to note are
- Number of input transactions must be a power of 2 or one less than a power of 2 (2, 4, 8, 16, … or respectively any number on this list minus one).
- The widow element
E
in the case of the transactions’ size being one less than the power of 2 will be “married” to its newly cloned pairE’
. This isn’t a requirement, but this is how Bitcoin does.
(ABCDEFGG')
/ \
ABCD EFGG'
/ \ / \
AB CD EF GG'
/ \ / \ / \ / \
A B C D E F G G'
The bottomline is that the list isn’t being “lifted” up to form a root at the center from the beginning like the previous binary tree example. The list of transactions (txs
) is being interspersed with a concatenated result of the two adjacent tx
(again, it’s actually a hashed result, but concatenated strings are simpler to read) first, then insert a result of all the tx
combined in the middle before lifting a tree up.
list : A->B->C->D->E
list' : A->AB->B->BC->C->CD->D->DE->E->EE'->E'
list'' : A->AB->B->BC->C->ABCDEE'->CD->D->DE->E->EE'->E'
However, without thinking too much about the O-notation, intersperse operation sounds costly, not to mention double-work. If the end result is a tree, recursion is a natural way to go.
This is a walk-through of the steps roughly:
# Start with very minimal listA->B->C->DA Node has:
- data
- left child (another Node or empty)
- right child (another Node or empty)Let's create a shorthand for it -> Node(data, left, right)======= BASE CASE ========List is empty...Likely it's the end of a level, getting ready to
go up another and switching to a new list of parents.
This happens at the end of every level, but perhaps we
could design so this shouldn't happen.When list has just one element...The list had been reduced to one element. If at every
point in time we are always looking at a pair of nodes to form a parent, then this outlying single element must have been a widow.
In this case, we should create a clone and a parent based on
it and its clone, then jumps up to a new level.Another case is that last element is a Merkle Root.It'd be useful to maintain a number of levels (left) of the expected tree derived from the given initial list of transactions.======== PRE-STAGE ========transform a list of transactions (in this case strings) to child Nodes using the transaction's value as data. so"A" -> Node ("A", empty, empty)All Nodes will start with empty left and right.======= FIRST PASS ========At this point we have a list of Nodes. When we say A it actually means Node ("A", left, right) and when we say A + B it actually means getting data "A" and "B" from Node A and B and concatenate them.Seeing A and B? Yes!
- Let their parent's data be A + B
- Create Node AB -> Node(A+B, A, B)
- List is reduced to C->DSeeing C and D? Yes!
- Let their parent's data be C + D
- Create Node CD -> Node(C+D, C, D)
- List is empty, time to switch up to work on the list of parents.
- Decrement number of levels======= SECOND PASS ========Now, a new list is AB->CD (parent of A, B and C, D respectively). And we reiterate.Seeing AB and CD? Yes!
- Let their parent's data be AB + CD
- Create Node ABCD -> Node(AB+CD, AB, CD)
- List is empty.======== LAST PASS =========Now the list has only one Node ABCD.At this point, we check the number of levels we've been maintaining. If it's 0, then we've reached a Merkle Root.======== EDGE CASE =========The edge case, of course, would be a widow Node.
which only occurs at the bottom-most (first pass) level.Seeing E but no partner? Yes!
- Let the parent data be E + E
- Create Node E' -> Node(E, empty, empty) # a clone
- Create Node EE' -> Node(E+E, E, E')
One obvious problem is before the end of every pass, the list is reduced to just a single element. We must have a way to tell if that single element is already a Merkle root and that we must stop the recursion or that it is just about to reach the end of an intermediate level.
The simplest way is to calculate a number representing the levels of the expected tree prior to entering the recursion. With the tactic of cloning a widow transaction node, the starting number of nodes is always even. And we should always guard it so a tree is only created when the size of the transactions list is a power of 2.
Below is how far I’ve come. I’ve also written an API server for future front end visualization. Let me know what you think.