234-trees

What you should focus on:

The construction procedure. Practice building 234-trees with various more or less random sequences of keys. Understand why a 234-tree remains balanced.


One of the problems with binary search trees is that during construction, non-random input easily causes severe imbalance to the detriment of search times. What is required is a structure that remains balanced during construction. 2-3-4 trees are a solution to this problem.

The name comes from the nodes of the tree. There are three kinds:

Construction consists of a Search&Insert procedure This entails inserting a new key, after a failed search, at the position the key would have been found had it been in the tree. The construction differs from that of a binary search tree in that the search procedure first checks the number of keys and then their successive values at each node in the search path, before moving on down the tree. Failure occurs when a leaf (nil) is encountered upon which the new key is inserted into the parent node and an extra link is added, increasing the number of keys and links it holds by one.

Because 4-nodes already hold the maximum number of keys, these are broken up. The way in which this is carried out is the secret to the balancing act of this data structure. During the search down the tree, as soon as a 4-node is encountered it is split as follows:

First the centre key is promoted - moved up to the parent node (this will be a new node if it is the root that is being split.). The parent will always be a 2-node or a 3-node since any four node would already have been broken up during the journey down the tree.


Then the two remaining keys are assigned to two new 2-nodes which are linked to the parent by the links on either side of the promoted key. The left two links of the old 4-node are assigned to the left 2-node, and the right two links of the old 4-node are assigned to the right 2-node. The old 4-node can now be discarded.

Once a four node has been split, the Search&Insert procedure may continue as before.

 

In the figure the splitting procedure turns the parent node into a 4-node. This will be split only when a new search passes it.

Because the tree grows towards the root and not out at the leaves, the construction procedure just described guarantees that the balance of the tree will never be disrupted by any number of insertions.

How then does deletion affect the tree?

This will be dealt with at the lecture on RedBlack trees.


This document is a complement to the literature, which is not always as clear as it might be. More on 234-trees is to be found in Sedgewick: 215-219, and much more is available on the net. Search for '234-trees' or 'balanced trees'

http://godzilla.cs.nwu.edu/academics/courses/c11/notes/balanced-trees.html