Introduction
Binary Search Trees (BSTs) are fundamental data structures that support efficient searching, insertion, and deletion. However, a standard BST can become unbalanced under certain input patterns, degrading these operations to O(n). This guide covers plain BSTs, their limitations, and the self-balancing variants that keep performance logarithmic.
Binary Search Tree (BST)
A Binary Search Tree is a hierarchical data structure where each node has at most two children, with the left child containing values less than the parent and the right child containing values greater than the parent.
Properties
- Left subtree: Contains nodes with values less than the root
- Right subtree: Contains nodes with values greater than the root
- No duplicates: Each value appears once (can be modified for duplicates)
- In-order traversal: Produces sorted sequence
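As a quick illustration of the in-order property, here is a minimal sketch (using a throwaway `Node` class, separate from the implementation that follows) traversing a small hand-built BST:

```python
# Minimal illustration: in-order traversal of a BST yields sorted order.
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def inorder(n):
    # left subtree, then node, then right subtree
    return inorder(n.left) + [n.val] + inorder(n.right) if n else []

#        8
#       / \
#      3   10
#     / \    \
#    1   6    14
root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
print(inorder(root))  # [1, 3, 6, 8, 10, 14]
```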
Implementation
```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, val):
        """Insert a value into the BST."""
        if self.root is None:
            self.root = TreeNode(val)
        else:
            self._insert_recursive(self.root, val)

    def _insert_recursive(self, node, val):
        if val < node.val:
            if node.left is None:
                node.left = TreeNode(val)
            else:
                self._insert_recursive(node.left, val)
        else:
            if node.right is None:
                node.right = TreeNode(val)
            else:
                self._insert_recursive(node.right, val)

    def search(self, val):
        """Search for a value in the BST."""
        return self._search_recursive(self.root, val)

    def _search_recursive(self, node, val):
        if node is None or node.val == val:
            return node
        if val < node.val:
            return self._search_recursive(node.left, val)
        return self._search_recursive(node.right, val)

    def delete(self, val):
        """Delete a value from the BST."""
        self.root = self._delete_recursive(self.root, val)

    def _delete_recursive(self, node, val):
        if node is None:
            return None
        if val < node.val:
            node.left = self._delete_recursive(node.left, val)
        elif val > node.val:
            node.right = self._delete_recursive(node.right, val)
        else:
            # Node to delete found
            # Case 1: Leaf node
            if node.left is None and node.right is None:
                return None
            # Case 2: One child
            if node.left is None:
                return node.right
            if node.right is None:
                return node.left
            # Case 3: Two children
            # Replace with inorder successor (minimum in right subtree)
            successor = self._find_min(node.right)
            node.val = successor.val
            node.right = self._delete_recursive(node.right, successor.val)
        return node

    def _find_min(self, node):
        while node.left is not None:
            node = node.left
        return node

    def inorder(self):
        """In-order traversal (sorted order)."""
        result = []
        self._inorder_recursive(self.root, result)
        return result

    def _inorder_recursive(self, node, result):
        if node:
            self._inorder_recursive(node.left, result)
            result.append(node.val)
            self._inorder_recursive(node.right, result)
```
Time Complexity
| Operation | Average Case | Worst Case |
|---|---|---|
| Search | O(log n) | O(n) |
| Insert | O(log n) | O(n) |
| Delete | O(log n) | O(n) |
| Traversal | O(n) | O(n) |
The worst case occurs when the tree becomes degenerate (essentially a linked list), for example when keys are inserted in sorted order.
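To make the degenerate case concrete, here is a small self-contained sketch (independent of the class above, with its own minimal `insert`) comparing the height produced by sorted insertion against a balanced insertion order of the same 15 keys:

```python
# Sketch: BST height for sorted vs. balanced insertion order.
class Node:
    def __init__(self, val):
        self.val, self.left, self.right = val, None, None

def insert(root, val):
    if root is None:
        return Node(val)
    if val < root.val:
        root.left = insert(root.left, val)
    else:
        root.right = insert(root.right, val)
    return root

def height(root):
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))

# Sorted input: every insert goes right, producing a degenerate chain.
sorted_root = None
for v in range(1, 16):
    sorted_root = insert(sorted_root, v)

# The same keys inserted in the level order of a perfect tree.
balanced_root = None
for v in [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]:
    balanced_root = insert(balanced_root, v)

print(height(sorted_root))    # 15: effectively a linked list
print(height(balanced_root))  # 4: a perfect binary tree
```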
AVL Trees
AVL trees are self-balancing BSTs in which, at every node, the heights of the left and right subtrees differ by at most 1. This guarantees O(log n) search, insert, and delete in all cases.
Properties
- Balance factor: Height(left subtree) - Height(right subtree)
- Valid balance factor: -1, 0, or 1
- Rotations: Required whenever a node's balance factor leaves the range [-1, 1]
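A minimal sketch of computing a balance factor, with a throwaway `Node` class and heights recomputed on the fly (the AVL implementation below stores heights on the nodes instead):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def height(n):
    return 1 + max(height(n.left), height(n.right)) if n else 0

def balance(n):
    return height(n.left) - height(n.right)

# Left-heavy chain 10 <- 20 <- 30:
root = Node(30, Node(20, Node(10)))
print(balance(root))  # 2: outside {-1, 0, 1}, so a rotation is required
```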
Implementation
```python
class AVLNode:
    def __init__(self, val=0):
        self.val = val
        self.left = None
        self.right = None
        self.height = 1


class AVLTree:
    def get_height(self, node):
        if not node:
            return 0
        return node.height

    def get_balance(self, node):
        if not node:
            return 0
        return self.get_height(node.left) - self.get_height(node.right)

    def right_rotate(self, y):
        """Right rotation for left-heavy subtree."""
        x = y.left
        T2 = x.right
        # Perform rotation
        x.right = y
        y.left = T2
        # Update heights
        y.height = 1 + max(self.get_height(y.left), self.get_height(y.right))
        x.height = 1 + max(self.get_height(x.left), self.get_height(x.right))
        return x

    def left_rotate(self, x):
        """Left rotation for right-heavy subtree."""
        y = x.right
        T2 = y.left
        # Perform rotation
        y.left = x
        x.right = T2
        # Update heights
        x.height = 1 + max(self.get_height(x.left), self.get_height(x.right))
        y.height = 1 + max(self.get_height(y.left), self.get_height(y.right))
        return y

    def insert(self, root, val):
        """Insert with AVL balancing."""
        # Standard BST insertion
        if not root:
            return AVLNode(val)
        if val < root.val:
            root.left = self.insert(root.left, val)
        elif val > root.val:
            root.right = self.insert(root.right, val)
        else:
            return root  # No duplicates
        # Update height
        root.height = 1 + max(self.get_height(root.left), self.get_height(root.right))
        # Get balance factor
        balance = self.get_balance(root)
        # Left Left Case
        if balance > 1 and val < root.left.val:
            return self.right_rotate(root)
        # Right Right Case
        if balance < -1 and val > root.right.val:
            return self.left_rotate(root)
        # Left Right Case
        if balance > 1 and val > root.left.val:
            root.left = self.left_rotate(root.left)
            return self.right_rotate(root)
        # Right Left Case
        if balance < -1 and val < root.right.val:
            root.right = self.right_rotate(root.right)
            return self.left_rotate(root)
        return root

    def delete(self, root, val):
        """Delete with AVL balancing."""
        if not root:
            return root
        # Standard BST deletion
        if val < root.val:
            root.left = self.delete(root.left, val)
        elif val > root.val:
            root.right = self.delete(root.right, val)
        else:
            if root.left is None:
                return root.right
            elif root.right is None:
                return root.left
            # Get inorder successor
            temp = self._find_minimum(root.right)
            root.val = temp.val
            root.right = self.delete(root.right, temp.val)
        if root is None:
            return root
        # Update height
        root.height = 1 + max(self.get_height(root.left), self.get_height(root.right))
        # Get balance factor
        balance = self.get_balance(root)
        # Left Left Case
        if balance > 1 and self.get_balance(root.left) >= 0:
            return self.right_rotate(root)
        # Left Right Case
        if balance > 1 and self.get_balance(root.left) < 0:
            root.left = self.left_rotate(root.left)
            return self.right_rotate(root)
        # Right Right Case
        if balance < -1 and self.get_balance(root.right) <= 0:
            return self.left_rotate(root)
        # Right Left Case
        if balance < -1 and self.get_balance(root.right) > 0:
            root.right = self.right_rotate(root.right)
            return self.left_rotate(root)
        return root

    def _find_minimum(self, node):
        while node.left:
            node = node.left
        return node
```
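To see the balancing pay off, here is a condensed standalone AVL sketch (the same rotation logic as the class above, compressed for brevity and assuming distinct keys) inserting the sorted keys 1..15 that would degenerate a plain BST:

```python
# Condensed AVL insert; assumes distinct keys.
class N:
    def __init__(self, val):
        self.val, self.left, self.right, self.h = val, None, None, 1

def h(n): return n.h if n else 0
def bal(n): return h(n.left) - h(n.right) if n else 0
def fix_h(n): n.h = 1 + max(h(n.left), h(n.right))

def rot_right(y):
    x = y.left
    y.left, x.right = x.right, y   # x's old right subtree moves under y
    fix_h(y); fix_h(x)
    return x

def rot_left(x):
    y = x.right
    x.right, y.left = y.left, x    # y's old left subtree moves under x
    fix_h(x); fix_h(y)
    return y

def insert(n, val):
    if n is None:
        return N(val)
    if val < n.val:
        n.left = insert(n.left, val)
    else:
        n.right = insert(n.right, val)
    fix_h(n)
    b = bal(n)
    if b > 1 and val < n.left.val:     # Left Left
        return rot_right(n)
    if b < -1 and val > n.right.val:   # Right Right
        return rot_left(n)
    if b > 1:                          # Left Right
        n.left = rot_left(n.left)
        return rot_right(n)
    if b < -1:                         # Right Left
        n.right = rot_right(n.right)
        return rot_left(n)
    return n

root = None
for v in range(1, 16):  # sorted input that degenerates a plain BST
    root = insert(root, v)
print(h(root))  # 4: logarithmic height despite sorted input
```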
Rotation Types
Left-Left (LL) Case - Right Rotation:

```
        y                  x
       / \               /   \
      x   T4     =>     z     y
     / \               / \   / \
    z   T3           T1  T2 T3  T4
   / \
  T1  T2
```

Right-Right (RR) Case - Left Rotation:

```
    y                      x
   / \                   /   \
  T1  x          =>     y     z
     / \               / \   / \
    T2  z            T1  T2 T3  T4
       / \
      T3  T4
```

Left-Right (LR) Case - Left Rotation then Right Rotation:

```
      y                y                   z
     / \              / \                /   \
    x   T4    =>     z   T4      =>     x     y
   / \              / \                / \   / \
  T1  z            x   T3            T1  T2 T3  T4
     / \          / \
    T2  T3      T1  T2
```
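The LL picture above can be exercised directly; here is a minimal sketch (throwaway `Node`, no height bookkeeping) of a right rotation on a left-heavy chain:

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def right_rotate(y):
    x = y.left
    y.left, x.right = x.right, y   # x's old right subtree becomes y's left
    return x

# Left-heavy chain 10 <- 20 <- 30:
y = Node(30, Node(20, Node(10)))
root = right_rotate(y)
print(root.val, root.left.val, root.right.val)  # 20 10 30
```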
Red-Black Trees
Red-Black trees maintain approximate balance through node colors and a small set of invariants, rather than a strict height-difference constraint.
Properties
- Every node is either red or black
- Root is always black
- Every leaf (NIL) is black
- Red node cannot have red children
- Every path from a node down to its descendant NIL leaves contains the same number of black nodes
Implementation
```python
class RBNode:
    def __init__(self, val, color="RED"):
        self.val = val
        self.color = color
        self.left = None
        self.right = None
        self.parent = None


class RedBlackTree:
    def __init__(self):
        self.NIL = RBNode(None, "BLACK")
        self.NIL.left = self.NIL
        self.NIL.right = self.NIL
        self.root = self.NIL

    def left_rotate(self, x):
        y = x.right
        x.right = y.left
        if y.left != self.NIL:
            y.left.parent = x
        y.parent = x.parent
        if x.parent == self.NIL:
            self.root = y
        elif x == x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        y.left = x
        x.parent = y

    def right_rotate(self, y):
        x = y.left
        y.left = x.right
        if x.right != self.NIL:
            x.right.parent = y
        x.parent = y.parent
        if y.parent == self.NIL:
            self.root = x
        elif y == y.parent.right:
            y.parent.right = x
        else:
            y.parent.left = x
        x.right = y
        y.parent = x

    def insert_fixup(self, k):
        """Fix violations after insertion."""
        while k.parent.color == "RED":
            if k.parent == k.parent.parent.left:
                u = k.parent.parent.right  # Uncle
                if u.color == "RED":
                    # Case 1: Uncle is red
                    k.parent.color = "BLACK"
                    u.color = "BLACK"
                    k.parent.parent.color = "RED"
                    k = k.parent.parent
                else:
                    if k == k.parent.right:
                        # Case 2: Uncle black, k is right child
                        k = k.parent
                        self.left_rotate(k)
                    # Case 3: Uncle black, k is left child
                    k.parent.color = "BLACK"
                    k.parent.parent.color = "RED"
                    self.right_rotate(k.parent.parent)
            else:
                # Mirror case
                u = k.parent.parent.left
                if u.color == "RED":
                    k.parent.color = "BLACK"
                    u.color = "BLACK"
                    k.parent.parent.color = "RED"
                    k = k.parent.parent
                else:
                    if k == k.parent.left:
                        k = k.parent
                        self.right_rotate(k)
                    k.parent.color = "BLACK"
                    k.parent.parent.color = "RED"
                    self.left_rotate(k.parent.parent)
        self.root.color = "BLACK"

    def insert(self, val):
        """Insert a value into the Red-Black tree."""
        z = RBNode(val)
        z.left = self.NIL
        z.right = self.NIL
        y = self.NIL
        x = self.root
        while x != self.NIL:
            y = x
            if z.val < x.val:
                x = x.left
            else:
                x = x.right
        z.parent = y
        if y == self.NIL:
            self.root = z
        elif z.val < y.val:
            y.left = z
        else:
            y.right = z
        self.insert_fixup(z)
```
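Here is a sketch of validating the two structural invariants (no red node with a red child; equal black-height on every path). To keep the snippet self-contained it checks a hand-built tree rather than the class above; the colored shape shown is what inserting 10, 20, 30 produces after fixup (the case-2/3 fixup rotates 20 up and blackens it):

```python
class Node:
    def __init__(self, val, color, left=None, right=None):
        self.val, self.color = val, color
        self.left, self.right = left, right

def check(node):
    """Return the black-height, or raise if an invariant is violated."""
    if node is None:               # NIL leaves count as black
        return 1
    if node.color == "RED":
        for child in (node.left, node.right):
            if child is not None and child.color == "RED":
                raise ValueError("red node with red child")
    lh = check(node.left)
    rh = check(node.right)
    if lh != rh:
        raise ValueError("unequal black-heights")
    return lh + (1 if node.color == "BLACK" else 0)

# Result of inserting 10, 20, 30: black root 20 with red children.
root = Node(20, "BLACK", Node(10, "RED"), Node(30, "RED"))
print(check(root))  # 2: one black node plus the NIL leaf on every path
```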
Comparison: BST vs AVL vs Red-Black
| Property | BST | AVL | Red-Black |
|---|---|---|---|
| Balance | None | Strict (±1) | Approximate |
| Search | O(log n) avg | O(log n) | O(log n) |
| Insert | O(log n) avg | O(log n) | O(log n) |
| Delete | O(log n) avg | O(log n) | O(log n) |
| Rotations | None | ≤2 per insert, up to O(log n) per delete | ≤2 per insert, ≤3 per delete |
| Height | O(n) worst | ≤ 1.44 log n | ≤ 2 log(n+1) |
| Use case | Simple, infrequent rebalancing | Frequent lookups | Frequent insertions |
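Plugging n = 1,000,000 into the worst-case height bounds makes the difference concrete (a rough back-of-the-envelope sketch; real trees are usually shorter than these ceilings):

```python
import math

n = 1_000_000
avl_bound = math.ceil(1.44 * math.log2(n))  # AVL worst-case height
rb_bound = math.ceil(2 * math.log2(n))      # Red-Black worst-case height
print(avl_bound, rb_bound)  # 29 40
```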
When to Use Which
Use BST When:
- Data is randomly distributed
- Occasional O(n) worst case is acceptable
- Simplicity is preferred
- Data is relatively static
Use AVL When:
- Search-heavy workloads
- Strict balance is critical
- Read operations dominate
- Insertions/deletions are infrequent
Use Red-Black When:
- Insertions and deletions are frequent
- Good balance with fewer rotations
- General-purpose balanced tree needed
- Used in standard library implementations (C++ std::map, Java TreeMap)
Practical Applications
- Database indexing: on-disk indexes typically use B-trees, but the concept is similar; Red-Black trees back many in-memory indexes, supporting fast point lookups and range queries.
- Associative arrays: C++ std::map is a Red-Black tree implementation, giving O(log n) insert, delete, and search.
- Priority-queue-like uses: heaps are the usual choice, but a BST also finds min/max quickly and, when augmented, supports order statistics.
- Sets and unique collections: AVL or Red-Black trees maintain elements in sorted order.
Conclusion
Binary Search Trees provide efficient ordered operations, but their performance degrades with unbalanced data. AVL trees offer stricter balance with O(log n) guaranteed operations but require more rotations. Red-Black trees provide a good balance between performance and implementation complexity, making them the choice for many standard library implementations.
Understanding these data structures is essential for system design, database indexing, and solving algorithmic problems efficiently.