The Box Smart Pointer in Rust

In Rust, values are stored on the stack by default. The stack is fast and efficient, but it requires that the size of every value be known at compile time. What if you have data whose size can’t be known at compile time, or that you simply want to store on the heap? This is where Box<T>, Rust’s most straightforward smart pointer, comes in.

A Box<T> (pronounced “box of T”) is a smart pointer that owns data allocated on the heap. When a Box<T> goes out of scope, its destructor runs: the value it owns is dropped and the heap memory that held it is freed. This provides heap allocation with Rust’s compile-time memory safety guarantees.
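
To make the scope rule concrete, here is a minimal sketch that counts drops. The `Noisy` type and the `drops_before_and_after_scope` function are hypothetical names introduced only for this illustration; the counter is there so the drop is observable.

```rust
use std::cell::Cell;

// Hypothetical type for illustration: bumps a shared counter when it is dropped.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the drop count observed inside the inner scope and after it ends.
fn drops_before_and_after_scope() -> (u32, u32) {
    let drops = Cell::new(0);
    let before;
    {
        let _boxed = Box::new(Noisy { drops: &drops });
        before = drops.get(); // the boxed value is still alive here
    } // `_boxed` goes out of scope: the value is dropped and its heap memory freed
    (before, drops.get())
}

fn main() {
    let (before, after) = drops_before_and_after_scope();
    println!("drops inside scope: {}, after scope: {}", before, after);
}
```

No explicit call is needed to free the allocation; the end of the scope is what triggers it.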

Why Use `Box<T>`?

There are three primary use cases for Box<T>:

Recursive Types: For types whose definition includes themselves, like a linked list. The size of such a type can’t be known at compile time without indirection.
Large Data: To transfer ownership of a large amount of data without copying it. Moving a Box<T> only copies the pointer on the stack, not the large data on the heap.
Trait Objects: To own a value and only care that it implements a specific trait, rather than knowing its concrete type.
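
The trait object case can be sketched as follows. The `Shape` trait and the `Square` and `Rect` types are hypothetical, chosen only to show values of different concrete types stored behind `Box<dyn Trait>` in a single collection.

```rust
// Hypothetical trait and types, named here only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// The function only knows that each element implements `Shape`.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // Different concrete types, one element type: Box<dyn Shape>.
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total area = {}", total_area(&shapes));
}
```

Each box owns its value, and method calls on it are dispatched dynamically at runtime.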

Using `Box<T>` for Heap Allocation

Creating a Box is simple using Box::new().

fn main() {
    // Allocate an integer on the heap.
    // `b` is a Box<i32> on the stack, pointing to the value `5` on the heap.
    let b = Box::new(5); 
    
    println!("b = {}", b);

    // The Box is automatically deallocated when `b` goes out of scope.
}

Because Box<T> implements the Deref trait, you can treat it like a reference. The * operator follows the pointer to the heap data.

let x = 5;
let y = Box::new(x);

assert_eq!(5, x);
// We can dereference y to get the inner value.
assert_eq!(5, *y);
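
Two further consequences of Deref are worth a quick sketch: writing through a mutable box with `*`, and deref coercion, which lets a reference to a box be passed where a reference to the inner type is expected. The `bump` and `shout` helpers are hypothetical names used only here.

```rust
// Hypothetical helper: takes the box by value, writes through it, returns the result.
fn bump(mut n: Box<i32>) -> i32 {
    *n += 1; // `*` reaches the i32 on the heap
    *n
}

// Hypothetical helper: accepts a plain string slice and knows nothing about Box.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    println!("{}", bump(Box::new(41)));

    let name = Box::new(String::from("ferris"));
    // &Box<String> is coerced to &String and then to &str.
    println!("{}", shout(&name));
}
```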

Use Case 1: Enabling Recursive Types

A classic example of a recursive type is a “cons list,” a data structure from Lisp. Let’s try to define one naively in Rust.

// This code will not compile!
enum List {
    Cons(i32, List),
    Nil,
}

The compiler reports error E0072: recursive type `List` has infinite size. It can’t determine how much space to allocate for a List because every Cons would have to contain another complete List inline, which could itself contain another, and so on without end.

We can solve this by using a Box<T> to add a layer of indirection. A Box has a known, fixed size (it’s just a pointer), so the compiler can determine the size of List.

// This compiles successfully.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

fn main() {
    // Create a list: 1 -> 2 -> 3 -> Nil
    let list = Cons(1, 
        Box::new(Cons(2, 
            Box::new(Cons(3, 
                Box::new(Nil)
            ))
        ))
    );
}

Here, Cons holds an i32 and a Box<List>. The Box points to the next List value on the heap, breaking the infinite recursion.
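
As a sketch of how such a list is used, the following walks it recursively and checks that the box really is pointer-sized. The `sum` function is a hypothetical helper added for illustration.

```rust
use std::mem::size_of;

enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Walk the list by following each Box to the next node.
fn sum(list: &List) -> i32 {
    match list {
        // `rest` is a &Box<List>; deref coercion turns it into &List.
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum = {}", sum(&list));
    // A Box<List> is a single pointer, whatever the length of the list behind it.
    println!("Box<List> is {} bytes", size_of::<Box<List>>());
    println!("usize is {} bytes", size_of::<usize>());
}
```

Because the box has the same size as a pointer, a Cons variant needs only room for an i32 and one pointer, which gives List a finite size.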

Use Case 2: Transferring Ownership of Large Data

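If you have a large struct, moving it between functions can be expensive because a move may copy all of its bytes from one stack location to another. The optimizer can sometimes avoid that copy, but this is not guaranteed.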

struct HugeData {
    // Imagine this is megabytes of data
    data: [u8; 1_000_000],
}

fn take_ownership(data: HugeData) {
    // Do something with the data
    println!("Took ownership of {} bytes of huge data.", data.data.len());
}

fn main() {
    let huge_data = HugeData { data: [0; 1_000_000] };
    // This is a potentially expensive copy on the stack.
    take_ownership(huge_data); 
}

By boxing the value, you allocate it on the heap once. When you transfer ownership, you’re only copying the small pointer on the stack, which is much faster.

struct HugeData {
    data: [u8; 1_000_000],
}

fn take_ownership_of_box(data: Box<HugeData>) {
    // Field access works through the Box via auto-deref.
    println!("Took ownership of {} bytes of boxed huge data.", data.data.len());
}

fn main() {
    let huge_data = Box::new(HugeData { data: [0; 1_000_000] });
    // This is a cheap move of the pointer.
    take_ownership_of_box(huge_data);
}
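
One way to check that only the pointer moves is to compare the heap address before and after the move. This is a sketch under stated assumptions: `Payload`, `heap_address`, `consume`, and `address_is_stable_across_move` are hypothetical names, and the payload is deliberately smaller (64 KiB) than HugeData so the example runs comfortably on a small thread stack.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large struct (64 KiB here, an assumption for the demo).
struct Payload {
    bytes: [u8; 64 * 1024],
}

// Address of the heap data behind a reference.
fn heap_address(p: &Payload) -> usize {
    p as *const Payload as usize
}

// Takes the box by value: only the pointer is moved into this function.
fn consume(p: Box<Payload>) -> (usize, usize) {
    (heap_address(&p), p.bytes.len())
}

fn address_is_stable_across_move() -> bool {
    let boxed = Box::new(Payload { bytes: [0; 64 * 1024] });
    let before = heap_address(&boxed);
    let (after, _len) = consume(boxed); // `boxed` is moved here
    before == after
}

fn main() {
    println!("Payload:      {} bytes", size_of::<Payload>());
    println!("Box<Payload>: {} bytes", size_of::<Box<Payload>>());
    println!("same heap address after move: {}", address_is_stable_across_move());
}
```

The heap data stays where it was allocated; what changes hands is the pointer-sized box that refers to it.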

Conclusion

Box<T> is a fundamental tool in Rust for managing heap memory. It provides a simple way to allocate data on the heap, enabling patterns like recursive types and efficient transfer of large data, all while upholding Rust’s strict ownership and memory safety rules.