Smart Pointers in Rust - Reference Cycles

Rust’s memory safety guarantees are a cornerstone of the language, but there are some edge cases where memory leaks can still occur. One of the most common ways this can happen is by creating a reference cycle with reference-counted smart pointers like Rc<T>.

A reference cycle occurs when two or more Rc<T> smart pointers point to each other, creating a loop. In this situation, the reference count of each pointer will never drop to zero, because each item in the cycle is holding a reference to the next. As a result, the drop method is never called, and the memory allocated for these values is never deallocated. This is a memory leak.

Creating a Reference Cycle

Let’s demonstrate a reference cycle. We’ll use a List that can point to another List. To allow a list to be modified after creation (so we can create the cycle), we’ll use RefCell<T>.

use std::rc::Rc;
use std::cell::RefCell;

#[derive(Debug)]
enum List {
    // The second element is a RefCell to allow mutation, and an Rc to allow sharing.
    Cons(i32, RefCell<Rc<List>>),
    Nil,
}

use List::{Cons, Nil};

impl List {
    // A convenience method to get the tail of the list.
    fn tail(&self) -> Option<&RefCell<Rc<List>>> {
        match self {
            Cons(_, item) => Some(item),
            Nil => None,
        }
    }
}

fn main() {
    // Create list `a`: 5 -> Nil
    let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil))));

    println!("a initial rc count = {}", Rc::strong_count(&a)); // 1
    println!("a next item = {:?}", a.tail());

    // Create list `b`: 10 -> a
    let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a))));

    println!("a rc count after b creation = {}", Rc::strong_count(&a)); // 2
    println!("b initial rc count = {}", Rc::strong_count(&b)); // 1
    println!("b next item = {:?}", b.tail());

    // Create the cycle: a -> b
    if let Some(link) = a.tail() {
        *link.borrow_mut() = Rc::clone(&b);
    }

    println!("b rc count after changing a = {}", Rc::strong_count(&b)); // 2
    println!("a rc count after changing a = {}", Rc::strong_count(&a)); // 2

    // Uncommenting the next line will cause a stack overflow
    // because the lists point to each other, creating an infinite loop.
    // println!("a next item = {:?}", a.tail());
}

At the end of main, b goes out of scope, and its strong count decrements from 2 to 1. However, it doesn’t reach 0 because a still holds a reference to it. Then, a goes out of scope, and its strong count also decrements from 2 to 1, but not to 0, because b holds a reference to it. The memory for both a and b is leaked.

Preventing Reference Cycles with `Weak<T>`

To solve this problem, Rust provides the Weak<T> smart pointer. Weak<T> is a “weak” reference to a value owned by an Rc<T>. Unlike Rc<T>, a Weak<T> reference does not increase the strong reference count. It also doesn’t guarantee that the value it points to still exists.

Because the value might have been dropped, you must call the upgrade() method on a Weak<T> to get a valid Option<Rc<T>>. You’ll get Some(Rc<T>) if the value is still alive, and None if it has been deallocated.

Example: A Tree Data Structure

A common use case for Weak<T> is in tree-like structures where child nodes need to refer back to their parent. The parent should own its children (Rc<T>), but the children should not own their parent (Weak<T>).

use std::rc::{Rc, Weak};
use std::cell::RefCell;

#[derive(Debug)]
struct Node {
    value: i32,
    // Children are owned by the node.
    children: RefCell<Vec<Rc<Node>>>,
    // The parent is a weak reference to avoid cycles.
    parent: RefCell<Weak<Node>>,
}

fn main() {
    // Create a leaf node with no parent yet.
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });

    println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
    println!(
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );

    {
        // Create a branch node that will be the parent of leaf.
        let branch = Rc::new(Node {
            value: 5,
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![Rc::clone(&leaf)]),
        });

        // Set the parent of the leaf node.
        // We use `Rc::downgrade` to create a `Weak` reference.
        *leaf.parent.borrow_mut() = Rc::downgrade(&branch);

        println!(
            "branch strong = {}, weak = {}",
            Rc::strong_count(&branch),
            Rc::weak_count(&branch),
        );

        println!(
            "leaf strong = {}, weak = {}",
            Rc::strong_count(&leaf),
            Rc::weak_count(&leaf),
        );
    } // `branch` goes out of scope here. Its strong count becomes 0 and it is dropped.

    // The parent of `leaf` is now None.
    println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
    println!(
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );
}

In this corrected example:

branch owns leaf via Rc<Node>, so leaf has a strong count of 2 inside the inner scope.
leaf has a non-owning reference to branch via Weak<Node>. This does not increase branch’s strong count.
When branch goes out of scope, its strong count drops to 0, and it is deallocated.
The Weak reference in leaf now points to deallocated memory. Calling upgrade() on it returns None.
Finally, leaf goes out of scope, its strong count becomes 0, and it is also deallocated. No memory is leaked.

Conclusion

While Rc<T> is a useful tool for managing shared ownership, it’s crucial to be aware of the risk of reference cycles. By designing your data structures carefully and using Weak<T> to create non-owning references, you can prevent cycles and ensure that your program remains memory-safe.

Smart Pointers in Rust - Reference Cycles

Creating a Reference Cycle

Preventing Reference Cycles with Weak<T>

Example: A Tree Data Structure

Conclusion

Preventing Reference Cycles with `Weak<T>`