⚡ Calmops

Operating System Development with Rust: Complete Guide 2026

Introduction

Rust has emerged as the language of choice for modern operating system development, offering memory safety without garbage collection and zero-cost abstractions that rival C performance. From Redox OS to Linux kernel modules, Rust is transforming how we build systems software. This comprehensive guide takes you through OS development with Rust, from understanding why Rust is ideal for systems programming through building your own minimal operating system.

The appeal of Rust for OS development stems from its unique combination of properties. Memory safety guarantees eliminate entire classes of bugs that have plagued C-based operating systems for decades—buffer overflows, use-after-free vulnerabilities, and data races become impossible to express in safe Rust. Yet Rust achieves this without garbage collection, making real-time and low-latency applications practical. The borrow checker, while demanding at first, mechanically enforces the memory discipline that experienced C developers otherwise maintain only informally.

This guide assumes familiarity with Rust fundamentals and builds toward practical OS development. We’ll cover the special considerations required for bare-metal programming, the abstractions Rust provides for hardware interaction, and the techniques that enable building production-quality systems software. By the end, you’ll understand both the theoretical foundations and practical implementation details needed to begin your own OS projects.

Why Rust for Operating Systems

Memory Safety Without Compromise

Operating system kernels operate at the boundary between hardware and software, requiring direct memory access that makes memory safety bugs catastrophic. A single buffer overflow in kernel code can compromise the entire system. Rust’s ownership model eliminates these bugs at compile time while generating code as efficient as C.

The borrow checker enforces rules that prevent dangling pointers and data races. When you transfer ownership of data, the previous owner becomes invalid and no references to it can remain, which eliminates use-after-free bugs entirely. These ownership and borrowing checks happen at compile time, imposing no runtime overhead. Buffer overflows are handled differently: slice accesses are bounds-checked at runtime, turning an out-of-bounds access into a controlled panic rather than silent memory corruption, and the optimizer can often prove an access in-bounds and elide the check.
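To make the bounds-checking behavior concrete, here is a small sketch (the function name is illustrative) contrasting panicking indexed access with the explicit, recoverable `get` accessor:

```rust
// buf[i] panics at runtime on an out-of-bounds index; get() surfaces
// the same bounds check as an Option the caller can handle instead.
fn read_byte(buf: &[u8], i: usize) -> Option<u8> {
    buf.get(i).copied()
}
```

Either way, an out-of-range access can never silently corrupt adjacent memory as it can in C.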

Unsafe Rust exists for the necessary interactions with hardware and raw pointers, but its use is explicitly marked and contained. This pattern—safe abstractions over unsafe foundations—enables auditing. You can review all unsafe code in your kernel, knowing that safe Rust cannot introduce memory safety vulnerabilities. This contrasts with C, where any pointer manipulation anywhere in the codebase could cause memory corruption.

Zero-Cost Abstractions

Rust provides high-level abstractions that compile away to optimal machine code. Iterators, Option and Result types, and trait-based polymorphism all generate code equivalent to handwritten low-level implementations. This enables writing expressive, maintainable code without sacrificing performance.

Consider the iterator pattern. In C, you might write explicit loops with index management. In Rust, iterating over a slice compiles to identical assembly—the iterator abstraction is zero-cost. This enables writing clear, composable code that is also optimal. The standard library traits like IntoIterator and FromIterator provide consistent interfaces that optimize automatically.
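A small sketch of the comparison (function names are illustrative): with optimizations enabled, both forms below typically compile to equivalent machine code, so the iterator abstraction carries no runtime cost.

```rust
// Explicit index management, C-style.
fn sum_indexed(xs: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..xs.len() {
        total += xs[i]; // bounds check usually optimized away here
    }
    total
}

// Iterator form: clearer, composable, and just as fast.
fn sum_iter(xs: &[u64]) -> u64 {
    xs.iter().sum()
}
```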

The trait system enables generic programming with static dispatch by default. Monomorphization—generating specialized code for each concrete type—produces code as efficient as manually specialized implementations. When dynamic dispatch is needed, trait objects provide that option. This flexibility lets you choose the appropriate tradeoff for each use case.

Getting Started: Bare-Metal Rust

The no_std Environment

Standard Rust assumes an operating system providing standard library facilities—allocation, I/O, threading. For OS development, we use #![no_std], which excludes the standard library. We’re responsible for providing everything: memory allocation, panic handling, and all system interactions.

The core library provides types like Option, Result, and primitive slices—everything that doesn’t require OS services. This minimal foundation is sufficient for boot code and early kernel initialization. As we build more functionality, we’ll re-implement standard library features we need or use third-party no_std crates.

A minimal no_std crate requires only an entry point and a panic handler:

#![no_std]
#![no_main]

use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

#[no_mangle]
pub extern "C" fn _start() -> ! {
    loop {}
}

The _start function serves as the entry point: the symbol the linker designates as the start of execution. For our initial exploration, we’ll use a QEMU-based development environment that treats our code as a simple payload.

Cross-Compilation Setup

OS development requires cross-compilation—we compile on our development machine but run on different hardware (or an emulator). Rust makes this straightforward with target specification.

For a bare-metal ARM64 target, we’d install the appropriate target:

rustup target add aarch64-unknown-none

The “none” variant indicates no OS—we’re on bare metal. Compilation uses this target:

cargo build --target aarch64-unknown-none --release

The resulting binary contains our code but no runtime or standard library. We’ll need to link it appropriately and load it onto our target hardware or emulator.
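Linking a bare-metal image typically requires a custom linker script that places `_start` first and lays out the sections. Below is a minimal sketch; the load address and section placement are assumptions for the QEMU virt machine (whose RAM conventionally begins at 0x40000000), not fixed values:

```ld
/* Minimal sketch: entry symbol first, sections laid out from the
   assumed start of RAM on the QEMU virt machine. */
ENTRY(_start)
SECTIONS {
    . = 0x40000000;
    .text : { *(.text._start) *(.text*) }
    .rodata : { *(.rodata*) }
    .data : { *(.data*) }
    .bss : { *(.bss*) }
}
```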

QEMU provides excellent emulation for development. A generic ARM64 virt machine works well:

qemu-system-aarch64 -machine virt -cpu cortex-a72 \
    -kernel target/aarch64-unknown-none/release/my_os \
    -nographic

This setup lets us develop and test without actual hardware, iterating quickly.

Memory Management

Physical Memory Management

Even before virtual memory, we must manage physical memory—deciding which physical pages are used for what purposes. Early boot code works with physical addresses directly, before page tables enable virtual addressing.

A simple bitmap-based allocator marks pages as used or free. This approach is simple but wastes CPU time scanning for free pages. More sophisticated allocators use free lists, buddy systems, or slab allocators. We’ll start simple and evolve as needed:

const PAGE_SIZE: usize = 4096;

pub struct PhysAddr(pub usize);

pub struct BitmapAllocator {
    bitmap: &'static mut [u64],
    max_pages: usize,
}

impl BitmapAllocator {
    pub fn allocate(&mut self) -> Option<PhysAddr> {
        let max_pages = self.max_pages;
        for (chunk_idx, chunk) in self.bitmap.iter_mut().enumerate() {
            if *chunk != u64::MAX {
                // The lowest clear bit marks the first free page in this chunk.
                let free_bit = chunk.trailing_ones() as usize;
                let page_idx = chunk_idx * 64 + free_bit;
                if page_idx >= max_pages {
                    return None; // only padding bits past the end remain
                }
                *chunk |= 1u64 << free_bit;
                return Some(PhysAddr(page_idx * PAGE_SIZE));
            }
        }
        None
    }
}

Physical memory management becomes more complex as we initialize more subsystems. Keeping allocation centralized helps maintain invariants—we know all allocations come from our allocator, making debugging easier.

Virtual Memory and Page Tables

Virtual memory enables address space isolation, protection, and overcommit. Page tables map virtual addresses to physical addresses with fine-grained control over permissions. Implementing this is a milestone in OS development.

Modern architectures use multi-level page tables that trade traversal time for space efficiency. A 4-level page table on ARM64 maps 48-bit virtual addresses—256TB of address space—using 4KB pages. Each level of the table has 512 entries, each pointing to the next level or to a page.
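The index arithmetic behind this layout can be sketched directly: with 4 KB pages, the low 12 bits of a virtual address are the page offset, and each of the four levels consumes 9 bits (512 entries). The function names below are illustrative.

```rust
// Extract the four translation-table indices from a 48-bit virtual
// address: level 0 uses bits 39..48, level 3 uses bits 12..21.
fn table_indices(virt: u64) -> [usize; 4] {
    let mut idx = [0usize; 4];
    for level in 0..4 {
        let shift = 12 + 9 * (3 - level);
        idx[level] = ((virt >> shift) & 0x1FF) as usize;
    }
    idx
}

// The low 12 bits pass through untranslated as the offset in the page.
fn page_offset(virt: u64) -> usize {
    (virt & 0xFFF) as usize
}
```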

Creating page tables requires careful code:

pub fn create_page_table() -> &'static mut PageTable {
    // Allocate a physical page to hold the top-level table.
    let frame = allocate_page();
    let table = unsafe { &mut *(frame.as_ptr() as *mut PageTable) };
    
    for entry in table.entries.iter_mut() {
        entry.set_unused();
    }
    
    table
}

impl PageTable {
    pub fn map(&mut self, virt: VirtAddr, phys: PhysAddr, flags: PageFlags) {
        // Index names follow the x86-64 convention (PML4/PDPT/PD/PT);
        // on ARM64 these correspond to translation levels 0 through 3.
        let pml4_idx = virt.pml4_index();
        let pdpt_idx = virt.pdpt_index();
        let pd_idx = virt.pd_index();
        let pt_idx = virt.pt_index();
        
        // Create intermediate tables as needed
        // ... walk each level, allocating missing tables, then install
        // `phys` with `flags` in the leaf entry ...
    }
}

The complexity of page table management makes this one of the more challenging parts of OS development. Many existing projects provide reference implementations—studying production kernels helps understand the patterns.

Process and Thread Management

Process Abstraction

Processes provide address space isolation—the foundation of multitasking. Each process has its own page tables, its own set of open files, and its own execution context. The kernel manages switching between processes, saving and restoring state.

In our Rust kernel, a process structure might contain:

pub struct Process {
    pub pid: Pid,
    pub page_table: PageTable,
    pub state: ProcessState,
    pub context: Context,
    pub address_space: AddressSpace,
    pub file_table: FileDescriptorTable,
    pub signals: SignalHandler,
}

The context switch—moving from one process to another—involves saving registers, switching page tables, and restoring the new process’s registers. On ARM64, this means saving the general-purpose and relevant system registers and pointing the translation base register (TTBR0_EL1) at the new process’s page tables. The Rust challenge is expressing this low-level manipulation safely while remaining readable.

Thread Implementation

Threads in the same process share one address space, enabling parallel execution over shared memory. Threads require less state to switch than processes—they share memory and file descriptors—but still need separate stacks and register contexts.

A thread control block contains:

pub struct Thread {
    pub tid: Tid,
    pub process: Arc<Process>,
    pub stack: VirtAddr,
    pub state: ThreadState,
    pub context: ThreadContext,
    pub local_storage: BTreeMap<usize, usize>,
}

Thread scheduling determines which thread runs when. Simple schedulers use round-robin or priority queues. More sophisticated designs consider fairness, real-time constraints, and cache affinity. The scheduler design significantly impacts system responsiveness and throughput.

The scheduler interacts with timer interrupts to implement preemptive multitasking. Without preemption, a single thread could monopolize CPU time. The timer interrupt handler saves the current thread’s state and invokes the scheduler to select the next thread.
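A round-robin run queue can be sketched in a few lines. This is an illustrative model, not a production scheduler: thread IDs stand in for full thread control blocks, and std collections are used for clarity where a kernel would use its own allocator-backed queue.

```rust
use std::collections::VecDeque;

pub struct RoundRobin {
    queue: VecDeque<u32>, // runnable thread IDs, oldest at the front
}

impl RoundRobin {
    pub fn new() -> Self {
        RoundRobin { queue: VecDeque::new() }
    }

    // A newly runnable or just-preempted thread goes to the back.
    pub fn enqueue(&mut self, tid: u32) {
        self.queue.push_back(tid);
    }

    // Called from the timer interrupt path to select the next thread.
    pub fn pick_next(&mut self) -> Option<u32> {
        self.queue.pop_front()
    }
}
```

Preemption then amounts to: timer fires, the current thread is re-enqueued, and `pick_next` chooses its successor.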

Concurrency in the Kernel

Implementing Synchronization Primitives

The kernel must synchronize access to shared data. Unlike userspace, where synchronization failures affect only one process, kernel synchronization bugs can crash the entire system. Rust’s ownership model helps, but the kernel has unique requirements.

Spinlocks provide the simplest synchronization:

use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicBool, Ordering};

pub struct Spinlock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

unsafe impl<T: Send> Send for Spinlock<T> {}
unsafe impl<T: Send> Sync for Spinlock<T> {}

impl<T> Spinlock<T> {
    pub fn lock(&self) -> Guard<'_, T> {
        // Atomically flip false -> true; a separate check-then-set would race.
        while self.locked.compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed).is_err() {
            core::hint::spin_loop();
        }
        Guard { lock: self }
    }
}

pub struct Guard<'a, T> {
    lock: &'a Spinlock<T>,
}

impl<T> Drop for Guard<'_, T> {
    fn drop(&mut self) {
        // Release the lock when the guard goes out of scope.
        self.lock.locked.store(false, Ordering::Release);
    }
}

The critical section—where we hold the lock—must be as short as possible. Kernel code often reorganizes to minimize lock hold time, moving less time-sensitive operations outside the lock.

Mutexes add blocking semantics—threads that cannot acquire the lock sleep rather than spin. This avoids wasting CPU cycles on spin-waiting and is appropriate when lock hold times might be long. The implementation uses atomic operations to manage the sleep queue.

Lock-Free Data Structures

For high-contention scenarios, lock-free data structures can improve performance. These use atomic operations to achieve synchronization without locks, enabling multiple threads to proceed simultaneously.

A simple lock-free stack uses compare-and-swap (CAS):

use alloc::boxed::Box;
use core::ptr;
use core::sync::atomic::{AtomicPtr, Ordering};

struct Node<T> {
    value: T,
    next: *mut Node<T>,
}

pub struct LockFreeStack<T> {
    head: AtomicPtr<Node<T>>,
}

impl<T> LockFreeStack<T> {
    pub fn push(&self, value: T) {
        let node = Box::into_raw(Box::new(Node { value, next: ptr::null_mut() }));
        let mut current = self.head.load(Ordering::Relaxed);
        loop {
            unsafe { (*node).next = current };
            // Install the node only if head has not moved underneath us.
            match self.head.compare_exchange_weak(current, node, Ordering::Release, Ordering::Relaxed) {
                Ok(_) => break,
                Err(actual) => current = actual,
            }
        }
    }
}

Lock-free structures are complex to implement correctly and significantly harder to verify than lock-based alternatives. They provide real benefits in high-throughput scenarios but should be used judiciously.

Device Drivers in Rust

MMIO and Device Access

Devices are accessed through Memory-Mapped I/O (MMIO)—reading and writing to specific physical addresses that route to hardware registers. Rust enables safe access through volatile reads and writes that prevent compiler optimizations from reordering or eliding accesses.

use core::ptr;

pub struct DeviceRegister {
    addr: usize,
}

impl DeviceRegister {
    pub fn read(&self) -> u32 {
        // Volatile: the compiler must not elide or reorder this access.
        unsafe { ptr::read_volatile(self.addr as *const u32) }
    }
    
    pub fn write(&self, value: u32) {
        unsafe { ptr::write_volatile(self.addr as *mut u32, value) }
    }
}

Device drivers combine MMIO registers with the device’s protocol—understanding what each register means and in what sequence to access them. This requires hardware documentation and careful implementation.

Driver Architecture

Linux-style drivers use a modular architecture with standardized interfaces. In our kernel, drivers might implement traits that the kernel uses generically:

pub trait Driver {
    fn name(&self) -> &str;
    fn init(&mut self) -> Result<()>;
    fn handle_interrupt(&mut self, irq: u32);
}

pub trait BlockDevice {
    fn read_sectors(&mut self, start: u64, count: usize, buf: &mut [u8]) -> Result<()>;
    fn write_sectors(&mut self, start: u64, count: usize, buf: &[u8]) -> Result<()>;
}

Driver registration makes devices available:

pub fn register_driver<D: Driver + 'static>(driver: D) {
    DRIVERS.lock().push(Box::new(driver));
}

The driver init sequence iterates registered drivers, calling their init functions and setting up interrupt handlers.
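That init sequence can be sketched as follows. The names here (`Registry`, `init_all`, `NullDriver`) are illustrative, not a fixed kernel API, and the error type is simplified to a string:

```rust
trait Driver {
    fn name(&self) -> &str;
    fn init(&mut self) -> Result<(), &'static str>;
}

// Stand-in device so the example is self-contained.
struct NullDriver;

impl Driver for NullDriver {
    fn name(&self) -> &str { "null" }
    fn init(&mut self) -> Result<(), &'static str> { Ok(()) }
}

struct Registry {
    drivers: Vec<Box<dyn Driver>>,
}

impl Registry {
    fn new() -> Self {
        Registry { drivers: Vec::new() }
    }

    fn register(&mut self, d: Box<dyn Driver>) {
        self.drivers.push(d);
    }

    // Initialize every registered driver in order; a failed driver is
    // skipped rather than aborting boot. Returns the names that succeeded.
    fn init_all(&mut self) -> Vec<String> {
        let mut ok = Vec::new();
        for d in self.drivers.iter_mut() {
            if d.init().is_ok() {
                ok.push(d.name().to_string());
            }
        }
        ok
    }
}
```

A real kernel would also wire each successfully initialized driver's `handle_interrupt` into the interrupt controller at this point.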

Userspace and System Calls

Building the C Runtime

Supporting userspace applications requires implementing the C runtime—the foundation that lets programs run. This includes startup code, memory management for user allocations, and the system call interface.

The startup code runs before main:

_start:
    // Set up the stack (sp cannot take a wide immediate directly)
    ldr x0, =USER_STACK_TOP
    mov sp, x0
    
    // Zero BSS: memset(__bss_start__, 0, __bss_end__ - __bss_start__)
    ldr x0, =__bss_start__
    ldr x2, =__bss_end__
    sub x2, x2, x0
    mov x1, #0
    bl memset
    
    // Call constructors in .init_array
    ldr x19, =__init_array_start__
    ldr x20, =__init_array_end__
1:  cmp x19, x20
    b.eq 2f
    ldr x9, [x19], #8
    blr x9
    b 1b
    
    // Call main, then exit with its return value (already in x0)
2:  bl main
    bl exit

This assembly initializes the C runtime—setting up the stack, zeroing BSS, running constructors—then calls main. After main returns, it calls exit with the return code.

System Call Interface

System calls transition from user to kernel mode, providing controlled access to kernel services. The interface must be secure—user code cannot directly manipulate kernel data structures—and efficient.

A system call typically involves:

  1. Placing arguments in designated registers
  2. Transitioning to kernel mode (the svc instruction on ARM64, syscall on x86-64)
  3. The kernel handling the request
  4. Returning to userspace with result

Rust can express this elegantly:

use core::arch::asm;

pub unsafe fn syscall(num: SyscallNumber, args: [usize; 6]) -> Result<usize> {
    let result: usize;
    asm!(
        "svc #0",                    // trap into the kernel on ARM64
        in("x16") num.as_usize(),    // number register is defined by our kernel's ABI
        inlateout("x0") args[0] => result,
        in("x1") args[1],
        in("x2") args[2],
        in("x3") args[3],
        in("x4") args[4],
        in("x5") args[5],
    );
    Result::from_raw(result)
}

Each system call number identifies a specific kernel function. The kernel dispatches based on this number, validates arguments, performs the operation, and returns.
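The dispatch step can be sketched as a match on the syscall number. Everything below is illustrative rather than a fixed ABI: the numbers, the handler stubs, and the error value are assumptions for the example.

```rust
const ENOSYS: usize = 38; // illustrative "no such syscall" error code

fn sys_exit(_code: usize) -> Result<usize, usize> {
    Ok(0) // stub: a real kernel would tear down the calling process
}

fn sys_getpid() -> Result<usize, usize> {
    Ok(1) // stub: a real kernel would read the current process's pid
}

// Kernel-side entry: validate the number, route to the handler.
fn dispatch(num: usize, args: [usize; 6]) -> Result<usize, usize> {
    match num {
        0 => sys_exit(args[0]),
        1 => sys_getpid(),
        _ => Err(ENOSYS), // unknown syscall number
    }
}
```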

Conclusion

Operating system development with Rust represents an exciting frontier in systems programming. The language’s safety guarantees eliminate entire categories of bugs while zero-cost abstractions enable high-performance code. From boot through userspace, Rust provides appropriate abstractions for each layer.

The path from here involves extending our minimal kernel into a full operating system—adding networking, a filesystem, and userspace utilities. Many projects provide reference implementations to study: Redox OS, Theseus, and Ruxos offer different design philosophies in Rust. Each demonstrates how the concepts in this guide apply in production systems.

The skills developed in OS development—understanding memory, concurrency, and hardware abstraction—transfer to other systems programming domains. Kernel development experience is valuable for embedded systems, security research, and performance-critical applications. And the discipline of writing correct, safe Rust code pays dividends in any systems software work.

