
BPF and eBPF: Extended Berkeley Packet Filter Programming

Introduction

BPF (Berkeley Packet Filter) has evolved from a simple packet filtering mechanism into a revolutionary technology that allows safe, efficient execution of user-defined code within the Linux kernel. Originally designed for network packet filtering in the 1990s, extended BPF (eBPF) has transformed into a general-purpose execution framework enabling networking, observability, security, and performance applications impossible with traditional kernel modules.

In 2026, eBPF powers production systems at major tech companies, enabling tools like Cilium for container networking, Falco for security monitoring, and various observability platforms. Understanding eBPF opens doors to advanced Linux system programming, performance optimization, and building next-generation infrastructure tools.

This comprehensive guide covers BPF fundamentals, programming concepts, practical applications, and tools for leveraging this transformative technology.

Understanding BPF Architecture

From Classic BPF to eBPF

Classic BPF (cBPF) emerged in the early 1990s as a mechanism for efficient packet filtering without copying entire packets to userspace. Programs written in BPF bytecode were compiled by userspace tools and loaded into the kernel, where they were interpreted or, on later kernels, JIT-compiled to native instructions.

eBPF, introduced in 2014 with Linux 3.18, extended this concept significantly:

  • Larger instruction set: 64-bit operations, function calls, loops
  • More hook points: Network, kernel functions, tracepoints, cgroups
  • Helper functions: Kernel functions callable from BPF programs
  • Maps: Efficient key-value storage for state sharing
  • Safety verification: Programs verified before execution
  • Just-in-time compilation: Native code generation

How eBPF Works

eBPF programs follow a lifecycle:

  1. Write: Developer writes program in C, Rust, or Go
  2. Compile: LLVM/clang compiles to BPF bytecode (.o file)
  3. Load: bpf() system call loads program into kernel
  4. Verify: Kernel safety checker validates program
  5. JIT: Just-in-time compiler translates to native code
  6. Attach: Program attached to hook point
  7. Execute: Kernel runs program when hook triggers
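The bytecode produced in step 2 is just a flat array of fixed-size 8-byte instructions. As an illustration (this is a hand-rolled encoder sketch, not a real loader), the following Python snippet assembles the smallest useful program, `r0 = 2; exit`, using opcode values from the kernel's BPF instruction-set encoding:

```python
import struct

def insn(opcode, dst=0, src=0, off=0, imm=0):
    """Encode one eBPF instruction: 8 bytes, little-endian.
    Layout: opcode:8, dst_reg:4, src_reg:4, offset:16, imm:32."""
    return struct.pack("<BBhi", opcode, (src << 4) | dst, off, imm)

BPF_MOV64_IMM = 0xb7   # BPF_ALU64 | BPF_MOV | BPF_K
BPF_EXIT      = 0x95   # BPF_JMP | BPF_EXIT

# "return 2" -- loaded as an XDP program, 2 would mean XDP_PASS
prog = insn(BPF_MOV64_IMM, dst=0, imm=2) + insn(BPF_EXIT)

assert len(prog) == 16            # two 8-byte instructions
print(prog.hex())
```

Everything after this point in the lifecycle (load, verify, JIT, attach) operates on exactly this kind of byte stream.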

The verifier ensures programs:

  • Don’t crash the kernel
  • Don’t loop infinitely
  • Don’t access invalid memory
  • Have bounded execution time
  • Are properly privileged

Key Components

  • BPF VM: virtual machine that interprets BPF bytecode
  • JIT Compiler: translates BPF bytecode to native instructions
  • Maps: key-value storage shared between kernel and userspace
  • Helpers: kernel functions callable from BPF programs
  • Ring Buffer: efficient event delivery to userspace
  • CO-RE: Compile Once, Run Everywhere portability via BTF
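To make the BPF VM component concrete, here is a deliberately tiny interpreter for three eBPF opcodes (an educational sketch only; the real VM implements the full instruction set, and the verifier constrains what programs it will run at all):

```python
# Toy eBPF-style interpreter: 11 registers, 3 opcodes (educational sketch).
BPF_MOV64_IMM = 0xb7   # dst = imm
BPF_ADD64_IMM = 0x07   # dst += imm
BPF_EXIT      = 0x95   # return r0

def run(prog):
    """Execute a list of (opcode, dst_reg, imm) tuples; return r0."""
    regs = [0] * 11                               # r0..r10
    for op, dst, imm in prog:
        if op == BPF_MOV64_IMM:
            regs[dst] = imm
        elif op == BPF_ADD64_IMM:
            regs[dst] = (regs[dst] + imm) & (2**64 - 1)   # 64-bit wrap
        elif op == BPF_EXIT:
            return regs[0]                        # r0 holds the return value
        else:
            raise ValueError(f"unknown opcode {op:#x}")
    raise ValueError("fell off the end -- the verifier rejects such programs")

print(run([(0xb7, 0, 40), (0x07, 0, 2), (0x95, 0, 0)]))  # 42
```

On real kernels the interpreter is usually bypassed entirely: the JIT compiles the same instruction stream to native machine code at load time.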

BPF Hook Points

eBPF programs attach to various kernel locations:

Tracepoints

Static instrumentation points in kernel code:

# List available tracepoints
sudo ls /sys/kernel/debug/tracing/events/

# Common tracepoints
# sys_enter_* - system call entry
# sys_exit_* - system call exit
# block_* - block layer events
# net_* - network events
# sched_* - scheduler events

kprobes/kretprobes

Dynamic kernel function instrumentation:

# Attach to __ext4_alloc_blocks
sudo bpftrace -e 'kprobe:__ext4_alloc_blocks { printf("alloc blocks\n"); }'

Network Hooks

XDP (eXpress Data Path), socket, and cgroup hooks:

# XDP hook - earliest packet processing
# tc (traffic control) - qdisc and classification
# socket - connection tracking
# cgroup - resource management

Fentry/Fexit

Modern, lower-overhead function entry/exit instrumentation (kernel 5.5+) built on BTF trampolines, with direct typed access to function arguments:

# Using bpftrace
sudo bpftrace -e 'fentry:do_nanosleep { printf("sleeping\n"); }'

Programming eBPF

Using bpftrace

The easiest way to write eBPF programs:

# Install bpftrace
sudo apt install bpftrace

# Simple example - count syscalls
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[comm] = count(); }'

# Trace file opens
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_open { printf("%s %s\n", comm, str(args->filename)); }'

# Measure block I/O latency (key by request pointer, since completion
# can run in a different context than the issuing thread)
sudo bpftrace -e 'kprobe:blk_mq_start_request { @start[arg0] = nsecs; }
    kprobe:blk_mq_complete_request /@start[arg0]/ {
        @usecs = hist((nsecs - @start[arg0]) / 1000);
        delete(@start[arg0]);
    }'

Using BCC Tools

The BPF Compiler Collection provides ready-to-use tools:

# Install BCC (on Debian/Ubuntu the tools carry a -bpfcc suffix)
sudo apt install bpfcc-tools

# Network analysis example
sudo python3 /usr/share/bcc/examples/networking/simple_tc.py

# CPU profiling: off-CPU time for 10 seconds
sudo offcputime-bpfcc -d 10

# Block I/O latency histograms, 1-second intervals
sudo biolatency-bpfcc 1

# File system operations slower than 1 ms
sudo fileslower-bpfcc 1

Common BCC Tools

# execsnoop - trace exec() calls
sudo execsnoop-bpfcc

# opensnoop - trace file opens
sudo opensnoop-bpfcc

# tcpconnect - trace TCP connections
sudo tcpconnect-bpfcc

# tcpaccept - trace TCP accepts
sudo tcpaccept-bpfcc

# biosnoop - trace block I/O
sudo biosnoop-bpfcc

# cachestat - page cache statistics
sudo cachestat-bpfcc

# runqlat - scheduler run queue latency
sudo runqlat-bpfcc

Practical eBPF Applications

Network Filtering with XDP

XDP (eXpress Data Path) processes packets at the driver level, before the kernel network stack sees them:

// xdp_drop.c - Simple XDP packet dropper
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/tcp.h>

SEC("xdp_drop")
int xdp_drop_prog(struct xdp_md *ctx) {
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    
    struct ethhdr *eth = data;
    if (data + sizeof(*eth) > data_end)
        return XDP_DROP;
    
    if (eth->h_proto == bpf_htons(ETH_P_IP)) {
        struct iphdr *ip = data + sizeof(*eth);
        if ((void *)(ip + 1) > data_end)
            return XDP_DROP;
        
        // Drop TCP packets to port 80
        if (ip->protocol == IPPROTO_TCP) {
            struct tcphdr *tcp = (void *)ip + sizeof(*ip); // assumes no IP options (ihl == 5)
            if ((void *)(tcp + 1) > data_end)
                return XDP_DROP;
            
            if (bpf_ntohs(tcp->dest) == 80)
                return XDP_DROP;
        }
    }
    
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

Load and attach:

# Compile
clang -O2 -Wall -target bpf -c xdp_drop.c -o xdp_drop.bpf.o

# Load with ip
sudo ip link set dev eth0 xdp obj xdp_drop.bpf.o sec xdp_drop

# Check
sudo ip link show eth0

# Remove
sudo ip link set dev eth0 xdp off
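The bounds checks in xdp_drop_prog are exactly what the verifier demands before any packet byte may be read. The same decision logic can be replayed in plain Python over a synthetic frame (this mirrors, not runs, the BPF code, and like it assumes an IPv4 header with no options):

```python
import struct

XDP_DROP, XDP_PASS = 1, 2
ETH_P_IP, IPPROTO_TCP = 0x0800, 6

def xdp_verdict(pkt: bytes) -> int:
    """Mirror xdp_drop_prog: drop TCP to port 80, drop truncated frames."""
    if len(pkt) < 14:                              # Ethernet header check
        return XDP_DROP
    (h_proto,) = struct.unpack("!H", pkt[12:14])   # network byte order
    if h_proto != ETH_P_IP:
        return XDP_PASS
    if len(pkt) < 14 + 20:                         # IPv4 header check
        return XDP_DROP
    if pkt[14 + 9] != IPPROTO_TCP:                 # protocol field
        return XDP_PASS
    if len(pkt) < 14 + 20 + 20:                    # TCP header check
        return XDP_DROP
    (dest,) = struct.unpack("!H", pkt[14 + 20 + 2 : 14 + 20 + 4])
    return XDP_DROP if dest == 80 else XDP_PASS

# Synthetic frame: Ethernet(IPv4) + minimal IPv4 + TCP with dest port 80
eth = b"\x00" * 12 + struct.pack("!H", ETH_P_IP)
ip  = bytes([0x45]) + b"\x00" * 8 + bytes([IPPROTO_TCP]) + b"\x00" * 10
tcp = struct.pack("!HH", 12345, 80) + b"\x00" * 16
print(xdp_verdict(eth + ip + tcp))   # 1 (XDP_DROP)
```

In the BPF version the same comparisons against data_end are not optional style: omit one and the verifier refuses to load the program.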

Observability with kprobes

Trace kernel functions:

// read_latency.c - Measure read latency
#include "vmlinux.h"   /* generated: bpftool btf dump file /sys/kernel/btf/vmlinux format c */
#include <bpf/bpf_helpers.h>

struct key_t {
    char name[32];
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __type(key, u64);
    __type(value, u64);
    __uint(max_entries, 10000);
} start SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __type(key, struct key_t);
    __type(value, u64);
    __uint(max_entries, 10000);
} lat SEC(".maps");

SEC("kprobe/vfs_read")
int trace_vfs_read(struct pt_regs *ctx) {
    u64 pid = bpf_get_current_pid_tgid();
    u64 ts = bpf_ktime_get_ns();
    bpf_map_update_elem(&start, &pid, &ts, BPF_ANY);
    return 0;
}

SEC("kretprobe/vfs_read")
int trace_vfs_read_ret(struct pt_regs *ctx) {
    u64 pid = bpf_get_current_pid_tgid();
    u64 *tsp = bpf_map_lookup_elem(&start, &pid);
    if (!tsp)
        return 0;
    
    struct key_t key = {};
    bpf_get_current_comm(&key.name, sizeof(key.name));
    
    u64 latency = bpf_ktime_get_ns() - *tsp;
    u64 *val = bpf_map_lookup_elem(&lat, &key);
    if (val)
        *val += latency;
    else
        bpf_map_update_elem(&lat, &key, &latency, BPF_ANY);
    
    bpf_map_delete_elem(&start, &pid);
    return 0;
}
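A detail worth noting in the code above: bpf_get_current_pid_tgid() packs the thread-group id (the userspace-visible "PID") into the upper 32 bits and the kernel task id (the individual thread) into the lower 32, so keying the map on the whole 64-bit value keeps concurrent threads of one process from colliding. The split looks like this:

```python
def pack_pid_tgid(tgid: int, pid: int) -> int:
    """Mimic bpf_get_current_pid_tgid(): tgid in the high 32 bits."""
    return (tgid << 32) | pid

def split_pid_tgid(value: int):
    tgid = value >> 32           # userspace-visible PID (thread group id)
    pid = value & 0xFFFFFFFF     # kernel task id (individual thread)
    return tgid, pid

packed = pack_pid_tgid(4242, 4244)   # a thread whose tid differs from its pid
print(split_pid_tgid(packed))        # (4242, 4244)
```

Programs that only want the process id typically shift: `pid_tgid >> 32`.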

Security Monitoring

Detect suspicious activity:

// exec_detect.bpf.c - Detect execve calls
#include "vmlinux.h"   /* generated: bpftool btf dump file /sys/kernel/btf/vmlinux format c */
#include <bpf/bpf_helpers.h>

#define TASK_COMM_LEN 16

struct event {
    char comm[TASK_COMM_LEN];
    char filename[256];
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 << 20);
} events SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_execve")
int trace_execve(struct trace_event_raw_sys_enter *ctx) {
    struct event *event;
    event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
    if (!event)
        return 0;
    
    bpf_get_current_comm(event->comm, sizeof(event->comm));
    bpf_probe_read_user_str(event->filename, sizeof(event->filename),
                            (const char *)ctx->args[0]);
    
    bpf_ringbuf_submit(event, 0);
    return 0;
}
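The reserve/submit pattern above exists so each event is written in place in memory shared with userspace instead of being copied twice. A rough single-producer sketch of the idea (not the kernel's actual implementation, which is lock-free, mmap-shared, and supports wraparound):

```python
class ToyRingBuf:
    """Educational sketch of the reserve/submit idea behind
    BPF_MAP_TYPE_RINGBUF; deliberately omits wraparound and locking."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.head = 0          # producer write position
        self.committed = 0     # position visible to the consumer

    def reserve(self, n):
        """Claim n bytes to fill in place; None if full (event is dropped)."""
        if self.head + n > len(self.buf):
            return None
        view = memoryview(self.buf)[self.head:self.head + n]
        self.head += n
        return view

    def submit(self):
        """Publish everything reserved so far to the consumer."""
        self.committed = self.head

    def consume(self):
        return bytes(self.buf[:self.committed])

rb = ToyRingBuf(64)
rec = rb.reserve(5)
rec[:] = b"hello"      # the event is built directly in the shared buffer
rb.submit()
print(rb.consume())    # b'hello'
```

This is also why the BPF program must handle bpf_ringbuf_reserve() returning NULL: when the consumer falls behind and the buffer fills, reservation fails and the event is dropped rather than blocking the kernel.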

Using BPF Maps

Map Types

BPF maps provide kernel-user space communication:

// Array map
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __type(key, int);
    __type(value, u64);
    __uint(max_entries, 256);
} counts SEC(".maps");

// Hash map
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __type(key, pid_t);
    __type(value, u64);
    __uint(max_entries, 1024);
} pid_stats SEC(".maps");

// Per-CPU array
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __type(key, int);
    __type(value, u64);
    __uint(max_entries, 64);
} cpu_stats SEC(".maps");

// Ring buffer (Linux 5.8+)
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 << 20);
} events SEC(".maps");
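Per-CPU maps trade userspace convenience for kernel-side speed: each CPU updates its own slot with no atomics or locks, and a reader must sum the slots to get the logical value. A toy model of that contract:

```python
class PerCpuCounter:
    """Toy model of BPF_MAP_TYPE_PERCPU_ARRAY semantics for one key."""

    def __init__(self, ncpus):
        self.slots = [0] * ncpus     # one slot per possible CPU

    def inc(self, cpu, n=1):
        self.slots[cpu] += n         # no lock needed: the slot is CPU-private

    def read(self):
        return sum(self.slots)       # userspace aggregates across CPUs

c = PerCpuCounter(4)
for cpu, n in [(0, 1200), (1, 950), (3, 87)]:
    c.inc(cpu, n)
print(c.read())   # 2237
```

The same aggregation applies to real per-CPU maps: a lookup from userspace returns one value per possible CPU, and the tool is responsible for summing them.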

Accessing Maps from Userspace

#!/usr/bin/env python3
from bcc import BPF
import time

# Load BPF program
b = BPF(src_file="example.c")
b.attach_kprobe(event="__x64_sys_execve", fn_name="trace_execve")

# Poll the map once per second
while True:
    for k, v in b["counts"].items():
        print(f"Index {k.value}: {v.value}")
    time.sleep(1)

Performance Analysis Tools

bpftrace One-Liners

# Count system calls by process
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[comm] = count(); }'

# Trace disk I/O request insertion by process
sudo bpftrace -e 'tracepoint:block:block_rq_insert { @[comm] = count(); }'

# Measure TCP connection lifetime (states: 1 = ESTABLISHED, 7 = CLOSE)
sudo bpftrace -e 'tracepoint:sock:inet_sock_set_state {
    if (args->newstate == 1) {
        @start[args->skaddr] = nsecs;
    }
    if (args->oldstate == 1 && args->newstate == 7 && @start[args->skaddr]) {
        @ = hist(nsecs - @start[args->skaddr]);
        delete(@start[args->skaddr]);
    }
}'

# Major page faults by process
sudo bpftrace -e 'software:major-faults:1 { @[comm] = count(); }'

# Scheduler wakeup-to-run latency, keyed per task
sudo bpftrace -e 'tracepoint:sched:sched_wakeup { @ts[args->pid] = nsecs; }
    tracepoint:sched:sched_switch /@ts[args->next_pid]/ {
        @lat = hist(nsecs - @ts[args->next_pid]);
        delete(@ts[args->next_pid]);
    }'

Custom Tool Development

Build custom observability tools:

#!/usr/bin/env python3
from bcc import BPF
import ctypes as ct

bpf_text = """
#include <linux/sched.h>

struct event {
    char comm[TASK_COMM_LEN];
    char filename[256];
};

BPF_RINGBUF_OUTPUT(events, 8);

TRACEPOINT_PROBE(syscalls, sys_enter_openat) {
    struct event *event = events.ringbuf_reserve(sizeof(struct event));
    if (!event)
        return 0;

    bpf_get_current_comm(event->comm, sizeof(event->comm));
    bpf_probe_read_user_str(event->filename, sizeof(event->filename),
                            args->filename);

    events.ringbuf_submit(event, 0);
    return 0;
}
"""

class Event(ct.Structure):
    _fields_ = [
        ("comm", ct.c_char * 16),       # TASK_COMM_LEN
        ("filename", ct.c_char * 256),
    ]

def handle_event(cpu, data, size):
    event = ct.cast(data, ct.POINTER(Event)).contents
    print(f"{event.comm.decode(errors='replace')}: "
          f"{event.filename.decode(errors='replace')}")

b = BPF(text=bpf_text)
b["events"].open_ring_buffer(handle_event)

print("Tracing openat()... Hit Ctrl-C to end")
while True:
    b.ring_buffer_poll()

BPF in Production

Cilium for Networking

Cilium uses eBPF for container networking:

# Install the Cilium CLI
curl -L -O https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
sudo tar xzvf cilium-linux-amd64.tar.gz -C /usr/local/bin

# Install Cilium into the current Kubernetes cluster
cilium install

# Check status
cilium status

# Inspect eBPF state from inside an agent pod
kubectl -n kube-system exec ds/cilium -- cilium-dbg bpf lb list

Falco for Security

Falco uses eBPF for runtime security:

# Install Falco
curl -s https://falco.org/repo/falcosecurity-3672BA8D.asc | sudo apt-key add -
echo "deb https://download.falco.org/packages/deb stable main" | \
    sudo tee /etc/apt/sources.list.d/falcosecurity.list
sudo apt install falco

# Run with the modern eBPF driver (Falco 0.35+)
sudo falco -o engine.kind=modern_ebpf

Performance Monitoring

Modern observability uses eBPF:

# Use Pixie (CNCF)
curl -s https://install.px.dev | sudo sh
px version

# Use Parca
parca-agent --help

Troubleshooting BPF

Common Issues

Program not loading:

# Check dmesg for errors
dmesg | grep -i bpf

# Verify kernel support
cat /proc/sys/kernel/bpf_stats_enabled
ls /sys/kernel/debug/tracing/events/

# Check permissions
ls -la /sys/fs/bpf/

Verification failures:

# The verifier log is returned through the bpf() syscall; loaders
# such as libbpf and bpftool print it on failure
sudo bpftool prog load prog.bpf.o /sys/fs/bpf/prog

# Common causes:
# - Unbounded loops
# - Possibly-out-of-bounds memory access (missing bounds check)
# - Using a map lookup result without a NULL check
# - Exceeding the 512-byte BPF stack

Debugging Tools

# List loaded programs
bpftool prog list

# Dump program details
bpftool prog show id <id>

# List maps
bpftool map list

# View map contents
bpftool map dump id <id>

# Tracepoint availability
ls /sys/kernel/debug/tracing/events/

# Measure BPF program overhead (kernel 5.1+)
sudo sysctl -w kernel.bpf_stats_enabled=1
sudo bpftool prog show    # now reports run_time_ns and run_cnt

Best Practices

Development

  • Start with bpftrace for exploration
  • Use BCC for production tools
  • Consider libbpf for minimal dependencies
  • Test thoroughly in development
  • Handle errors explicitly in code
  • Use ring buffers for high-frequency events
  • Limit map sizes appropriately

Security

  • Never trust BPF program input
  • Validate all data from userspace
  • Use proper permissions (CAP_BPF)
  • Audit BPF programs in production
  • Monitor BPF system calls
  • Restrict via seccomp

Performance

  • Use per-CPU maps for concurrency
  • Prefer ring buffers over perf events
  • Minimize data copying
  • Batch operations where possible
  • Profile with bpftool
  • Test at scale before production

Conclusion

eBPF represents a fundamental shift in how we extend and observe Linux systems. By enabling safe, efficient kernel execution without kernel module development, eBPF democratizes kernel programming and enables rapid innovation in networking, observability, and security.

From simple tracing one-liners to complex production systems like Cilium, eBPF’s versatility makes it an essential technology for modern Linux professionals. As tooling matures and adoption grows, eBPF will increasingly power the infrastructure running cloud-native applications.

Start with bpftrace for exploration, graduate to BCC or libbpf for production tools, and consider eBPF when building next-generation systems.
