Linux Networking Stack: Architecture, TCP/IP, and Network Interfaces

Introduction

The Linux networking stack is one of the most sophisticated and feature-rich implementations of the TCP/IP protocol suite. From a modest TCP/IP implementation added to the kernel in the early 1990s, it has evolved into a production-grade subsystem powering billions of devices, from embedded systems to cloud infrastructure and supercomputers. Understanding how it works is essential for system administrators, DevOps engineers, network engineers, and software developers who build or operate networked applications.

In 2026, the Linux networking stack continues to evolve, incorporating new protocols, performance optimizations, and security features. This article provides a comprehensive exploration of the Linux networking architecture, examining each layer of the protocol stack, the flow of packets through the kernel, network device drivers, and practical techniques for optimizing network performance.

Linux Networking Architecture Overview

The Linux networking stack implements a layered architecture that mirrors the OSI model but is optimized for the TCP/IP protocol suite. Understanding this architecture is fundamental to debugging network issues, optimizing performance, and implementing network features.

Protocol Layer Mapping

┌──────────────────────────────────────────────────────────────────────┐
│                       LINUX NETWORKING LAYERS                        │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌────────────────────────────────────────────────────────────┐      │
│  │                     Application Layer                      │      │
│  │            HTTP, FTP, SSH, DNS, SMTP, WebSocket            │      │
│  └────────────────────────────────────────────────────────────┘      │
│                                │                                     │
│  ┌────────────────────────────────────────────────────────────┐      │
│  │                      Transport Layer                       │      │
│  │     TCP (Connection-oriented)  |  UDP (Connectionless)     │      │
│  └────────────────────────────────────────────────────────────┘      │
│                                │                                     │
│  ┌────────────────────────────────────────────────────────────┐      │
│  │                       Network Layer                        │      │
│  │              IPv4  |  IPv6  |  ICMP  |  IGMP               │      │
│  └────────────────────────────────────────────────────────────┘      │
│                                │                                     │
│  ┌────────────────────────────────────────────────────────────┐      │
│  │                      Data Link Layer                       │      │
│  │          Ethernet  |  ARP  |  Bridging  |  VLANs           │      │
│  └────────────────────────────────────────────────────────────┘      │
│                                │                                     │
│  ┌────────────────────────────────────────────────────────────┐      │
│  │                       Physical Layer                       │      │
│  │            Network Device Drivers (NICs, WiFi)             │      │
│  └────────────────────────────────────────────────────────────┘      │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

Key Kernel Components

The Linux kernel networking code resides primarily in the net/ directory, with key components including:

# Explore kernel networking code structure
ls /usr/src/linux/net/
# Selected subdirectories (exact contents vary by kernel version):
# core/           - Core networking (sk_buff, sock, etc.)
# ipv4/           - IPv4, TCP, UDP, TCP diagnostics (tcp_diag.c), GRE/IPIP tunnels
# ipv6/           - IPv6 implementation
# wireless/       - Wireless configuration API (cfg80211); drivers live in drivers/net/wireless/
# bridge/         - Ethernet bridging
# netfilter/      - Packet filtering framework (iptables, nftables)
# netlabel/       - NetLabel/SELinux labeling
# sctp/           - SCTP protocol
# dccp/           - Datagram Congestion Control Protocol
# 8021q/          - VLAN tagging
# VXLAN is implemented as a driver under drivers/net/, not in net/

Network Devices and Interfaces

Network interfaces are the entry and exit points for network traffic. Linux supports a rich variety of network device types, from physical hardware to virtual interfaces.

Viewing Network Interfaces

# List all network interfaces
ip link show
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT
#     link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff

# Show interface details
ip -s link show eth0

# View interface statistics
ip -s link show
# RX: bytes packets errors dropped overrun mcast
# TX: bytes packets errors dropped carrier collsns

Interface Types

Linux supports numerous interface types beyond physical Ethernet:

# Virtual Ethernet (veth) - for containers/namespaces
ip link add veth0 type veth peer name veth1
ip netns add container1
ip link set veth1 netns container1

# Bridge - for virtualization/bridging
ip link add br0 type bridge
ip link set eth0 master br0

# VLAN subinterfaces
ip link add eth0.100 link eth0 type vlan id 100

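# Bond/team - link aggregation
# (on many kernels, member interfaces must be down before they can be enslaved)
ip link add bond0 type bond mode active-backup
ip link set eth0 down
ip link set eth0 master bond0
ip link set eth1 down
ip link set eth1 master bond0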

# Tunnel interfaces
ip tuntap add dev tun0 mode tun
ip link add gre1 type gre remote 10.0.0.1 local 10.0.0.2

# WireGuard VPN
ip link add wg0 type wireguard
wg setconf wg0 /etc/wireguard/wg0.conf

# VRF (Virtual Routing and Forwarding)
ip link add vrf-blue type vrf table 100
ip link set eth0 master vrf-blue

Packet Flow Through the Stack

Understanding how packets flow through the Linux networking stack is crucial for debugging and optimization. The journey differs based on whether packets are destined for local delivery or being forwarded.

Receive Path

┌──────────────────────────────────────────────────────────────────────┐
│                         PACKET RECEIVE PATH                          │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  1. NIC receives packet (DMA to ring buffer)                         │
│         │                                                            │
│         ▼                                                            │
│  2. NAPI poll() called (interrupt mitigation)                        │
│         │                                                            │
│         ▼                                                            │
│  3. sk_buff allocated for the packet data                            │
│         │                                                            │
│         ▼                                                            │
│  4. Netfilter PREROUTING (iptables raw/mangle/nat)                   │
│         │                                                            │
│         ▼                                                            │
│  5. Routing decision (routing table lookup)                          │
│         │                                                            │
│         ├──────────────────┐                                         │
│         ▼                  ▼                                         │
│   Local Delivery      Forwarding                                     │
│         │                  │                                         │
│         ▼                  ▼                                         │
│  6. Netfilter       6. Netfilter                                     │
│     INPUT              FORWARD                                       │
│         │                  │                                         │
│         ▼                  ▼                                         │
│  7. Transport layer 7. Netfilter                                     │
│     (TCP/UDP)          POSTROUTING                                   │
│         │                                                            │
│         ▼                                                            │
│  8. Socket receive buffer                                            │
│         │                                                            │
│         ▼                                                            │
│  9. Application recv()                                               │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘
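
Pressure at the NAPI stage of the receive path can be observed in /proc/net/softnet_stat, which prints one row of hexadecimal counters per CPU. The sketch below decodes the first three columns (packets processed, packets dropped because the backlog was full, and time_squeeze events). The function name softnet_summary and the sample rows are illustrative assumptions, not real measurements.

```shell
#!/bin/sh
# Decode softnet_stat-style rows: one row per CPU, hexadecimal columns.
# Column 1 = packets processed, 2 = dropped (backlog queue full),
# 3 = time_squeeze (poll loop ended with work still pending).
softnet_summary() (
    cpu=0
    while read -r processed dropped squeezed _rest; do
        [ -n "$processed" ] || continue
        # printf converts the hex strings when they carry a 0x prefix
        printf 'cpu%d processed=%d dropped=%d time_squeeze=%d\n' \
            "$cpu" "0x$processed" "0x$dropped" "0x$squeezed"
        cpu=$((cpu + 1))
    done
)

# Illustrative sample rows (hypothetical values)
sample='0000a3f1 00000000 00000002 00000000
00001c20 00000005 00000000 00000000'

printf '%s\n' "$sample" | softnet_summary
```

On a live system the same function can be fed with `softnet_summary < /proc/net/softnet_stat`. A steadily growing dropped or time_squeeze count is usually a hint to look at the backlog and NAPI budget settings.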

Transmit Path

# The commands below help observe traffic and drops on both the receive and transmit paths

# View connection tracking stats
cat /proc/net/stat/nf_conntrack
# Or:
conntrack -S

# Monitor packet drops
ethtool -S eth0 | grep -i drop

# Check NIC statistics
ip -s link show eth0

# View socket summary
ss -s

# View socket counts and memory usage
cat /proc/net/sockstat
# TCP: inuse 12345 orphan 0 tw 1234 alloc 45678 mem 890

# Monitor network errors
netstat -i
# Kernel Interface table
# Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
# eth0       1500 0  1234567      0      0      0   987654      0      0      0 BMRU
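
The per-interface counters shown above are also exposed in /proc/net/dev, which is convenient for scripting. The following is a minimal sketch that extracts packet and drop counts per interface from text in that format. The function name dev_drops is hypothetical, and the embedded sample table contains made-up numbers.

```shell
#!/bin/sh
# Print RX/TX packet and drop counters for each interface.
# Input: text in /proc/net/dev format (two header lines, then one row per
# interface: name, 8 receive counters, 8 transmit counters).
dev_drops() {
    awk 'NR > 2 {
        sub(/:/, " ")   # split "eth0:" (or "eth0:123") into separate fields
        printf "%s rx_packets=%s rx_drop=%s tx_packets=%s tx_drop=%s\n",
               $1, $3, $5, $11, $13
    }'
}

# Illustrative sample (hypothetical values)
dev_drops <<'EOF'
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  123456    1000    0    0    0     0          0         0   123456    1000    0    0    0     0       0          0
  eth0: 9876543   20000    0   12    0     0          0       100  5432100   15000    0    3    0     0       0          0
EOF
```

Against a running system, `dev_drops < /proc/net/dev` gives the live values. Sampling twice and comparing is usually more informative than a single reading, since the counters are cumulative.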

TCP/IP Implementation

Linux provides a complete and highly tunable TCP/IP implementation. Understanding the state machine, congestion control algorithms, and buffer management is essential for network optimization.

TCP Connection Lifecycle

# View TCP connections
ss -tunapl

# Connection states:
# LISTEN    - Server waiting for connections
# ESTABLISHED - Active connection
# TIME_WAIT - Connection closed, waiting for late packets
# CLOSE_WAIT - Remote closed, waiting for the local application to close
# SYN_SENT  - Client sent SYN
# SYN_RECV  - Server received SYN, sent SYN-ACK

# View the raw TCP socket tables (hex-encoded addresses and states)
cat /proc/net/tcp
cat /proc/net/tcp6

# Per-connection TCP internals (view with ss)
ss -ti
# Shows congestion control, rtt, cwnd, etc. (add -o for timers)
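
The entries in /proc/net/tcp are hexadecimal, so they are hard to read directly. As a rough sketch, the helpers below translate the state column and an IPv4 "ADDR:PORT" field into readable form. The function names are hypothetical, and the address decoding assumes a little-endian host (x86 and most ARM systems), where the address bytes appear in reverse order.

```shell
#!/bin/sh
# Map the hexadecimal "st" column of /proc/net/tcp to a TCP state name.
tcp_state_name() {
    case "$1" in
        01) echo ESTABLISHED ;;
        02) echo SYN_SENT ;;
        03) echo SYN_RECV ;;
        04) echo FIN_WAIT1 ;;
        05) echo FIN_WAIT2 ;;
        06) echo TIME_WAIT ;;
        07) echo CLOSE ;;
        08) echo CLOSE_WAIT ;;
        09) echo LAST_ACK ;;
        0A) echo LISTEN ;;
        0B) echo CLOSING ;;
        *)  echo UNKNOWN ;;
    esac
}

# Decode an IPv4 endpoint such as 0100007F:1F90.
# Assumption: little-endian host, so the first octet is the last hex pair.
decode_ipv4_endpoint() (
    addr=${1%:*}
    port=${1#*:}
    o1=$(printf '%s' "$addr" | cut -c7-8)
    o2=$(printf '%s' "$addr" | cut -c5-6)
    o3=$(printf '%s' "$addr" | cut -c3-4)
    o4=$(printf '%s' "$addr" | cut -c1-2)
    printf '%d.%d.%d.%d:%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4" "0x$port"
)

# Example: a listener on the loopback address, port 8080
printf '%s %s\n' "$(decode_ipv4_endpoint 0100007F:1F90)" "$(tcp_state_name 0A)"
```

In day-to-day work `ss` does this decoding already; the sketch is mainly useful for understanding the file format or for minimal environments where `ss` is unavailable.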

TCP Congestion Control

Linux supports multiple congestion control algorithms:

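# List available algorithms
sysctl net.ipv4.tcp_available_congestion_control
# Example output: net.ipv4.tcp_available_congestion_control = reno cubic bbr
# (bbr is listed only once the tcp_bbr module is loaded or built in)

# View current algorithm
sysctl net.ipv4.tcp_congestion_control

# Switch to BBR (Bottleneck Bandwidth and RTT)
modprobe tcp_bbr
sysctl -w net.ipv4.tcp_congestion_control=bbr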

# Make persistent
echo "net.ipv4.tcp_congestion_control=bbr" >> /etc/sysctl.conf

# View TCP statistics
cat /proc/net/netstat | head -2
# TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed

UDP Implementation

# View the raw UDP socket tables
cat /proc/net/udp
cat /proc/net/udp6

# View UDP protocol statistics
netstat -su

# Monitor UDP with ss
ss -uap

# UDP buffer sizes
sysctl net.ipv4.udp_rmem_min
sysctl net.ipv4.udp_wmem_min

Socket Programming Basics

Applications interact with the network through sockets. Understanding socket creation, binding, and data transfer is fundamental to network application development.

Creating and Using Sockets

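#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Server socket creation (TCP): accept one connection on port 8080 and reply
int run_server(void) {
    // AF_INET - IPv4, SOCK_STREAM - TCP (connection-oriented), 0 - default protocol
    int server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server_fd < 0) { perror("socket"); return -1; }

    // Set socket options: allow rebinding the port right after a restart
    int opt = 1;
    setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    // Bind to address
    struct sockaddr_in address = {0};
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = htonl(INADDR_ANY);  // Any interface
    address.sin_port = htons(8080);               // Port 8080
    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0 ||
        listen(server_fd, 128) < 0) {             // 128 = backlog queue
        perror("bind/listen");
        close(server_fd);
        return -1;
    }

    // Accept connection
    int client_fd = accept(server_fd, NULL, NULL);
    if (client_fd < 0) { perror("accept"); close(server_fd); return -1; }

    // Read/Write
    char buffer[1024];
    const char *response = "hello\n";             // placeholder reply
    if (read(client_fd, buffer, sizeof(buffer)) > 0)
        write(client_fd, response, strlen(response));

    close(client_fd);
    close(server_fd);
    return 0;
}

// Client socket creation (TCP): connect to 127.0.0.1:8080 (placeholder address)
int connect_client(void) {
    struct sockaddr_in server_addr = {0};
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(8080);
    inet_pton(AF_INET, "127.0.0.1", &server_addr.sin_addr);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    if (connect(fd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;  // caller reads/writes, then close(fd)
}

int main(void) { return run_server() == 0 ? 0 : 1; }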

Socket Options

# Common socket options with ss
ss -tlop 'sport = :80'
# Shows sockets listening on port 80 with options

# Important SO_* options:
# SO_REUSEADDR - Allow address reuse
# SO_REUSEPORT - Allow port reuse (load balancing)
# SO_KEEPALIVE - Enable TCP keepalive
# SO_SNDBUF / SO_RCVBUF - Buffer sizes
# SO_LINGER - Linger on close
# SO_MARK - Mark packets for fwmark

Routing

Linux implements a powerful routing subsystem that supports complex routing scenarios, including multiple routing tables, policy routing, and various routing protocols.

Routing Tables

# View main routing table
ip route show
# default via 192.168.1.1 dev eth0
# 192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.100

# View all tables
ip route show table all

# View specific table
ip route show table local
ip route show table main  # Same as plain "ip route show"

# Add route
ip route add 10.0.0.0/24 via 192.168.1.1
ip route add default via 192.168.1.1

# Delete route
ip route del 10.0.0.0/24

# Add persistent route (Debian)
# In /etc/network/interfaces, under the interface's "iface" stanza:
up ip route add 10.0.0.0/24 via 192.168.1.1
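
When several routes cover the same destination, the most specific prefix wins (longest-prefix match). The sketch below illustrates only that rule; the kernel's actual lookup also takes metrics, policy rules, and multiple tables into account. The helper names ip_to_int, in_prefix, and longest_match are hypothetical and not part of iproute2, and the default route is written as 0.0.0.0/0.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 lies inside CIDR prefix $2 (e.g. 10.0.0.0/24).
in_prefix() (
    net=${2%/*}
    len=${2#*/}
    if [ "$len" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    fi
    a=$(ip_to_int "$1")
    n=$(ip_to_int "$net")
    [ $(( a & mask )) -eq $(( n & mask )) ]
)

# Read prefixes from stdin and print the most specific one containing $1.
longest_match() (
    best=""
    best_len=-1
    while read -r prefix; do
        len=${prefix#*/}
        if in_prefix "$1" "$prefix" && [ "$len" -gt "$best_len" ]; then
            best=$prefix
            best_len=$len
        fi
    done
    printf '%s\n' "$best"
)

# Illustrative route prefixes
routes='0.0.0.0/0
10.0.0.0/8
10.0.0.0/24'

printf '%s\n' "$routes" | longest_match 10.0.0.5
```

To see which route the kernel itself selects for a destination, `ip route get <address>` remains the authoritative check.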

Policy Routing

Linux supports multiple routing tables based on packet attributes:

# Create additional routing table
echo "100 vpn" >> /etc/iproute2/rt_tables

# Add rule to use table based on source
ip rule add from 10.0.0.0/24 table vpn
ip rule add fwmark 100 table vpn

# Add routes to vpn table
ip route add default via 10.0.0.1 table vpn

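# List rules (sorted by priority; rules added without an explicit
# priority are placed just ahead of the main table's rule)
ip rule show
# 0:      from all lookup local
# 32764:  from all fwmark 0x64 lookup vpn
# 32765:  from 10.0.0.0/24 lookup vpn
# 32766:  from all lookup main
# 32767:  from all lookup default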

IP Virtual Server (LVS/IPVS)

Linux supports built-in load balancing:

# Load IPVS module
modprobe ip_vs
modprobe ip_vs_rr  # Round robin
modprobe ip_vs_wrr  # Weighted round robin
modprobe ip_vs_sh   # Source hash

# Add virtual service
ipvsadm -A -t 192.168.1.100:80 -s rr

# Add real server
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.1:80 -m
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.2:80 -m

# View virtual server table
ipvsadm -L -n

# View IPVS connections
ipvsadm -L -n -c

Network Namespaces

Network namespaces provide isolation for network resources, enabling containerization and network virtualization.

Creating Namespaces

# Create network namespace
ip netns add container1

# List namespaces
ip netns list

# Execute command in namespace
ip netns exec container1 ip link
ip netns exec container1 ping 8.8.8.8

# Add interface to namespace
ip link set eth1 netns container1

# Delete namespace
ip netns del container1

Namespace Use Cases

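# Create isolated network environment
ip netns add router
ip netns add left
ip netns add right

# Create veth pairs (router end first, peer end second)
ip link add veth-r-l type veth peer name veth-l-r
ip link add veth-r-right type veth peer name veth-right-r

# Move each end into its namespace
ip link set veth-r-l netns router
ip link set veth-r-right netns router
ip link set veth-l-r netns left
ip link set veth-right-r netns right

# Configure interfaces
ip netns exec left ip addr add 10.0.1.2/24 dev veth-l-r
ip netns exec right ip addr add 10.0.2.2/24 dev veth-right-r
ip netns exec router ip addr add 10.0.1.1/24 dev veth-r-l
ip netns exec router ip addr add 10.0.2.1/24 dev veth-r-right

# Bring the links up
ip netns exec left ip link set veth-l-r up
ip netns exec right ip link set veth-right-r up
ip netns exec router ip link set veth-r-l up
ip netns exec router ip link set veth-r-right up

# Route between the two sides through the router namespace
ip netns exec router sysctl -w net.ipv4.ip_forward=1
ip netns exec left ip route add default via 10.0.1.1
ip netns exec right ip route add default via 10.0.2.1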

Network Performance Optimization

Optimizing Linux network performance requires understanding both kernel parameters and hardware capabilities.

Tuning Network Buffers

# /etc/sysctl.d/99-network.conf

# TCP buffer sizes (auto-tuned but can be set)
net.ipv4.tcp_rmem = 4096 131072 6291456
net.ipv4.tcp_wmem = 4096 16384 6291456
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 262144
net.core.wmem_default = 262144

# Enable TCP window scaling
net.ipv4.tcp_window_scaling = 1

# TCP timestamps (accurate RTT measurement)
net.ipv4.tcp_timestamps = 1

# TCP SACK (selective acknowledgments)
net.ipv4.tcp_sack = 1
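
A common starting point for choosing maximum buffer sizes is the bandwidth-delay product (BDP): the number of bytes that can be in flight on a path at once. The sketch below computes it from a link speed and round-trip time; the function name bdp_bytes and the example figures are assumptions for illustration, not recommendations.

```shell
#!/bin/sh
# bdp_bytes <bandwidth in Mbit/s> <round-trip time in ms>
# BDP = bandwidth (bytes per second) * RTT (seconds)
bdp_bytes() {
    # Mbit/s * 1000000 / 8 gives bytes per second; ms / 1000 gives seconds
    echo $(( $1 * 1000000 / 8 * $2 / 1000 ))
}

# Example: 1 Gbit/s path with a 50 ms round-trip time
bdp_bytes 1000 50
```

For the example path the result is 6250000 bytes, which is close to the 6291456-byte maximum in the tcp_rmem and tcp_wmem lines above. Paths with a larger product may need higher maximums to keep the link full.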

NIC Tuning

# Increase ring buffer (for high throughput)
ethtool -G eth0 rx 4096 tx 4096

# Enable interrupt coalescing
ethtool -C eth0 rx-usecs 100 tx-usecs 100

# Enable offloads
ethtool -K eth0 gro on gso on tso on

# Set channel count
ethtool -L eth0 combined 4

# View current settings
ethtool -g eth0  # Ring
ethtool -c eth0  # Coalesce
ethtool -k eth0  # Offloads

Connection Tracking

# sysctl settings (e.g. in /etc/sysctl.d/99-network.conf)

# Increase connection tracking table size
net.netfilter.nf_conntrack_max = 1048576
net.netfilter.nf_conntrack_tcp_timeout_established = 7200

# Tune timeouts
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 15

# Shell command: view the current number of tracked connections
cat /proc/sys/net/netfilter/nf_conntrack_count
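
Whether the table size needs raising can be judged by comparing the current entry count with the configured maximum. The following is a small sketch of that check; the function name conntrack_usage is hypothetical, and the fallback numbers are sample values used only when the conntrack files are not present.

```shell
#!/bin/sh
# conntrack_usage <count> <max>: print table utilisation as an integer percentage
conntrack_usage() {
    echo $(( $1 * 100 / $2 ))
}

count_file=/proc/sys/net/netfilter/nf_conntrack_count
max_file=/proc/sys/net/netfilter/nf_conntrack_max

if [ -r "$count_file" ] && [ -r "$max_file" ]; then
    # Live values (available when the nf_conntrack module is loaded)
    conntrack_usage "$(cat "$count_file")" "$(cat "$max_file")"
else
    # Sample values for illustration only
    conntrack_usage 786432 1048576
fi
```

When the table fills up completely, new connections are dropped, so a utilisation that regularly approaches 100 is a reason to raise nf_conntrack_max or shorten the timeouts.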

Network Monitoring and Debugging

Essential Monitoring Commands

# Real-time packet capture
tcpdump -i eth0 -n
tcpdump -i eth0 port 80
tcpdump -i eth0 host 192.168.1.1

# Connection statistics
ss -tunapl
netstat -antp
netstat -s

# Interface statistics
ip -s link show eth0
ethtool -S eth0

# Route debugging
ip route get 8.8.8.8
traceroute 8.8.8.8

# DNS debugging
dig example.com
nslookup example.com

BPF (Berkeley Packet Filter)

# Use bpftrace for custom tracing
# Install: apt install bpftrace

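# Trace TCP connections
bpftrace -e 'kprobe:tcp_connect { printf("TCP connect: %s\n", comm); }'

# Count packets queued for transmission, per process
bpftrace -e 'tracepoint:net:net_dev_queue { @[comm] = count(); }'

# Trace packet drops (count kernel stacks that freed an skb)
bpftrace -e 'tracepoint:skb:kfree_skb { @[kstack] = count(); }'

# Use perf for profiling (system-wide, for 10 seconds)
perf record -g -a -e net:netif_receive_skb -e net:net_dev_xmit -- sleep 10
perf report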

Netfilter and Packet Filtering

Linux includes a powerful packet filtering framework through Netfilter, providing stateful inspection, NAT, and packet modification.

iptables Basics

# List rules
iptables -L -n -v
iptables -t nat -L -n -v

# Allow established connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow SSH (port 22)
iptables -A INPUT -p tcp --dport 22 -j ACCEPT

# Allow HTTP/HTTPS
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# Drop everything else
iptables -A INPUT -j DROP

# NAT (Source NAT)
iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j MASQUERADE

# Port redirection (local port 8080 to local port 80)
iptables -t nat -A PREROUTING -p tcp --dport 8080 -j REDIRECT --to-ports 80

nftables (Modern Alternative)

# View rules
nft list ruleset

# Create table
nft add table ip filter

# Add base chain (a policy is only valid together with a type, hook, and priority)
nft add chain ip filter input '{ type filter hook input priority 0; policy drop; }'

# Add rule
nft add rule ip filter input ct state established,related accept
nft add rule ip filter input tcp dport 22 accept

# NAT
nft add table ip nat
nft add chain ip nat postrouting '{ type nat hook postrouting priority 100; policy accept; }'
nft add rule ip nat postrouting oifname "eth0" masquerade

Conclusion

The Linux networking stack is a remarkable piece of engineering, providing robust, scalable, and highly configurable networking capabilities that rival specialized network appliances. From the physical layer where network interface controllers interact with the kernel, through the complex TCP/IP implementations, to the application layer where sockets connect programs to the network, each component plays a crucial role in delivering network functionality.

Understanding the packet flow through the stack, from device driver to application socket and back, enables effective troubleshooting and optimization. The key concepts covered (network devices and interfaces, socket programming, routing, namespaces, and performance tuning) form the foundation for working with Linux networking in production environments.

As network requirements continue to grow with cloud-native applications, containers, and high-speed networking, the Linux networking stack evolves to meet these challenges. Features like eBPF, modern congestion control algorithms like BBR, and improved hardware offloading ensure Linux remains at the forefront of networking technology. Whether you’re debugging a connection issue, optimizing a high-performance web server, or designing a container networking architecture, the knowledge of Linux networking fundamentals is invaluable.
