Introduction
Monitoring is essential for keeping Linux systems healthy. This guide covers command-line tools for ad hoc inspection, historical reporting with sar, a Prometheus and Grafana stack, and alerting rules.
Command Line Tools
Essential Commands
# System resource usage
top # Interactive process viewer
htop # Enhanced top (colorful)
atop # Advanced top
glances # Cross-platform monitoring
# CPU
mpstat -P ALL 1 # Per-CPU stats
sar -u 1 # CPU utilization
# Memory
free -h # Memory usage
vmstat 1 # Virtual memory stats
# Disk
df -h # Disk usage
iostat -x 1 # I/O statistics
du -sh * # Directory sizes
# Network
iftop # Per-connection bandwidth (usually needs root)
netstat -tuln # Listening ports (legacy net-tools)
ss -tuln # Listening ports (modern replacement for netstat)
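Tools such as top, mpstat, and sar derive CPU utilization from the counters in /proc/stat. As a rough sketch of that calculation, the helper below takes two samples of the aggregate cpu line and prints the share of non-idle time between them. The name cpu_busy_pct is hypothetical and not part of any package. The sketch assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal) and treats only the idle field as idle, so iowait counts as busy.

```shell
# cpu_busy_pct "<first cpu line>" "<second cpu line>"
# Hypothetical helper: each argument is the aggregate "cpu ..." line
# from /proc/stat, taken at two different moments.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Sum user..steal (fields 2-9); guest time is already part of user.
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            if (NR == 1) { total1 = total; idle1 = $5 }
            else         { total2 = total; idle2 = $5 }
        }
        END {
            dt = total2 - total1
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (dt - (idle2 - idle1)) / dt
        }'
}

# Example use on a live system (one-second window):
#   a=$(grep '^cpu ' /proc/stat); sleep 1; b=$(grep '^cpu ' /proc/stat)
#   cpu_busy_pct "$a" "$b"
```

The result should be close to 100 minus the %idle column that mpstat or sar -u report for the same interval.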
htop Customization
# Install htop
sudo apt install htop
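# Custom htop config
# ~/.config/htop/htoprc uses plain key=value lines, not YAML.
# htop rewrites this file on exit, so the Setup screen (F2) is the safer way to change it.
show_cpu_usage=1
show_cpu_frequency=1
show_cpu_temperature=1
detailed_cpu_time=1
# Memory usage is shown by the Memory meter in the header rather than by a show_* flag.
# Columns are stored as numeric field IDs. This list corresponds to:
# PID USER PRIORITY NICE M_SIZE M_RESIDENT STATE PERCENT_CPU PERCENT_MEM TIME Command
fields=0 48 17 18 38 39 2 46 47 49 1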
System Monitoring
SAR (System Activity Reporter)
# Install
sudo apt install sysstat
# Enable (on Debian/Ubuntu, also set ENABLED="true" in /etc/default/sysstat)
sudo systemctl enable sysstat
sudo systemctl start sysstat
# CPU report
sar -u 1 5 # 5 reports, 1 second apart
sar -u -s 08:00:00 -e 12:00:00 # Today's recorded data between 08:00 and 12:00
# Memory
sar -r 1
# Disk I/O
sar -d 1
# Network
sar -n DEV 1
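For scripting, the figure of interest in a sar -u run is usually the final Average line. The sketch below pulls the %idle value from that line and prints the busy percentage. The name sar_avg_busy is hypothetical, and the sketch assumes the default column layout where %idle is the last column, which is what sar prints under LC_ALL=C.

```shell
# sar_avg_busy: read `sar -u` output on stdin and print 100 - %idle
# taken from the "Average:" line for all CPUs. Hypothetical helper.
sar_avg_busy() {
    awk '$1 == "Average:" && $2 == "all" { printf "%.2f\n", 100 - $NF }'
}

# Example use (forcing the C locale keeps the column layout predictable):
#   LC_ALL=C sar -u 1 5 | sar_avg_busy
```

If no Average line is present, for example when sar is interrupted early, the function prints nothing.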
Monitoring Stack
Prometheus + Grafana
# docker-compose.yml
version: '3'
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      # Change this password before exposing Grafana
      - GF_SECURITY_ADMIN_PASSWORD=admin
# Named volumes must be declared at the top level
volumes:
  prometheus_data:
  grafana_data:
Prometheus Configuration
# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'myservice'
    static_configs:
      - targets: ['myservice:8080']
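Each target in scrape_configs is expected to serve its metrics as plain text over HTTP, by default at /metrics. To check by hand what Prometheus would scrape, a small filter such as the following can pull a single sample out of that text. This is a sketch only: metric_value is a hypothetical name, it handles unlabelled samples only, and the URL in the usage comment assumes node_exporter is reachable on localhost:9100.

```shell
# metric_value NAME: read Prometheus text exposition format on stdin and
# print the value of the first unlabelled sample named NAME.
# Hypothetical helper; "# HELP" and "# TYPE" lines are skipped because
# their first field is "#", not the metric name.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example use (assumes node_exporter listens on localhost:9100):
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```

Samples with labels, such as node_cpu_seconds_total{cpu="0",mode="idle"}, would need the full series name including the label set as NAME.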
Alerting
Prometheus Rules
# alerts.yml (Prometheus only loads it if prometheus.yml lists it under rule_files)
groups:
  - name: example
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 85
        for: 5m
        labels:
          severity: warning
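The HighMemoryUsage expression is (MemTotal - MemAvailable) / MemTotal * 100. The same figure can be computed locally from /proc/meminfo, which is handy for sanity-checking the 85 percent threshold on a given host. A minimal sketch, where mem_used_pct is a hypothetical helper name and the optional file argument exists only to make the function easy to try against saved data:

```shell
# mem_used_pct [FILE]: print used memory as a percentage, computed as
# (MemTotal - MemAvailable) / MemTotal * 100.
# Hypothetical helper; FILE defaults to /proc/meminfo.
mem_used_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%.1f\n", 100 * (total - avail) / total
        }' "${1:-/proc/meminfo}"
}

# Example use on a live system:
#   mem_used_pct
```

Both values in /proc/meminfo are reported in kB, so the units cancel and no conversion to bytes is needed.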
Conclusion
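Monitoring is crucial for system reliability. Use the command-line tools for quick inspection, sar for historical data, and Prometheus with Grafana and alert rules for continuous visibility into system health.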