⚡ Calmops

Performance Testing and Benchmarking: k6, Artillery, and Autocannon

Introduction

Performance testing answers questions your unit tests can’t: “How many concurrent users can my API handle?” and “Where does it break?” Without load testing, you discover your limits in production, at the worst possible time.

Types of performance tests:

  • Load test: normal expected load (verify it works)
  • Stress test: beyond normal load (find the breaking point)
  • Spike test: sudden traffic surge (simulate viral events)
  • Soak test: sustained load over hours (find memory leaks)

k6: The Modern Load Testing Tool

k6 is the best tool for most teams: it uses JavaScript for test scripts, has excellent output, and integrates with CI/CD.

# Install
brew install k6                    # macOS
sudo apt install k6                # Ubuntu (after adding the Grafana apt repo)
docker run grafana/k6 run -        # Docker

Basic Load Test

// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

// Test configuration
export const options = {
    vus: 10,           // 10 virtual users
    duration: '30s',   // run for 30 seconds
};

// This function runs once per virtual user per iteration
export default function () {
    const response = http.get('http://localhost:3000/api/users');

    // Assertions
    check(response, {
        'status is 200':          (r) => r.status === 200,
        'response time < 500ms':  (r) => r.timings.duration < 500,
        'has users array':        (r) => JSON.parse(r.body).length > 0,
    });

    sleep(1);  // wait 1 second between iterations
}
k6 run load-test.js

Ramp-Up Scenario

export const options = {
    stages: [
        { duration: '30s', target: 20 },   // ramp up to 20 users
        { duration: '1m',  target: 100 },  // ramp up to 100 users
        { duration: '2m',  target: 100 },  // stay at 100 users
        { duration: '30s', target: 0 },    // ramp down
    ],
    thresholds: {
        // Fail the test if these are exceeded
        http_req_duration: ['p(95)<500'],  // 95% of requests under 500ms
        http_req_failed:   ['rate<0.01'],  // less than 1% errors
    },
};
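
The same stages mechanism covers the other test types from the introduction. A rough sketch with placeholder durations and targets (tune these to your own system; they are illustrative, not recommendations):

```javascript
// Stage shapes for the remaining test types from the introduction.
const stressStages = [
    { duration: '2m', target: 100 },   // normal load
    { duration: '2m', target: 200 },   // beyond normal
    { duration: '2m', target: 400 },   // keep climbing until something breaks
];
const spikeStages = [
    { duration: '10s', target: 500 },  // sudden surge
    { duration: '1m',  target: 500 },  // hold the spike
    { duration: '10s', target: 0 },    // drop off
];
const soakStages = [
    { duration: '5m', target: 50 },    // ramp up
    { duration: '4h', target: 50 },    // sustained load: watch memory over time
    { duration: '5m', target: 0 },     // ramp down
];

console.log(stressStages.length, spikeStages.length, soakStages.length); // 3 3 3
```

Drop whichever array you need into `options.stages` of a k6 script.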

POST Request with Authentication

import http from 'k6/http';
import { check } from 'k6';

const BASE_URL = 'http://localhost:3000';

// Setup: runs once before the test
export function setup() {
    const loginRes = http.post(`${BASE_URL}/api/auth/login`, JSON.stringify({
        email: '[email protected]',
        password: 'password123',
    }), { headers: { 'Content-Type': 'application/json' } });

    check(loginRes, { 'login successful': (r) => r.status === 200 });
    return { token: loginRes.json('token') };
}

export default function (data) {
    const headers = {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${data.token}`,
    };

    // Create an order
    const createRes = http.post(`${BASE_URL}/api/orders`,
        JSON.stringify({ productId: 1, quantity: 2 }),
        { headers }
    );

    check(createRes, {
        'order created':    (r) => r.status === 201,
        'has order id':     (r) => r.json('id') !== undefined,
        'fast response':    (r) => r.timings.duration < 1000,
    });
}

Custom Metrics

import http from 'k6/http';
import { Counter, Rate, Trend } from 'k6/metrics';

const orderErrors   = new Counter('order_errors');
const orderDuration = new Trend('order_duration_ms');
const successRate   = new Rate('order_success_rate');

const headers = { 'Content-Type': 'application/json' };
const payload = JSON.stringify({ productId: 1, quantity: 2 });

export default function () {
    const start = Date.now();
    const res = http.post('http://localhost:3000/api/orders', payload, { headers });
    const duration = Date.now() - start;

    orderDuration.add(duration);

    if (res.status === 201) {
        successRate.add(1);
    } else {
        successRate.add(0);
        orderErrors.add(1);
    }
}

k6 Output

✓ status is 200
✓ response time < 500ms

checks.........................: 98.50% ✓ 5910 ✗ 89
data_received..................: 12 MB  398 kB/s
data_sent......................: 1.2 MB 40 kB/s
http_req_blocked...............: avg=1.2ms   min=1µs    med=3µs    max=1.2s
http_req_duration..............: avg=245ms   min=12ms   med=198ms  max=3.2s
  { expected_response:true }...: avg=241ms   min=12ms   med=195ms  max=3.2s
http_req_failed................: 1.48%  ✓ 89 ✗ 5910
http_reqs......................: 5999   199.9/s
iteration_duration.............: avg=1.25s   min=1.01s  med=1.2s   max=4.2s
vus............................: 100    min=0 max=100
vus_max........................: 100    min=100 max=100

What to look for:

  • http_req_duration p(95): 95th percentile latency (the most important number)
  • http_req_failed: error rate (should be < 1%)
  • http_reqs: throughput (requests per second)
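
These targets can be encoded as thresholds so k6 fails, or even aborts, the run on its own. A sketch of the threshold spec (the long object form with abortOnFail and delayAbortEval is real k6 syntax; the limits here are illustrative):

```javascript
// Threshold spec mirroring the checklist above.
// In a k6 script this object goes in `export const options = { thresholds }`.
const thresholds = {
    http_req_duration: [{
        threshold: 'p(95)<500',   // p95 latency budget
        abortOnFail: true,        // stop the run as soon as it's breached...
        delayAbortEval: '30s',    // ...but let the metric stabilize first
    }],
    http_req_failed: ['rate<0.01'],  // < 1% errors
};

console.log(thresholds.http_req_failed[0]); // rate<0.01
```

With abortOnFail set, a badly broken build stops in seconds instead of burning the full test duration.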

Artillery: YAML-Based Load Testing

Artillery uses YAML configuration, a good fit for teams that prefer declarative tests:

npm install -g artillery
# load-test.yml
config:
  target: 'http://localhost:3000'
  phases:
    - duration: 60
      arrivalRate: 10    # 10 new users per second
      name: "Warm up"
    - duration: 120
      arrivalRate: 50    # 50 new users per second
      name: "Ramp up"
    - duration: 60
      arrivalRate: 100   # 100 new users per second
      name: "Peak load"
  defaults:
    headers:
      Content-Type: 'application/json'

scenarios:
  - name: "Browse and purchase"
    weight: 70  # 70% of users follow this flow
    flow:
      - get:
          url: '/api/products'
          expect:
            - statusCode: 200
      - post:
          url: '/api/cart'
          json:
            productId: 1
            quantity: 1
      - post:
          url: '/api/orders'
          json:
            paymentMethod: 'card'
          capture:
            - json: '$.id'
              as: 'orderId'
      - get:
          url: '/api/orders/{{ orderId }}'

  - name: "Just browse"
    weight: 30  # 30% of users just browse
    flow:
      - get:
          url: '/api/products'
      - get:
          url: '/api/products/1'
artillery run load-test.yml
artillery run load-test.yml --output results.json
artillery report results.json  # generates HTML report

Autocannon: Fast HTTP Benchmarking

Autocannon is great for quick benchmarks of a single endpoint:

npm install -g autocannon

# Basic benchmark: 10 connections, 30 seconds
autocannon -c 10 -d 30 http://localhost:3000/api/users

# More connections
autocannon -c 100 -d 60 http://localhost:3000/api/users

# POST request
autocannon -c 10 -d 30 \
    -m POST \
    -H "Content-Type: application/json" \
    -b '{"name":"test"}' \
    http://localhost:3000/api/users

Output:

Running 30s test @ http://localhost:3000/api/users
10 connections

┌─────────┬──────┬──────┬───────┬──────┬─────────┬─────────┬──────────┐
│ Stat    │ 2.5% │ 50%  │ 97.5% │ 99%  │ Avg     │ Stdev   │ Max      │
├─────────┼──────┼──────┼───────┼──────┼─────────┼─────────┼──────────┤
│ Latency │ 8 ms │ 12ms │ 28 ms │ 45ms │ 13.2 ms │ 8.1 ms  │ 312.3 ms │
└─────────┴──────┴──────┴───────┴──────┴─────────┴─────────┴──────────┘
┌───────────┬─────────┬─────────┬─────────┬────────┬─────────┬───────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%  │ Avg     │ Stdev │
├───────────┼─────────┼─────────┼─────────┼────────┼─────────┼───────┤
│ Req/Sec   │ 612     │ 650     │ 756     │ 820    │ 752.3   │ 48.2  │
├───────────┼─────────┼─────────┼─────────┼────────┼─────────┼───────┤
│ Bytes/Sec │ 1.23 MB │ 1.31 MB │ 1.52 MB │ 1.65MB │ 1.51 MB │ 97 kB │
└───────────┴─────────┴─────────┴─────────┴────────┴─────────┴───────┘

22.6k requests in 30.01s, 45.4 MB read

Interpreting Results

Key Metrics

Metric         What it means                        Target
p50 (median)   Half of requests faster than this    < 100ms for APIs
p95            95% of requests faster than this     < 500ms
p99            99% of requests faster than this     < 1000ms
Error rate     % of failed requests                 < 0.1%
Throughput     Requests per second                  Depends on requirements

Why p95/p99 matter more than average: Averages hide outliers. If 1% of users wait 10 seconds, that’s a real problem even if the average is 100ms.
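
The arithmetic is easy to verify with a nearest-rank percentile (a simplified definition; k6 interpolates between samples):

```javascript
// 98 fast requests and 2 pathological ones, latencies in ms.
const samples = Array(98).fill(100).concat([10000, 10000]);

const avg = samples.reduce((a, b) => a + b, 0) / samples.length;

// Nearest-rank percentile: the value below which p% of samples fall.
function percentile(values, p) {
    const sorted = [...values].sort((a, b) => a - b);
    return sorted[Math.ceil((p / 100) * sorted.length) - 1];
}

console.log(avg);                     // 298   -> looks healthy
console.log(percentile(samples, 50)); // 100
console.log(percentile(samples, 99)); // 10000 -> the outliers the average hid
```

The average barely moves, while p99 exposes the users having a terrible time.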

Finding Bottlenecks

# While running load test, monitor:

# CPU and memory
htop
# or
docker stats

# Database connections
# PostgreSQL
SELECT count(*) FROM pg_stat_activity;

# Node.js event loop lag
# Add to your app:
const { monitorEventLoopDelay } = require('perf_hooks');
const h = monitorEventLoopDelay({ resolution: 20 });
h.enable();
setInterval(() => {
    console.log('Event loop delay p99:', h.percentile(99) / 1e6, 'ms');
    h.reset();
}, 5000);

Common Bottlenecks and Fixes

Symptom                        Likely cause                         Fix
Latency increases with load    Database connection pool exhausted   Increase pool size
Error rate spikes at N users   Memory exhaustion                    Add caching, optimize queries
Consistent high latency        Slow database queries                Add indexes, optimize queries
Latency spikes periodically    Garbage collection                   Reduce allocations, tune GC
Errors after sustained load    Memory leak                          Profile with clinic.js
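
For the pool-size fix, Little's law gives a starting point: connections needed ≈ throughput × time each request holds a connection. A back-of-envelope helper (hypothetical, not from any library; the 1.5× headroom is an assumption to absorb bursts):

```javascript
// Little's law: concurrency ≈ arrival rate × time in system.
// Hypothetical sizing helper; headroom multiplier is a rule of thumb.
function requiredPoolSize(requestsPerSecond, avgQuerySeconds, headroom = 1.5) {
    return Math.ceil(requestsPerSecond * avgQuerySeconds * headroom);
}

// 200 req/s where each request holds a connection for ~50ms:
console.log(requiredPoolSize(200, 0.05)); // 15
```

If the load test shows latency climbing while the database itself is idle, compare this estimate against your configured pool size.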

CI/CD Integration

# .github/workflows/performance.yml
name: Performance Test

on:
  pull_request:
    branches: [main]

jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Start application
        run: docker compose up -d
        
      - name: Wait for app
        run: sleep 10

      - name: Run k6 load test
        uses: grafana/k6-action@v0.3.1   # pin to the version your team uses
        with:
          filename: tests/load-test.js
          flags: --summary-export=results.json

      - name: Check results
        run: |
          # Fail if p95 > 500ms or error rate > 1%.
          # --summary-export writes a single JSON summary object;
          # --out json would write a stream of data points instead.
          node -e "
            const results = require('./results.json');
            const p95 = results.metrics.http_req_duration['p(95)'];
            const errorRate = results.metrics.http_req_failed.value;
            if (p95 > 500) process.exit(1);
            if (errorRate > 0.01) process.exit(1);
          "
