Introduction
Performance testing verifies that your system can handle expected load while meeting response time targets. In 2026, performance testing is not optional: it is a CI gate that catches regressions before they reach production. This guide covers load testing with k6 and Artillery, two modern scriptable tools for API and web performance testing, and also touches on Locust and on frontend checks with Lighthouse CI and Playwright.
Effective performance testing requires understanding the difference between load, stress, spike, and soak testing. Each type reveals different system behaviors. Together, they give a complete picture of application performance under various conditions.
Testing Types
Performance testing types and typical durations:
- Load testing (10-30 min): normal expected traffic, peak hour simulation, validating response time SLAs
- Stress testing (< 10 min): beyond normal capacity, finding the breaking point, testing auto-scaling and recovery
- Spike testing (< 5 min): sudden traffic surge, measuring scale-up speed, cold start behavior
- Soak testing (hours): extended period, memory leak detection, gradual degradation
| Type | Purpose | Load Pattern | Duration | Key Metrics |
|---|---|---|---|---|
| Load | Normal expected traffic | Ramp up to target, sustain | 10-30 min | Response time, error rate |
| Stress | Find breaking point | Ramp up until failure | 5-10 min | Max throughput, recovery time |
| Spike | Sudden traffic burst | Immediate high load | 1-5 min | Scale-up latency, cold starts |
| Soak | Long-term stability | Sustained moderate load | 1-24 hours | Memory leak, GC pressure |
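The load patterns in the table can be expressed as k6-style `stages` arrays. The sketch below is one possible mapping, not a recommended profile: `buildStages` is a hypothetical helper, and the multipliers and durations are placeholder assumptions chosen only to fall within the ranges listed above.

```javascript
// Hypothetical helper: returns a k6-style `stages` array for each test type.
// Durations follow the ranges in the table; VU multipliers are placeholders.
function buildStages(type, targetVUs) {
  switch (type) {
    case 'load': // ramp up to target, sustain, ramp down
      return [
        { duration: '5m', target: targetVUs },
        { duration: '20m', target: targetVUs },
        { duration: '5m', target: 0 },
      ];
    case 'stress': // keep ramping past normal capacity
      return [
        { duration: '2m', target: targetVUs },
        { duration: '3m', target: targetVUs * 2 },
        { duration: '3m', target: targetVUs * 4 },
        { duration: '2m', target: 0 },
      ];
    case 'spike': // near-immediate jump to high load
      return [
        { duration: '10s', target: targetVUs * 5 },
        { duration: '2m', target: targetVUs * 5 },
        { duration: '10s', target: 0 },
      ];
    case 'soak': // moderate load held for hours
      return [
        { duration: '5m', target: Math.ceil(targetVUs / 2) },
        { duration: '4h', target: Math.ceil(targetVUs / 2) },
        { duration: '5m', target: 0 },
      ];
    default:
      throw new Error(`Unknown test type: ${type}`);
  }
}

console.log(buildStages('spike', 20));
```

In a k6 script, the returned array would be assigned to `options.stages`.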
k6
Setup
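# Install k6 (multiple methods)
brew install k6 # macOS
sudo apt install k6 # Debian/Ubuntu (after adding the Grafana k6 apt repository)
docker run --rm -i grafana/k6 run - <load-test.js # Docker (script read from stdin)
# k6 is a standalone binary, not an npm package; npm only provides type definitions
npm install -D @types/k6 # optional: editor autocompletion
# Verify
k6 version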
Basic Script
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
stages: [
{ duration: '30s', target: 10 }, // Ramp up to 10 users
{ duration: '1m', target: 10 }, // Stay at 10 users
{ duration: '30s', target: 0 }, // Ramp down
],
};
export default function () {
const res = http.get('https://api.example.com/users');
check(res, {
'status is 200': (r) => r.status === 200,
'response time < 500ms': (r) => r.timings.duration < 500,
});
sleep(1);
}
Running the Test
# Basic run
k6 run load-test.js
# Output with summary
k6 run --summary-export=summary.json load-test.js
# Output as JSON
k6 run --out json=results.json load-test.js
# With Grafana dashboard
k6 run --out influxdb=http://localhost:8086/k6 load-test.js
Advanced Script with Custom Metrics
import http from 'k6/http';
import { check, sleep } from 'k6';
// Custom metric classes live in the k6/metrics module
import { Trend, Rate, Counter, Gauge } from 'k6/metrics';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';
// Custom metrics
const loginDuration = new Trend('login_duration', true);
const searchDuration = new Trend('search_duration', true);
const errorRate = new Rate('error_rate');
const totalOrders = new Counter('total_orders');
const activeUsers = new Gauge('active_users');
export const options = {
stages: [
{ duration: '1m', target: 50 },
{ duration: '3m', target: 50 },
{ duration: '1m', target: 100 },
{ duration: '3m', target: 100 },
{ duration: '1m', target: 0 },
],
thresholds: {
http_req_duration: ['p(95)<800', 'p(99)<1500'],
login_duration: ['p(95)<2000'],
error_rate: ['rate<0.05'],
http_req_failed: ['rate<0.01'],
},
tags: {
environment: 'staging',
// k6 is not Node.js: environment variables are read from __ENV, not process.env
release: __ENV.RELEASE_VERSION || 'latest',
},
};
export default function () {
activeUsers.add(__VU);
const baseUrl = __ENV.BASE_URL || 'https://staging.example.com';
// Login flow
const loginPayload = JSON.stringify({
email: `user-${__VU}@test.com`,
password: 'test-password',
});
const loginRes = http.post(`${baseUrl}/api/auth/login`, loginPayload, {
headers: { 'Content-Type': 'application/json' },
tags: { name: 'login' },
});
loginDuration.add(loginRes.timings.duration);
const loggedIn = check(loginRes, {
'login status 200': (r) => r.status === 200,
'login has token': (r) => r.json('token') !== undefined,
});
// Record successes as well as failures so the Rate metric has a denominator
errorRate.add(!loggedIn);
if (!loggedIn) {
sleep(1);
return;
}
const token = loginRes.json('token');
const authHeaders = {
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json',
},
};
// Search products
const searchRes = http.get(
`${baseUrl}/api/products?q=wireless&page=${randomIntBetween(1, 5)}`,
authHeaders
);
searchDuration.add(searchRes.timings.duration);
check(searchRes, {
'search status 200': (r) => r.status === 200,
'search has results': (r) => r.json('products').length > 0,
});
// Create order (every 3rd iteration)
if (__ITER % 3 === 0) {
const orderRes = http.post(
`${baseUrl}/api/orders`,
JSON.stringify({
productId: 'prod-123',
quantity: 1,
shippingAddress: {
street: '123 Test St',
city: 'Portland',
state: 'OR',
zip: '97201',
},
}),
authHeaders
);
totalOrders.add(1);
check(orderRes, {
'order status 201': (r) => r.status === 201,
'order has id': (r) => r.json('id') !== undefined,
});
}
// randomIntBetween works on integers, so compute a fractional 0.5-2 s pause directly
sleep(0.5 + Math.random() * 1.5);
}
Thresholds and Assertions
Thresholds define pass/fail criteria for performance tests. If a threshold is breached, k6 exits with a non-zero code—useful for CI gating.
export const options = {
thresholds: {
// Response time percentiles
http_req_duration: ['p(50)<200', 'p(95)<500', 'p(99)<1000'],
// Error rate
http_req_failed: ['rate<0.01'],
// Custom metric thresholds
login_duration: ['p(95)<2000', 'max<5000'],
error_rate: ['rate<0.05'],
// Group-level thresholds (k6 prefixes group tag values with "::")
'group_duration{group:::checkout}': ['p(95)<3000'],
// Minimum number of requests the test must complete
http_reqs: ['count>100'],
},
};
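To make the percentile expressions above concrete, here is a small sketch of how a threshold such as `p(95)<500` could be evaluated against raw duration samples. It is not k6's implementation: `percentile` and `checkThreshold` are hypothetical helpers, the sample data is made up, and the sketch uses the nearest-rank method, so k6's own figures may differ slightly.

```javascript
// Nearest-rank percentile over a list of durations in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Evaluates expressions of the form "p(N)<LIMIT" only (a subset of the syntax).
function checkThreshold(expression, samples) {
  const match = /^p\((\d+(?:\.\d+)?)\)\s*<\s*(\d+(?:\.\d+)?)$/.exec(expression.trim());
  if (!match) throw new Error(`Unsupported expression: ${expression}`);
  const observed = percentile(samples, Number(match[1]));
  return { observed, passed: observed < Number(match[2]) };
}

// Made-up durations: nine fast requests and one slow outlier.
const durations = [120, 180, 150, 900, 200, 170, 160, 140, 130, 110];
console.log(checkThreshold('p(50)<200', durations));
console.log(checkThreshold('p(95)<500', durations));
```

With these samples the median threshold passes while the p95 threshold fails, which is the kind of tail-latency problem an average would hide.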
Scenarios
k6 scenarios allow running multiple workload patterns in a single test.
export const options = {
scenarios: {
smoke: {
executor: 'constant-vus',
vus: 1,
duration: '1m',
tags: { type: 'smoke' },
},
load: {
executor: 'ramping-vus',
startVUs: 0,
stages: [
{ duration: '2m', target: 100 },
{ duration: '5m', target: 100 },
{ duration: '2m', target: 0 },
],
startTime: '2m',
tags: { type: 'load' },
},
stress: {
executor: 'ramping-arrival-rate',
startRate: 10,
timeUnit: '1s',
preAllocatedVUs: 50,
maxVUs: 200,
stages: [
{ duration: '2m', target: 200 },
{ duration: '5m', target: 200 },
{ duration: '2m', target: 0 },
],
startTime: '10m',
tags: { type: 'stress' },
},
},
};
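Because each scenario has its own `startTime`, it is easy to lose track of which workloads run at the same moment. The sketch below computes each scenario's time window and lists overlapping pairs. The helpers are hypothetical, only `h`, `m`, and `s` duration units are handled, and graceful stop periods are ignored.

```javascript
// Converts durations such as '30s', '2m', or '1h30m' to seconds.
function toSeconds(duration) {
  const units = { h: 3600, m: 60, s: 1 };
  let total = 0;
  for (const [, value, unit] of duration.matchAll(/(\d+)([hms])/g)) {
    total += Number(value) * units[unit];
  }
  return total;
}

// Returns [{ name, start, end }] in seconds for each scenario.
function scenarioWindows(scenarios) {
  return Object.entries(scenarios).map(([name, s]) => {
    const start = toSeconds(s.startTime || '0s');
    const length = s.stages
      ? s.stages.reduce((sum, stage) => sum + toSeconds(stage.duration), 0)
      : toSeconds(s.duration);
    return { name, start, end: start + length };
  });
}

// Lists every pair of scenarios whose windows intersect.
function findOverlaps(windows) {
  const overlaps = [];
  for (let i = 0; i < windows.length; i++) {
    for (let j = i + 1; j < windows.length; j++) {
      const a = windows[i];
      const b = windows[j];
      if (a.start < b.end && b.start < a.end) overlaps.push([a.name, b.name]);
    }
  }
  return overlaps;
}

// Timing fields copied from the scenario configuration above.
const ramp = [{ duration: '2m' }, { duration: '5m' }, { duration: '2m' }];
const windows = scenarioWindows({
  smoke: { duration: '1m' },
  load: { startTime: '2m', stages: ramp },
  stress: { startTime: '10m', stages: ramp },
});
console.log(windows, findOverlaps(windows));
```

For the configuration above, the load scenario runs from 2m to 11m while stress starts at 10m, so the two overlap for one minute. Whether that is desirable depends on the test's intent; moving the stress `startTime` to `11m` would separate them.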
Data Parameterization
Tests should use realistic data, not the same values for every virtual user.
import http from 'k6/http';
import { check } from 'k6';
import { SharedArray } from 'k6/data';
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js';
// Load test data from CSV
const testData = new SharedArray('users', function () {
const csv = open('/path/to/test-users.csv');
return papaparse.parse(csv, { header: true }).data;
});
export default function () {
const user = testData[__VU % testData.length];
const res = http.post('https://api.example.com/login', {
email: user.email,
password: user.password,
});
check(res, {
'login success': (r) => r.status === 200,
});
}
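The script above maps each virtual user to one row, so every iteration of a given VU reuses the same credentials. If each iteration should get its own row, one option is to derive the index from both the VU number and the iteration counter. The sketch below shows the arithmetic in plain JavaScript; `pickRow` and `maxVUs` are hypothetical names, and in a k6 script the `vu` and `iter` arguments would come from `__VU` and `__ITER`.

```javascript
// Picks a row so that no two (vu, iter) pairs share a row until the data
// set is exhausted. VU numbers start at 1, iteration counters start at 0.
function pickRow(rows, vu, iter, maxVUs) {
  const index = (iter * maxVUs + (vu - 1)) % rows.length;
  return rows[index];
}

const rows = ['a', 'b', 'c', 'd', 'e', 'f'];
for (let iter = 0; iter < 2; iter++) {
  for (let vu = 1; vu <= 3; vu++) {
    console.log(`vu=${vu} iter=${iter} ->`, pickRow(rows, vu, iter, 3));
  }
}
```

Rows stay unique as long as the data set holds at least `maxVUs` times the number of iterations per VU; after that the index wraps around.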
k6 Browser Testing
k6 can run browser-level tests alongside protocol-level requests for realistic load simulation.
// The browser module is 'k6/browser' in k6 v0.52+; older releases used 'k6/experimental/browser'
import { browser } from 'k6/browser';
import { check } from 'k6';

export const options = {
  scenarios: {
    browser_test: {
      executor: 'constant-vus',
      vus: 5,
      duration: '2m',
      exec: 'browserTest',
      options: {
        browser: { type: 'chromium' },
      },
    },
  },
};

export async function browserTest() {
  // The browser API is asynchronous, so every page interaction is awaited
  const page = await browser.newPage();
  try {
    await page.goto('https://test.k6.io/my_messages.php', { waitUntil: 'networkidle' });
    await page.locator('input[name="login"]').type('admin');
    await page.locator('input[name="password"]').type('123');
    const submitButton = page.locator('input[type="submit"]');
    await Promise.all([
      page.waitForNavigation(),
      submitButton.click(),
    ]);
    // Resolve the text first; check() evaluates its predicates synchronously
    const header = await page.locator('h2').textContent();
    check(header, {
      'header is visible': (text) => text === 'Welcome, admin!',
    });
  } finally {
    await page.close();
  }
}
Results Output and Dashboards
# JSON output for custom processing
k6 run --out json=results.json load-test.js
# CSV output
k6 run --out csv=results.csv load-test.js
# InfluxDB for Grafana dashboards
k6 run --out influxdb=http://localhost:8086/k6 load-test.js
# Cloud output (k6 Cloud)
k6 run --out cloud load-test.js
Sample JSON output:
{
"type": "Point",
"data": {
"time": "2026-05-24T10:00:00Z",
"value": 245,
"tags": {
"name": "http_req_duration",
"method": "GET",
"url": "https://api.example.com/users",
"status": "200"
}
}
}
Artillery
Setup
npm install -D artillery
# Quick test: 10 virtual users, 30 requests each
npx artillery quick --count 10 --num 30 https://api.example.com
# Run config file
npx artillery run test-config.yml
Advanced YAML Configuration
# artillery-config.yml
config:
  target: "https://api.example.com"
  phases:
    - duration: 60
      arrivalRate: 5
      name: "Warm up"
    - duration: 120
      arrivalRate: 20
      name: "Sustained load"
    - duration: 30
      arrivalRate: 50
      name: "Stress peak"
    - duration: 60
      arrivalRate: 5
      name: "Recovery check"
  validate: true
  plugins:
    expect: {}
    metrics-by-endpoint:
      enabled: true
    publish-metrics:
      - type: console
      - type: datadog
        # Environment variables are exposed through $processEnvironment
        apiKey: "{{ $processEnvironment.DATADOG_API_KEY }}"
  processor: "./handlers.js"
  variables:
    userIds:
      - "user-001"
      - "user-002"
      - "user-003"
scenarios:
  - name: "Browse and purchase"
    # Run the processor hooks from handlers.js on every request in this scenario
    beforeRequest: "beforeRequestHandler"
    afterResponse: "afterResponseHandler"
    flow:
      - get:
          url: "/api/products"
          capture:
            - json: "$.products[0].id"
              as: "productId"
            - json: "$.products[0].slug"
              as: "slug"
          expect:
            - statusCode: 200
            - contentType: json
      - think: 1
      - post:
          url: "/api/cart"
          json:
            productId: "{{ productId }}"
            quantity: 1
          capture:
            - json: "$.cartId"
              as: "cartId"
          expect:
            - statusCode: 201
      - think: 0.5
      - post:
          url: "/api/checkout"
          json:
            cartId: "{{ cartId }}"
            payment:
              method: "card"
              token: "tok_test_1234"
          # Capture the status field, then compare it with the expect plugin's equals check
          capture:
            - json: "$.status"
              as: "orderStatus"
          expect:
            - statusCode: 200
            - equals:
                - "{{ orderStatus }}"
                - "completed"
Custom Processor
// handlers.js
'use strict';
module.exports = {
beforeRequestHandler: (requestParams, context, ee, next) => {
// Add dynamic timestamp to requests
requestParams.headers = requestParams.headers || {};
requestParams.headers['X-Timestamp'] = String(Date.now());
// Add up to 200 ms of random jitter before the request is sent
const delay = Math.random() * 200;
setTimeout(() => next(), delay);
},
afterResponseHandler: (requestParams, response, context, ee, next) => {
// Log slow requests
if (response.timings.phases.total > 2000) {
console.log(`SLOW: ${requestParams.url} took ${response.timings.phases.total}ms`);
ee.emit('counter', 'slow_requests', 1);
}
return next();
},
};
Running and Reporting
# Run with HTML report
npx artillery run test-config.yml --output report.json
npx artillery report report.json
# Run with tags
npx artillery run test-config.yml --tags "environment:staging,release:v2.1"
# Run multiple workers
npx artillery run test-config.yml --worker 4
Locust
Locust is a Python-based load testing tool ideal for teams in the Python ecosystem.
Setup
pip install locust
Test Script
"""Load test for e-commerce API."""
from itertools import count

from locust import HttpUser, task, between, tag

# Counter that gives each simulated user a distinct login email.
# In distributed mode every worker process has its own counter.
_user_ids = count(1)


class EcommerceUser(HttpUser):
    """Simulated e-commerce user behavior."""

    wait_time = between(0.5, 3)

    def on_start(self):
        """Login on user start."""
        self.user_id = next(_user_ids)
        response = self.client.post("/api/auth/login", json={
            "email": f"user-{self.user_id}@test.com",
            "password": "test-password",
        })
        self.token = response.json().get("token")
        self.headers = {"Authorization": f"Bearer {self.token}"}

    @tag("products", "read")
    @task(3)
    def browse_products(self):
        """Browse product listings."""
        self.client.get("/api/products?page=1&limit=20", headers=self.headers)

    @tag("products", "read")
    @task(1)
    def search_products(self):
        """Search for specific products."""
        self.client.get(
            "/api/products?q=wireless+headphones",
            headers=self.headers,
        )

    @tag("cart", "write")
    @task(2)
    def add_to_cart(self):
        """Add product to shopping cart."""
        self.client.post("/api/cart", json={
            "productId": "prod-123",
            "quantity": 1,
        }, headers=self.headers)

    @tag("checkout", "write")
    @task(1)
    def checkout(self):
        """Complete purchase flow."""
        self.client.post("/api/checkout", json={
            "paymentMethod": "card",
            "shippingAddress": {"street": "123 Test St", "city": "Portland"},
        }, headers=self.headers)
Running
# Web UI mode
locust --host=https://staging.example.com --web-port=8089
# Headless mode (for CI)
locust --host=https://staging.example.com \
--headless \
--users 100 \
--spawn-rate 10 \
--run-time 10m \
--html report.html \
--csv results
Frontend Performance Testing
Lighthouse CI
Frontend performance testing measures real browser rendering, not just API response times.
# Install
npm install -D @lhci/cli
# Run Lighthouse CI
lhci autorun --collect.url=https://example.com \
--collect.numberOfRuns=3 \
--upload.target=temporary-public-storage
# lighthouserc.yml
ci:
  collect:
    numberOfRuns: 3
    staticDistDir: ./dist
    url:
      - /
      - /products
      - /checkout
  assert:
    assertions:
      categories:performance:
        - error
        - minScore: 0.85
      categories:accessibility:
        - warn
        - minScore: 0.90
      categories:seo:
        - warn
        - minScore: 0.90
      budgets:
        - error
  upload:
    target: temporary-public-storage
Playwright Performance Metrics
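import { test, expect } from '@playwright/test';

test('page performance meets budgets', async ({ page }) => {
  await page.goto('/dashboard', { waitUntil: 'load' });

  // LCP, layout-shift, and longtask entries are only exposed through
  // PerformanceObserver, so collect the buffered entries for each type.
  const metrics = await page.evaluate(async () => {
    const buffered = (type: string) =>
      new Promise<any[]>((resolve) => {
        const entries: any[] = [];
        const observer = new PerformanceObserver((list) => entries.push(...list.getEntries()));
        observer.observe({ type, buffered: true });
        // Give buffered entries a moment to be delivered, then stop observing
        setTimeout(() => {
          observer.disconnect();
          resolve(entries);
        }, 500);
      });

    const [lcpEntries, shifts, longTasks] = await Promise.all([
      buffered('largest-contentful-paint'),
      buffered('layout-shift'),
      buffered('longtask'),
    ]);

    return {
      lcp: lcpEntries.at(-1)?.startTime,
      fcp: performance.getEntriesByName('first-contentful-paint')[0]?.startTime,
      // Simplified CLS: sum of all shifts not caused by recent user input
      cls: shifts.filter((s) => !s.hadRecentInput).reduce((sum, s) => sum + s.value, 0),
      // Approximate TBT: time each long task spends beyond the 50 ms budget
      tbt: longTasks.reduce((sum, t) => sum + Math.max(0, t.duration - 50), 0),
    };
  });

  // Assert budgets
  expect(metrics.lcp).toBeLessThan(2500); // 2.5s LCP budget
  expect(metrics.fcp).toBeLessThan(1500); // 1.5s FCP budget
  expect(metrics.cls).toBeLessThan(0.1); // 0.1 CLS budget
  expect(metrics.tbt).toBeLessThan(200); // 200ms TBT budget
});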
Metrics and Targets
| Metric | Description | Good Target | Warning | Critical |
|---|---|---|---|---|
| p50 response time | Median response | < 200ms | 200-500ms | > 500ms |
| p95 response time | 95th percentile | < 500ms | 500-1000ms | > 1000ms |
| p99 response time | 99th percentile | < 1000ms | 1000-2000ms | > 2000ms |
| Error rate | Failed requests | < 0.1% | 0.1-1% | > 1% |
| Throughput | Requests/second | Varies by service | 10-20% drop | > 20% drop |
| CPU usage | Server CPU | < 70% | 70-85% | > 85% |
| Memory usage | Server memory | < 75% | 75-90% | > 90% |
Service-Level Objectives (SLOs)
| Service Type | Latency SLO (p95) | Availability SLO |
|---|---|---|
| Public API | < 500ms | 99.9% |
| Internal API | < 200ms | 99.99% |
| Database | < 50ms | 99.999% |
| Cache | < 5ms | 99.999% |
| File storage | < 1000ms | 99.9% |
| Notification | < 3000ms | 99.5% |
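An availability SLO translates directly into an error budget: the amount of downtime the service may accumulate in a given window. The sketch below does that arithmetic for the availability targets in the table. The function name and the 30-day window are assumptions made for illustration.

```javascript
// Allowed downtime in minutes for an availability SLO over a window of days.
function allowedDowntimeMinutes(availabilityPercent, windowDays = 30) {
  const windowMinutes = windowDays * 24 * 60;
  return windowMinutes * (1 - availabilityPercent / 100);
}

for (const slo of [99.5, 99.9, 99.99, 99.999]) {
  const minutes = allowedDowntimeMinutes(slo);
  console.log(`${slo}% -> ${minutes.toFixed(2)} min of downtime per 30 days`);
}
```

At 99.9%, the budget is roughly 43 minutes per 30 days; each additional nine divides it by ten, which is why the 99.999% targets in the table leave well under a minute.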
CI/CD Integration
GitHub Actions with k6
# .github/workflows/performance.yml
name: Performance Tests
on:
  pull_request:
    paths:
      - 'src/**'
      - 'api/**'
  schedule:
    - cron: '0 6 * * *' # Daily at 06:00 UTC
jobs:
  load-test:
    runs-on: ubuntu-latest
    services:
      app:
        image: ${{ github.repository }}:${{ github.sha }}
        ports:
          - 3000:3000
    steps:
      - uses: actions/checkout@v4
      - name: Install k6
        uses: grafana/setup-k6-action@v1
      - name: Smoke test
        run: k6 run --vus 1 --duration 30s tests/smoke.js
        env:
          BASE_URL: http://localhost:3000
      - name: Load test
        # Skipped on pull requests; runs on the scheduled build of main
        if: github.ref == 'refs/heads/main'
        run: k6 run tests/load.js --out json=k6-results.json
        env:
          BASE_URL: http://localhost:3000
      - name: Check thresholds
        run: |
          k6 run tests/thresholds.js
        env:
          BASE_URL: http://localhost:3000
      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: k6-results
          path: k6-results.json
  lighthouse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npm run build
      - name: Run Lighthouse CI
        run: npx lhci autorun
        env:
          LHCI_GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GitLab CI
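# .gitlab-ci.yml
performance:
  stage: test
  image:
    name: grafana/k6:latest
    # The image's default entrypoint is k6 itself; clear it so script lines run in a shell
    entrypoint: [""]
  script:
    - k6 run tests/load.js --out json=k6-results.json
  artifacts:
    paths:
      - k6-results.json
  only:
    - main
  variables:
    BASE_URL: $CI_ENVIRONMENT_URL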
Performance Gates
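# Performance gating in CI (GitHub Actions job, placed under "jobs:")
performance_gate:
  runs-on: ubuntu-latest
  # Only run when the PERFORMANCE_CRITICAL configuration variable is set to "true"
  if: vars.PERFORMANCE_CRITICAL == 'true'
  steps:
    - uses: actions/checkout@v4
    - uses: grafana/setup-k6-action@v1
    - name: Run critical-path test
      run: k6 run tests/critical.js
      env:
        # The test script must read these through __ENV to build its thresholds
        P95_THRESHOLD: 500
        ERROR_RATE_THRESHOLD: 0.01
    - name: Gate check
      if: failure()
      run: |
        echo "❌ Performance regression detected!"
        echo "Blocking deployment until resolved."
        exit 1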
Performance Budgets
Performance budgets define hard limits for application performance. Exceeding a budget fails the build.
{
"performance-budgets": {
"api": {
"/api/orders": {
"p95": 1000,
"errorRate": 0.01
},
"/api/products": {
"p95": 500,
"errorRate": 0.005
}
},
"frontend": {
"lcp": 2500,
"fcp": 1500,
"cls": 0.1,
"tti": 3000,
"totalBundleSize": 500000
}
}
}
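A budget file like this only helps if something compares it with measured values. The sketch below shows one way that comparison could look for the `api` section. `checkBudgets` is a hypothetical helper and the measurements are made up; in practice they would come from a test run's exported results.

```javascript
// Returns human-readable violations; an empty list means the build may pass.
function checkBudgets(budgets, measured) {
  const violations = [];
  for (const [endpoint, limits] of Object.entries(budgets)) {
    const actual = measured[endpoint];
    if (!actual) {
      violations.push(`${endpoint}: no measurements`);
      continue;
    }
    for (const [metric, limit] of Object.entries(limits)) {
      if (actual[metric] === undefined) {
        violations.push(`${endpoint}: ${metric} not measured`);
      } else if (actual[metric] > limit) {
        violations.push(`${endpoint}: ${metric} ${actual[metric]} exceeds ${limit}`);
      }
    }
  }
  return violations;
}

// Budgets copied from the "api" section above.
const apiBudgets = {
  '/api/orders': { p95: 1000, errorRate: 0.01 },
  '/api/products': { p95: 500, errorRate: 0.005 },
};
// Made-up measurements for illustration.
const measured = {
  '/api/orders': { p95: 870, errorRate: 0.002 },
  '/api/products': { p95: 640, errorRate: 0.001 },
};

const violations = checkBudgets(apiBudgets, measured);
violations.forEach((v) => console.log(v));
// In a CI script, a non-empty list would be turned into a non-zero exit code.
```

Treating a missing measurement as a violation is a deliberate choice here: a budget that silently passes when an endpoint was never exercised gives false confidence.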
Lighthouse Budgets
{
"ci": {
"assert": {
"budgets": [
{
"resourceSizes": [
{ "resourceType": "script", "budget": 300 },
{ "resourceType": "total", "budget": 500 }
],
"resourceCounts": [
{ "resourceType": "script", "budget": 10 },
{ "resourceType": "third-party", "budget": 5 }
],
"timings": [
{ "metric": "interactive", "budget": 5000 },
{ "metric": "first-meaningful-paint", "budget": 2000 }
]
}
]
}
}
}
Analyzing Results
k6 Summary Output
data_received..................: 12 MB 123 kB/s
data_sent......................: 3.4 MB 35 kB/s
http_req_blocked...............: avg=2.1ms min=1µs med=2µs max=452ms
http_req_connecting............: avg=1.2ms min=0s med=0s max=245ms
http_req_duration..............: avg=245ms min=12ms med=198ms max=1890ms
{ expected_response:true }...: avg=245ms min=12ms med=198ms max=1890ms
http_req_failed................: 0.45% ✓ 45 ✗ 9955
http_req_receiving.............: avg=0.1ms min=0s med=0.1ms max=3.5ms
http_req_sending...............: avg=0.02ms min=0s med=0.02ms max=0.5ms
http_req_tls_handshaking.......: avg=0.8ms min=0s med=0s max=189ms
http_req_waiting...............: avg=243ms min=12ms med=196ms max=1887ms
http_reqs......................: 10000 102.5/s
iteration_duration.............: avg=1.25s min=1.01s med=1.2s max=2.9s
iterations.....................: 10000 102.5/s
vus............................: 1 min=1 max=50
vus_max........................: 50 min=50 max=50
Key Patterns to Look For
| Pattern | Indication | Action |
|---|---|---|
| Linear latency increase | Overloaded system | Scale up or optimize bottleneck |
| Error rate spike at threshold | Capacity limit reached | Add auto-scaling or rate limiting |
| Gradual response time growth | Memory leak | Profile heap usage over time |
| High p99 vs p50 | Intermittent stalls (GC, cold starts) | Investigate tail latency sources |
| Connection time growth | Connection pool exhaustion | Increase pool size or reuse connections |
| Throughput plateau | Hardware or software limit | Profile and optimize critical path |
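The "gradual response time growth" pattern is hard to spot by eye in a long soak test. One simple way to quantify it is a least-squares slope over per-window latency values. The sketch below is illustrative only: the data is made up, and the 0.25 ms/min alert level is an arbitrary assumption, not a recommended limit.

```javascript
// Least-squares slope of ys over xs (e.g. p95 latency in ms over elapsed minutes).
function slope(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let numerator = 0;
  let denominator = 0;
  for (let i = 0; i < n; i++) {
    numerator += (xs[i] - meanX) * (ys[i] - meanY);
    denominator += (xs[i] - meanX) ** 2;
  }
  return numerator / denominator;
}

// Made-up soak test data: p95 latency sampled every 10 minutes.
const minutes = [0, 10, 20, 30, 40, 50];
const p95PerWindow = [200, 204, 211, 214, 222, 225];

const growthPerMinute = slope(minutes, p95PerWindow);
const ALERT_MS_PER_MINUTE = 0.25; // arbitrary threshold for this example
console.log(`p95 grows by ${growthPerMinute.toFixed(2)} ms per minute`);
console.log(growthPerMinute > ALERT_MS_PER_MINUTE ? 'possible leak, profile the heap' : 'stable');
```

A steady positive slope under constant load points toward the memory leak row of the table; a slope near zero with occasional spikes points toward tail-latency causes instead.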
Grafana Dashboard Integration
# Run k6 with InfluxDB output
k6 run --out influxdb=http://influxdb:8086/k6 load-test.js
# Or use Prometheus remote write (the endpoint is set through an environment variable)
K6_PROMETHEUS_RW_SERVER_URL=http://prometheus:9090/api/v1/write k6 run --out experimental-prometheus-rw load-test.js
Create Grafana dashboards with panels for:
- Request rate and error rate over time
- Response time percentiles (p50, p95, p99)
- Virtual users and resource utilization
- Custom business metrics (orders, logins, searches)
Load Testing Patterns for Microservices
Service Dependency Matrix
| Service | Dependencies | Critical Path | Load Pattern |
|---|---|---|---|
| API Gateway | Auth, Product, Order | High | Spike (user-facing) |
| Auth Service | Database, Cache | High | Steady (token refresh) |
| Product Service | Database, Search | Medium | Read-heavy |
| Order Service | Database, Payment, Notification | High | Write-heavy burst |
| Notification | Email, SMS, Push | Low | Async queue |
Testing Individual Services
// Test a single service in isolation
import http from 'k6/http';
import { check } from 'k6';
export const options = {
vus: 20,
duration: '5m',
};
export default function () {
// Target only the product service directly
const res = http.get('http://product-service:3000/api/products?page=1');
check(res, {
'product service responds': (r) => r.status === 200,
'under 100ms': (r) => r.timings.duration < 100,
});
}
Best Practices
1. Test in Production-Like Environments
Staging environments often differ significantly from production. Use production snapshots for database size and realistic traffic patterns.
2. Always Set Thresholds
Tests without thresholds provide data but no pass/fail signal. Always define thresholds for CI gating.
// Always set thresholds
thresholds: {
http_req_duration: ['p(95)<500'],
http_req_failed: ['rate<0.01'],
}
3. Parameterize Test Data
Hardcoded test data produces unrealistic results. Use CSV files, data generators, or API calls to create dynamic test data.
4. Warm Up Before Measurement
JIT compilation, connection pools, and cache populations affect initial requests. Include a warm-up phase before collecting metrics.
stages: [
{ duration: '2m', target: 10 }, // Warm up
{ duration: '5m', target: 10 }, // Measure
{ duration: '1m', target: 0 }, // Ramp down
]
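The staged profile keeps warm-up traffic in the same run as the measured traffic, so warm-up samples can still skew aggregate numbers. When post-processing raw samples, one option is to drop everything recorded during the warm-up window. The sketch below assumes a hypothetical sample shape of `{ t, value }`, with `t` in seconds since an arbitrary origin; it is not the format of any particular tool's output.

```javascript
// Keeps only samples recorded at least `warmUpSeconds` after the first sample.
function dropWarmUp(samples, warmUpSeconds) {
  if (samples.length === 0) return [];
  const start = samples.reduce((min, s) => Math.min(min, s.t), Infinity);
  return samples.filter((s) => s.t - start >= warmUpSeconds);
}

function mean(values) {
  return values.reduce((a, b) => a + b, 0) / values.length;
}

// Made-up samples: slow responses while caches and pools warm up, then steady state.
const samples = [
  { t: 0, value: 950 }, { t: 30, value: 700 }, { t: 119, value: 400 },
  { t: 120, value: 210 }, { t: 300, value: 190 }, { t: 420, value: 200 },
];

const steady = dropWarmUp(samples, 120); // 2-minute warm-up, as in the stages above
console.log('all samples mean:', mean(samples.map((s) => s.value)).toFixed(1));
console.log('steady-state mean:', mean(steady.map((s) => s.value)).toFixed(1));
```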
5. Monitor Both Sides
Monitor server-side metrics (CPU, memory, database connections) alongside client-side metrics (response time, error rate). Server metrics explain client metrics.
6. Test Recovery
A system that fails under load is acceptable if it recovers gracefully. Test auto-scaling, circuit breaker reset, and database failover scenarios.
7. Automate Performance Regression Detection
Run performance tests on every PR for critical paths. In addition to fixed thresholds, use statistical comparison against a baseline to detect regressions that stay under the absolute limits.
# k6 has no built-in baseline flag: export a summary per run and compare it with a stored baseline
k6 run --summary-export=current.json load-test.js
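One way to compare a run against a baseline statistically is a Mann-Whitney U test on latency samples from both runs. The sketch below uses the normal approximation without tie correction, so it is only reasonable for moderately large samples. `mannWhitneyZ` and `isRegression` are hypothetical helpers, and the one-sided 5% cutoff (z > 1.645) is an assumed significance level.

```javascript
// Returns the z-score of the Mann-Whitney U statistic for `current`.
// A positive z means current values tend to be larger (slower) than baseline.
function mannWhitneyZ(baseline, current) {
  const all = [
    ...baseline.map((v) => ({ v, isCurrent: false })),
    ...current.map((v) => ({ v, isCurrent: true })),
  ].sort((a, b) => a.v - b.v);

  // Assign 1-based ranks, giving tied values the average of their ranks.
  const ranks = new Array(all.length);
  for (let i = 0; i < all.length; ) {
    let j = i;
    while (j + 1 < all.length && all[j + 1].v === all[i].v) j++;
    const averageRank = (i + j) / 2 + 1;
    for (let k = i; k <= j; k++) ranks[k] = averageRank;
    i = j + 1;
  }

  const n1 = baseline.length;
  const n2 = current.length;
  let rankSumCurrent = 0;
  all.forEach((item, idx) => {
    if (item.isCurrent) rankSumCurrent += ranks[idx];
  });

  const u = rankSumCurrent - (n2 * (n2 + 1)) / 2;
  const meanU = (n1 * n2) / 2;
  const sdU = Math.sqrt((n1 * n2 * (n1 + n2 + 1)) / 12);
  return (u - meanU) / sdU;
}

// One-sided test at roughly the 5% level.
function isRegression(baseline, current) {
  return mannWhitneyZ(baseline, current) > 1.645;
}

// Made-up latency samples in milliseconds.
const baselineRun = [100, 101, 102, 103, 104, 105, 106, 107, 108, 109];
const currentRun = [120, 121, 122, 123, 124, 125, 126, 127, 128, 129];
console.log(mannWhitneyZ(baselineRun, currentRun).toFixed(2), isRegression(baselineRun, currentRun));
```

A rank-based test is used here because latency distributions are usually skewed, which makes mean-based comparisons sensitive to a few outliers.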
8. Use Realistic Think Times
Real users do not hammer endpoints continuously. Use realistic think times between actions.
import { sleep } from 'k6';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';
sleep(randomIntBetween(1, 5)); // Realistic user think time: 1 to 5 whole seconds
Key Takeaways
- Test types matter — Load, stress, spike, and soak test different failure modes
- k6 is scriptable — JavaScript-based, runs in CI, supports browser testing
- Artillery is YAML-first — Easy configuration, good for API-centric testing
- Set thresholds — Every performance test should have pass/fail criteria
- Frontend matters — Lighthouse CI catches rendering performance regressions
- CI integration — Run smoke tests on every PR, full load tests nightly
- Performance budgets — Hard limits prevent regressions from reaching production
- Automate analysis — Compare against baselines, not arbitrary numbers
Resources
- k6 Documentation — Official k6 guide with examples
- Artillery Documentation — YAML-based load testing
- k6 Cloud — Managed load testing infrastructure
- Locust Documentation — Python-based load testing
- Lighthouse CI — Frontend performance in CI
- Web Vitals — Core Web Vitals guide
- Grafana k6 Dashboard — Pre-built performance dashboards
- Playwright Performance Testing — Browser-level metrics
- Performance Budgets — Setting and enforcing budgets