Introduction
An API gateway is a reverse proxy that sits between clients and backend services. It centralizes cross-cutting concerns (routing, authentication, rate limiting, logging, and request transformation) so individual services don’t have to reimplement them. In a microservices architecture, the gateway becomes the single entry point clients talk to, while backend services remain isolated and focused on business logic.
Without a gateway, every client must know the locations of every service, handle authentication independently, and implement retry/circuit-breaking logic. A gateway eliminates this duplication and gives operations teams a single control plane for traffic management.
Core Gateway Functions
Request Routing
A gateway routes incoming requests to the correct backend service based on attributes of the request: the URL path, HTTP method, headers, or even the request body. Modern gateways support several routing strategies:
- Prefix routing: `/api/users/*` matches any subpath under `/api/users`
- Exact path routing: `/health` only matches that exact path
- Header-based routing: route based on `X-API-Version: v2`
- Method-based routing: `GET /api/orders` vs `POST /api/orders` hit different backends
- Query-parameter routing: route by `?region=eu-west`
// Prefix-based router that proxies matched requests upstream
package gateway

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

type Route struct {
	Prefix  string
	Target  *url.URL
	Methods []string
}

type Gateway struct {
	routes []Route
}

func (g *Gateway) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	for _, route := range g.routes {
		if !strings.HasPrefix(r.URL.Path, route.Prefix) {
			continue
		}
		if !methodAllowed(r.Method, route.Methods) {
			http.Error(w, `{"error":"method not allowed"}`, http.StatusMethodNotAllowed)
			return
		}
		proxy := httputil.NewSingleHostReverseProxy(route.Target)
		proxy.ServeHTTP(w, r)
		return
	}
	http.Error(w, `{"error":"no matching route"}`, http.StatusNotFound)
}

func methodAllowed(method string, allowed []string) bool {
	if len(allowed) == 0 {
		return true // no method restriction configured
	}
	for _, m := range allowed {
		if strings.EqualFold(m, method) {
			return true
		}
	}
	return false
}
Load Balancing
Gateways distribute requests across healthy backend instances. Common algorithms include round-robin, least-connections, and IP-hash for session persistence. Health checks (active or passive) remove unhealthy targets from the pool.
# Traefik dynamic configuration with load balancing
http:
  routers:
    user-api:
      rule: "PathPrefix(`/api/users`)"
      service: user-service
      middlewares:
        - auth-jwt
        - rate-limit
  services:
    user-service:
      loadBalancer:
        servers:
          - url: "http://10.0.1.10:8080"
          - url: "http://10.0.1.11:8080"
          - url: "http://10.0.1.12:8080"
        healthCheck:
          path: "/health"
          interval: "10s"
          timeout: "3s"
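The round-robin strategy described above can be sketched in a few lines of Go. This is an illustrative `RoundRobin` pool, not Traefik's implementation; the `healthy` map stands in for state that an active or passive health checker would maintain.

```go
package main

import (
	"fmt"
	"sync"
)

// RoundRobin cycles through backend targets, skipping any marked unhealthy.
// In a real gateway the healthy map would be updated by a health checker.
type RoundRobin struct {
	mu      sync.Mutex
	targets []string
	healthy map[string]bool
	next    int
}

func NewRoundRobin(targets []string) *RoundRobin {
	h := make(map[string]bool, len(targets))
	for _, t := range targets {
		h[t] = true // assume healthy until a check fails
	}
	return &RoundRobin{targets: targets, healthy: h}
}

// Pick returns the next healthy target, or "" if none are available.
func (rr *RoundRobin) Pick() string {
	rr.mu.Lock()
	defer rr.mu.Unlock()
	for i := 0; i < len(rr.targets); i++ {
		t := rr.targets[rr.next]
		rr.next = (rr.next + 1) % len(rr.targets)
		if rr.healthy[t] {
			return t
		}
	}
	return ""
}

func main() {
	rr := NewRoundRobin([]string{"10.0.1.10:8080", "10.0.1.11:8080", "10.0.1.12:8080"})
	rr.healthy["10.0.1.11:8080"] = false // pretend a health check failed
	for i := 0; i < 4; i++ {
		fmt.Println(rr.Pick()) // alternates between .10 and .12, skipping .11
	}
}
```

Least-connections works the same way structurally, except `Pick` selects the target with the fewest in-flight requests instead of advancing a cursor.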
Authentication Methods
JWT (JSON Web Tokens)
The gateway validates JWT tokens on every request before forwarding to the backend. This keeps auth logic out of individual services. The gateway extracts the token from the Authorization header, verifies the signature using a shared secret or public key, and optionally injects claims into downstream headers.
import jwt
import os
from datetime import datetime, timedelta, timezone

JWT_SECRET = os.environ.get("JWT_SECRET", "change-me-in-production")
JWT_ALGORITHM = "HS256"

def create_token(user_id: str, roles: list[str]) -> str:
    """Issue a signed JWT with expiry and custom claims."""
    now = datetime.now(timezone.utc)
    payload = {
        "sub": user_id,
        "roles": roles,
        "iat": now,
        "exp": now + timedelta(hours=1),
    }
    return jwt.encode(payload, JWT_SECRET, algorithm=JWT_ALGORITHM)

def verify_gateway_token(token: str) -> dict | None:
    """Validate and decode a JWT. Returns claims or None on failure."""
    try:
        return jwt.decode(
            token,
            JWT_SECRET,
            algorithms=[JWT_ALGORITHM],
            options={"require": ["exp", "sub"]},
        )
    except jwt.ExpiredSignatureError:
        return None
    except jwt.InvalidTokenError:
        return None
OAuth2 / OpenID Connect
For delegated authorization, the gateway acts as an OAuth2 client or proxy. It redirects unauthenticated users to the identity provider, handles the callback, and sets a session cookie or JWT. Tools like OAuth2 Proxy integrate directly with Kong and Traefik.
# Kong OAuth2 plugin configuration
plugins:
  - name: oauth2
    service: user-service
    config:
      scopes: ["profile", "email"]
      mandatory_scope: true
      token_expiration: 7200
      enable_authorization_code: true
      enable_client_credentials: true
      provision_key: "your-provision-key"
API Keys
API keys are the simplest auth mechanism: the gateway checks a static key in the header or query string against an allowlist. Use API keys for service-to-service communication or public-facing SDKs where JWT complexity is unwarranted.
func apiKeyMiddleware(allowedKeys map[string]string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			key := r.Header.Get("X-API-Key")
			if key == "" {
				key = r.URL.Query().Get("api_key")
			}
			if key == "" {
				http.Error(w, `{"error":"missing api key"}`, http.StatusUnauthorized)
				return
			}
			owner, ok := allowedKeys[key]
			if !ok {
				http.Error(w, `{"error":"invalid api key"}`, http.StatusForbidden)
				return
			}
			r.Header.Set("X-API-Key-Owner", owner)
			next.ServeHTTP(w, r)
		})
	}
}
Kong JWT Route Registration
# Register a service and protect it with JWT authentication
$ curl -s -X POST http://localhost:8001/services \
--data name=user-service \
--data url=http://user-service:8080
$ curl -s -X POST http://localhost:8001/services/user-service/routes \
--data name=user-route \
--data paths[]=/api/users
$ curl -s -X POST http://localhost:8001/services/user-service/plugins \
--data name=jwt
# Create a consumer and issue a JWT credential
$ curl -s -X POST http://localhost:8001/consumers \
--data username=alice
$ curl -s -X POST http://localhost:8001/consumers/alice/jwt \
--data algorithm=HS256 \
--data secret=my-secret-key
# Test the protected endpoint
$ curl -s -o /dev/null -w "%{http_code}" http://localhost:8000/api/users
# => 401
# Include a valid JWT
$ curl -s -H "Authorization: Bearer <jwt-token>" http://localhost:8000/api/users
# => 200
Rate Limiting Algorithms
Token Bucket
The token bucket algorithm maintains a bucket that fills at a constant rate (e.g., 10 tokens per second). Each request consumes one token. If the bucket is empty, the request is rejected. Bursts are allowed up to the bucket capacity.
import time
import threading

class TokenBucket:
    """Thread-safe token bucket rate limiter."""

    def __init__(self, capacity: int, fill_rate: float):
        self.capacity = capacity
        self.tokens = capacity
        self.fill_rate = fill_rate
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def _refill(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.fill_rate)
        self.last_refill = now

    def allow(self) -> bool:
        with self.lock:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

limiter = TokenBucket(capacity=10, fill_rate=1.0)
for _ in range(12):
    if limiter.allow():
        print("✅ request allowed")
    else:
        print("❌ rate limited")
Leaky Bucket
The leaky bucket treats requests like water poured into a bucket with a hole at the bottom. The bucket processes requests at a fixed rate regardless of incoming burst size. Excess requests overflow and are dropped. This enforces a strict processing rate with no bursting.
import time

class LeakyBucket:
    """Leaky bucket that processes requests at a fixed rate."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate  # requests per second leaked out
        self.capacity = capacity
        self.water = 0.0
        self.last_leak = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain water at the fixed rate since the last check.
        self.water = max(0.0, self.water - (now - self.last_leak) * self.rate)
        self.last_leak = now
        if self.water < self.capacity:
            self.water += 1
            return True
        return False
Sliding Window Log
The sliding window algorithm maintains a log of timestamps for each request. On every request, it removes entries older than the window (e.g., 60 seconds) and counts the remaining entries. If the count exceeds the limit, the request is rejected. This is more accurate than fixed-window counters because it avoids boundary spikes.
from collections import defaultdict, deque
import time

class SlidingWindowLog:
    """Sliding window log rate limiter, tracked per client ID."""

    def __init__(self, window_seconds: int = 60, max_requests: int = 100):
        self.window = window_seconds
        self.max_req = max_requests
        self.clients: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        cutoff = now - self.window
        timestamps = self.clients[client_id]
        # Prune entries that have slid out of the window (O(1) per popleft)
        while timestamps and timestamps[0] < cutoff:
            timestamps.popleft()
        if len(timestamps) >= self.max_req:
            return False
        timestamps.append(now)
        return True
Traefik Rate Limit Middleware
# Traefik rate limiting with token bucket semantics
http:
  middlewares:
    api-rate-limit:
      rateLimit:
        average: 100
        burst: 50
        period: 1m
        sourceCriterion:
          ipStrategy:
            depth: 1
  routers:
    api:
      rule: "PathPrefix(`/api`)"
      middlewares:
        - api-rate-limit
      service: backend
Request and Response Transformation
Gateways can modify requests before forwarding them to backends and responses before returning them to clients. Common transformations include:
- Header injection: Add `X-Request-Id`, `X-User-ID`, or correlation headers
- Path rewriting: `/api/v2/users` → `/users`
- Response aggregation: Combine multiple upstream responses into a single response
- Protocol translation: Accept HTTP/1.1 and forward to gRPC backends
- Response compression: Add gzip/brotli encoding
# Kong request transformer plugin
plugins:
  - name: request-transformer
    service: user-service
    config:
      add:
        headers:
          - "X-Gateway-Version: 1.0"
          - "X-Forwarded-Proto: https"
      rename:
        headers:
          - "Authorization: X-Upstream-Auth"
      append:
        querystring:
          - "source: gateway"
  - name: response-transformer
    service: user-service
    config:
      add:
        headers:
          - "X-Response-Time: ${latency}"
      remove:
        headers:
          - "X-Internal-Trace"
# Test header injection via curl
$ curl -s -D- http://localhost:8000/api/users/me \
-H "Authorization: Bearer <token>" | head -n 20
# Expected response headers include:
# X-Gateway-Version: 1.0
# X-Request-Id: abc-123-def
# Content-Type: application/json
Popular API Gateway Comparison
| Feature | Kong | Traefik | AWS API Gateway | Envoy |
|---|---|---|---|---|
| Type | Proxy (NGINX-based) | Reverse proxy (Go) | Managed (AWS) | Sidecar/proxy (C++) |
| Deployment | Standalone / DB-less | Binary / K8s Ingress | Fully managed | Sidecar / Gateway |
| Routing | Path, header, method, regex | Path, host, header, query | Path, method, stage | Path, header, method, weight |
| Auth plugins | JWT, OAuth2, OIDC, LDAP, HMAC | JWT, OIDC, Basic, ForwardAuth | Cognito, Lambda, IAM, JWT | JWT, OAuth2, ext-authz |
| Rate limiting | Token bucket, sliding window | Token bucket | Token bucket, burst | Per-route, regional |
| Health checks | Active + passive | Active + passive | Route53 + CloudWatch | Active + passive + outlier |
| Service mesh | Via Kuma | Native mesh (K8s CRDs) | N/A | Istio, Consul, AWS Mesh |
| Performance | ~5k req/s per core | ~10k req/s per core | Scales automatically | ~15k req/s per core |
| License | Apache 2.0 (OSS) | MIT | Proprietary | Apache 2.0 |
| Best for | Enterprise API management | Cloud-native / K8s | Serverless / AWS stack | High-performance mesh |
Security Considerations
A gateway is a security boundary. Misconfiguring it exposes your entire backend. Follow these practices:
- TLS termination: Always terminate TLS at the gateway. Never forward plain HTTP to backends unless they’re on the same private subnet.
- Request validation: Validate Content-Type, Content-Length, and reject malformed payloads before they reach services.
func validationMiddleware(maxBodyBytes int64) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if r.ContentLength > maxBodyBytes {
				http.Error(w, `{"error":"request too large"}`, http.StatusRequestEntityTooLarge)
				return
			}
			// Allow a charset suffix, e.g. "application/json; charset=utf-8".
			ct := r.Header.Get("Content-Type")
			switch r.Method {
			case http.MethodPost, http.MethodPut, http.MethodPatch:
				if !strings.HasPrefix(ct, "application/json") {
					http.Error(w, `{"error":"unsupported media type"}`, http.StatusUnsupportedMediaType)
					return
				}
			}
			next.ServeHTTP(w, r)
		})
	}
}
- IP allow/deny lists: Restrict access to admin endpoints by source IP range.
- CORS: Configure strict origin, method, and header allowlists. Do not use `Access-Control-Allow-Origin: *` in production.
- Rate limit by endpoint: Apply different limits per route; login endpoints get a stricter limit than read-only GET endpoints.
# Kong IP restriction plugin
$ curl -s -X POST http://localhost:8001/services/admin-api/plugins \
--data name=ip-restriction \
--data config.allow[]="10.0.0.0/8" \
--data config.allow[]="172.16.0.0/12"
# Response
# => {"config":{"allow":["10.0.0.0/8","172.16.0.0/12"]},"enabled":true,...}
Performance Optimization
The gateway is in the critical path of every request. Optimize it ruthlessly:
- Connection pooling: Reuse upstream connections instead of opening new ones per request. Kong and Envoy pool connections by default.
- TLS session resumption: Enable TLS session tickets and session IDs to reduce handshake overhead.
- Caching: Cache idempotent responses (GET /api/products) at the gateway level with a short TTL.
- Timeouts: Set connect, read, and write timeouts to prevent slow upstreams from consuming gateway resources.
# Kong proxy performance tuning
env:
  KONG_PROXY_LISTEN: "0.0.0.0:8000"
  KONG_UPSTREAM_KEEPALIVE_POOL_SIZE: "256"
  KONG_UPSTREAM_KEEPALIVE_MAX_REQUESTS: "1000"
  KONG_UPSTREAM_KEEPALIVE_IDLE_TIMEOUT: "60"
  KONG_NGINX_PROXY_CONNECT_TIMEOUT: "5s"
  KONG_NGINX_PROXY_READ_TIMEOUT: "30s"
  KONG_NGINX_PROXY_SEND_TIMEOUT: "30s"
Deployment Patterns
Per-Team Gateway
Each team deploys their own gateway instance for their microservices. Teams own their routing and auth. This scales well but duplicates infrastructure.
Shared Gateway (Centralized)
A single gateway instance routes to all backend services across the organization. The operations team owns the gateway. This centralizes control but creates a single point of failure and a deployment bottleneck.
Sidecar Gateway (Service Mesh)
Each service instance runs a gateway sidecar (Envoy) that handles inter-service communication. An external gateway (also Envoy) handles ingress. This is the Istio/Linkerd pattern: full control with fine-grained per-service policies.
Kong Ingress Controller on Kubernetes
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-gateway
spec:
  ingressClassName: kong
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /api/users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 8080
          - path: /api/orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 8080
Conclusion
API gateways centralize cross-cutting concerns (routing, authentication, rate limiting, and transformation), reducing duplication across microservices. Choose Kong for enterprise API management with a rich plugin ecosystem, Traefik for cloud-native Kubernetes environments, AWS API Gateway for serverless architectures, and Envoy for high-performance service mesh deployments. Regardless of which tool you select, implement rate limiting, authentication, request validation, and TLS termination at the gateway layer to keep backend services focused on business logic.
Resources
- Kong Gateway Documentation
- Traefik Proxy Documentation
- AWS API Gateway Developer Guide
- Envoy Proxy Architecture Overview
- NGINX as an API Gateway
- OAuth2 Proxy Integration
- Kong Plugin Hub