Introduction
Opening a database connection is expensive. A typical PostgreSQL connection takes 50-100ms to establish — time spent on TCP handshake, authentication, session initialization, and memory allocation. For a web application handling 1,000 requests per second, creating a new connection for each request would consume 50-100 seconds of CPU time per second, which is physically impossible.
Connection pooling solves this by maintaining a pool of open, reusable connections. When your application needs a database connection, it borrows one from the pool. When done, it returns the connection to the pool instead of closing it. This reduces connection overhead from 50-100ms to less than 1ms.
Yet most teams misconfigure connection pools, setting arbitrary limits that cause either connection exhaustion (requests waiting for connections) or resource waste (idle connections consuming memory). This guide covers the math, configuration patterns, and common pitfalls.
Why Connection Pooling Matters
The Cost of Database Connections
Each database connection consumes resources on both the application and database server:
Application Side:
- Thread blocked waiting for connection establishment
- Memory for connection object and buffers
- Network socket
Database Side (PostgreSQL example):
- Dedicated backend process (10-20MB memory)
- Shared memory for connection state
- CPU for context switching between processes
# Check PostgreSQL connection memory usage
ps aux | grep postgres | awk '{sum+=$6} END {print "Total memory: " sum/1024 " MB"}'
# With 100 connections, expect 1-2GB memory usage
Performance Impact
import time
import psycopg2
# Without pooling: create connection per request
def query_without_pool():
start = time.time()
conn = psycopg2.connect(
host="localhost",
database="mydb",
user="user",
password="pass"
)
cursor = conn.cursor()
cursor.execute("SELECT 1")
cursor.fetchone()
conn.close()
return time.time() - start
# With pooling: reuse connection
from psycopg2 import pool
connection_pool = pool.SimpleConnectionPool(
minconn=5,
maxconn=20,
host="localhost",
database="mydb",
user="user",
password="pass"
)
def query_with_pool():
start = time.time()
conn = connection_pool.getconn()
cursor = conn.cursor()
cursor.execute("SELECT 1")
cursor.fetchone()
connection_pool.putconn(conn)
return time.time() - start
# Results:
# Without pool: 50-100ms per query
# With pool: 1-5ms per query
Connection Pool Architecture
flowchart TB
A[Application Thread 1] -->|borrow| B[Connection Pool]
C[Application Thread 2] -->|borrow| B
D[Application Thread 3] -->|borrow| B
E[Application Thread 4] -->|wait| B
B -->|connection 1| F[Database]
B -->|connection 2| F
B -->|connection 3| F
style E fill:#ff9999
style B fill:#99ccff
Key Components:
- Pool Manager: Tracks available and in-use connections
- Connection Validation: Tests connections before lending them
- Idle Connection Eviction: Closes connections idle too long
- Connection Leak Detection: Warns when connections aren’t returned
- Wait Queue: Holds threads waiting for connections
Connection Pool Sizing
The most common mistake is setting pool size too high. More connections don’t mean better performance — they mean more context switching and memory usage.
The Formula
For a single application instance:
pool_size = ((core_count * 2) + effective_spindle_count)
Example: 4-core server with SSD (spindle count = 1)
pool_size = (4 * 2) + 1 = 9 connections
For PostgreSQL specifically, a good starting point is:
pool_size = (number_of_cpu_cores * 2) + 1
Why Not More?
PostgreSQL uses a process-per-connection model. With 100 active connections on a 4-core server, the kernel spends more time context switching than executing queries.
-- Check active connections
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';
-- If this number consistently exceeds (cores * 2), you have too many connections
Multi-Instance Scaling
With multiple application instances, the total connections to the database is:
total_connections = instances * pool_size_per_instance
Example: 10 application instances, 10 connections each = 100 total database connections
If your database can’t handle this, use a connection pooler like PgBouncer between applications and database.
Application-Level Pooling
HikariCP (Java)
HikariCP is the fastest JDBC connection pool, used by default in Spring Boot 2.0+.
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
public class DatabaseConfig {
public static HikariDataSource createDataSource() {
HikariConfig config = new HikariConfig();
// Connection settings
config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb");
config.setUsername("user");
config.setPassword("password");
// Pool sizing
config.setMaximumPoolSize(10); // Max connections
config.setMinimumIdle(5); // Min idle connections
// Connection timeout
config.setConnectionTimeout(30000); // 30s wait for connection
config.setIdleTimeout(600000); // 10min idle before eviction
config.setMaxLifetime(1800000); // 30min max connection lifetime
// Connection testing
config.setConnectionTestQuery("SELECT 1");
config.setValidationTimeout(5000); // 5s validation timeout
// Leak detection
config.setLeakDetectionThreshold(60000); // Warn if connection held >60s
// Performance tuning
config.addDataSourceProperty("cachePrepStmts", "true");
config.addDataSourceProperty("prepStmtCacheSize", "250");
config.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
return new HikariDataSource(config);
}
}
// Usage
public class UserRepository {
private final HikariDataSource dataSource;
public User findById(Long id) throws SQLException {
// Borrow connection from pool
try (Connection conn = dataSource.getConnection();
PreparedStatement stmt = conn.prepareStatement(
"SELECT * FROM users WHERE id = ?")) {
stmt.setLong(1, id);
ResultSet rs = stmt.executeQuery();
if (rs.next()) {
return new User(
rs.getLong("id"),
rs.getString("name"),
rs.getString("email")
);
}
return null;
}
// Connection automatically returned to pool
}
}
Spring Boot Configuration
# application.yml
spring:
datasource:
url: jdbc:postgresql://localhost:5432/mydb
username: user
password: password
hikari:
maximum-pool-size: 10
minimum-idle: 5
connection-timeout: 30000
idle-timeout: 600000
max-lifetime: 1800000
connection-test-query: SELECT 1
leak-detection-threshold: 60000
# PostgreSQL-specific optimizations
data-source-properties:
cachePrepStmts: true
prepStmtCacheSize: 250
prepStmtCacheSqlLimit: 2048
useServerPrepStmts: true
Python with psycopg2
from psycopg2 import pool
from contextlib import contextmanager
import logging
class DatabasePool:
def __init__(self, minconn=5, maxconn=20):
self.pool = pool.ThreadedConnectionPool(
minconn=minconn,
maxconn=maxconn,
host="localhost",
database="mydb",
user="user",
password="password",
# Connection options
connect_timeout=10,
options="-c statement_timeout=30000" # 30s query timeout
)
self.logger = logging.getLogger(__name__)
@contextmanager
def get_connection(self):
"""Context manager for safe connection handling"""
conn = None
try:
conn = self.pool.getconn()
yield conn
except pool.PoolError as e:
self.logger.error(f"Pool exhausted: {e}")
raise
finally:
if conn:
self.pool.putconn(conn)
def close_all(self):
"""Close all connections in pool"""
self.pool.closeall()
# Usage
db_pool = DatabasePool(minconn=5, maxconn=20)
def get_user(user_id):
with db_pool.get_connection() as conn:
with conn.cursor() as cursor:
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
return cursor.fetchone()
# Cleanup on shutdown
import atexit
atexit.register(db_pool.close_all)
Node.js with pg
const { Pool } = require('pg');
const pool = new Pool({
host: 'localhost',
database: 'mydb',
user: 'user',
password: 'password',
// Pool configuration
max: 20, // Maximum connections
min: 5, // Minimum idle connections
idleTimeoutMillis: 600000, // 10min idle timeout
connectionTimeoutMillis: 30000, // 30s connection timeout
// Connection options
statement_timeout: 30000, // 30s query timeout
query_timeout: 30000,
// Error handling
allowExitOnIdle: true
});
// Connection error handling
pool.on('error', (err, client) => {
console.error('Unexpected error on idle client', err);
process.exit(-1);
});
// Usage with async/await
async function getUser(userId) {
const client = await pool.connect();
try {
const result = await client.query(
'SELECT * FROM users WHERE id = $1',
[userId]
);
return result.rows[0];
} finally {
client.release(); // Return to pool
}
}
// Or use pool.query() for simple queries
async function getUserSimple(userId) {
const result = await pool.query(
'SELECT * FROM users WHERE id = $1',
[userId]
);
return result.rows[0];
}
// Graceful shutdown
process.on('SIGTERM', async () => {
await pool.end();
process.exit(0);
});
Server-Side Pooling
PgBouncer
PgBouncer is a lightweight connection pooler that sits between your application and PostgreSQL. It’s essential for microservices architectures where many application instances connect to one database.
flowchart LR
A[App Instance 1<br/>10 connections] --> B[PgBouncer<br/>100 client connections<br/>20 server connections]
C[App Instance 2<br/>10 connections] --> B
D[App Instance 3<br/>10 connections] --> B
E[App Instance 4<br/>10 connections] --> B
F[App Instance 5<br/>10 connections] --> B
B --> G[PostgreSQL<br/>20 connections]
Configuration (pgbouncer.ini):
[databases]
mydb = host=localhost port=5432 dbname=mydb
[pgbouncer]
# Connection limits
max_client_conn = 1000 # Max client connections
default_pool_size = 20 # Connections per database
reserve_pool_size = 5 # Emergency reserve
reserve_pool_timeout = 3 # Wait 3s for reserve pool
# Pooling mode
pool_mode = transaction # transaction | session | statement
# Connection lifetime
server_lifetime = 3600 # Close server connection after 1h
server_idle_timeout = 600 # Close idle server connection after 10min
# Authentication
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
# Logging
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
# Performance
max_prepared_statements = 100
Pooling Modes:
-
Session Mode: Client gets same server connection for entire session
- Use for: Applications that use session-level features (temp tables, prepared statements)
- Pooling efficiency: Low
-
Transaction Mode: Server connection returned after each transaction
- Use for: Most web applications
- Pooling efficiency: High
- Limitation: Can’t use session-level features
-
Statement Mode: Server connection returned after each statement
- Use for: Simple read-only queries
- Pooling efficiency: Highest
- Limitation: No transactions
Deployment:
# Install PgBouncer
sudo apt-get install pgbouncer
# Start PgBouncer
pgbouncer -d /etc/pgbouncer/pgbouncer.ini
# Connect through PgBouncer (port 6432 by default)
psql -h localhost -p 6432 -U user mydb
# Monitor PgBouncer
psql -h localhost -p 6432 -U pgbouncer pgbouncer
SHOW POOLS;
SHOW STATS;
SHOW SERVERS;
pgpool-II
pgpool-II provides connection pooling plus load balancing, replication, and query caching.
# pgpool.conf
listen_addresses = '*'
port = 5432
# Connection pooling
num_init_children = 32 # Max concurrent connections
max_pool = 4 # Connections per child process
child_life_time = 300 # Child process lifetime (5min)
connection_life_time = 0 # Connection lifetime (0 = unlimited)
client_idle_limit = 0 # Client idle timeout (0 = unlimited)
# Load balancing
load_balance_mode = on
backend_hostname0 = 'db1.example.com'
backend_port0 = 5432
backend_weight0 = 1
backend_hostname1 = 'db2.example.com'
backend_port1 = 5432
backend_weight1 = 1
# Health checking
health_check_period = 10 # Check every 10s
health_check_timeout = 20 # Timeout after 20s
health_check_max_retries = 3 # Retry 3 times
Best Practices
1. Size Pools Correctly
Don’t: Set pool size to 100 “just to be safe”
# Bad: Way too many connections
hikari:
maximum-pool-size: 100
Do: Calculate based on cores and measure
# Good: Sized for 4-core server
hikari:
maximum-pool-size: 10
minimum-idle: 5
2. Set Connection Timeouts
Don’t: Wait forever for a connection
// Bad: No timeout, threads block indefinitely
config.setConnectionTimeout(0);
Do: Fail fast with reasonable timeout
// Good: Fail after 30s
config.setConnectionTimeout(30000);
// Even better: Fail fast and alert
config.setConnectionTimeout(5000); // 5s timeout
3. Validate Connections
Don’t: Assume connections are always valid
// Bad: No validation
config.setConnectionTestQuery(null);
Do: Test connections before use
// Good: Lightweight validation query
config.setConnectionTestQuery("SELECT 1");
config.setValidationTimeout(5000);
4. Detect Connection Leaks
Don’t: Let connections leak silently
// Bad: No leak detection
config.setLeakDetectionThreshold(0);
Do: Warn about long-held connections
// Good: Warn if connection held >60s
config.setLeakDetectionThreshold(60000);
// Log the stack trace of where connection was obtained
config.setRegisterMbeans(true);
5. Set Maximum Connection Lifetime
Don’t: Keep connections open forever
// Bad: Connections never recycled
config.setMaxLifetime(0);
Do: Recycle connections periodically
// Good: Recycle after 30min
config.setMaxLifetime(1800000);
// Should be less than database's connection timeout
// PostgreSQL default: 10 hours
6. Monitor Pool Metrics
import com.zaxxer.hikari.HikariPoolMXBean;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import java.lang.management.ManagementFactory;
public class PoolMonitor {
public void logPoolStats(HikariDataSource dataSource) {
HikariPoolMXBean poolBean = dataSource.getHikariPoolMXBean();
System.out.println("Active connections: " + poolBean.getActiveConnections());
System.out.println("Idle connections: " + poolBean.getIdleConnections());
System.out.println("Total connections: " + poolBean.getTotalConnections());
System.out.println("Threads waiting: " + poolBean.getThreadsAwaitingConnection());
}
}
Common Pitfalls
1. Pool Size Too Large
Problem: 50 application instances, each with 20 connections = 1,000 database connections
# Bad: Each instance has 20 connections
hikari:
maximum-pool-size: 20
Solution: Use PgBouncer to multiplex connections
# Good: Each instance has 5 connections
hikari:
maximum-pool-size: 5
# PgBouncer multiplexes to 50 database connections
2. Not Returning Connections
Problem: Forgetting to close connections causes pool exhaustion
// Bad: Connection never returned
public User getUser(Long id) throws SQLException {
Connection conn = dataSource.getConnection();
PreparedStatement stmt = conn.prepareStatement("SELECT * FROM users WHERE id = ?");
stmt.setLong(1, id);
ResultSet rs = stmt.executeQuery();
// Connection leaked!
return mapUser(rs);
}
Solution: Use try-with-resources
// Good: Connection automatically returned
public User getUser(Long id) throws SQLException {
try (Connection conn = dataSource.getConnection();
PreparedStatement stmt = conn.prepareStatement("SELECT * FROM users WHERE id = ?")) {
stmt.setLong(1, id);
ResultSet rs = stmt.executeQuery();
return mapUser(rs);
}
}
3. Mixing Pooling Layers
Problem: Application pool + PgBouncer + pgpool-II = confusion
App (20 connections) → PgBouncer (100 connections) → pgpool-II (50 connections) → PostgreSQL
Solution: Choose one pooling strategy
# Option 1: Application-level only (small scale)
App (10 connections) → PostgreSQL
# Option 2: Application + PgBouncer (microservices)
App (5 connections) → PgBouncer (20 connections) → PostgreSQL
# Option 3: PgBouncer only (legacy apps)
App (no pool) → PgBouncer (20 connections) → PostgreSQL
4. Ignoring Connection Validation
Problem: Stale connections cause query failures
# Bad: No validation
conn = pool.getconn()
cursor = conn.cursor()
cursor.execute("SELECT * FROM users") # Fails if connection died
Solution: Validate before use or handle errors
# Good: Validate connection
conn = pool.getconn()
try:
cursor = conn.cursor()
cursor.execute("SELECT 1") # Validate
cursor.execute("SELECT * FROM users")
except psycopg2.OperationalError:
# Connection dead, get new one
pool.putconn(conn, close=True)
conn = pool.getconn()
cursor = conn.cursor()
cursor.execute("SELECT * FROM users")
5. Not Monitoring Pool Exhaustion
Problem: Pool exhaustion causes request timeouts, but you don’t know why
Solution: Export pool metrics to monitoring system
// Micrometer metrics for HikariCP
@Configuration
public class MetricsConfig {
@Bean
public MeterBinder hikariMetrics(HikariDataSource dataSource) {
return new HikariDataSourceMetrics(dataSource, "hikari", Collections.emptyList());
}
}
// Metrics available:
// - hikaricp.connections.active
// - hikaricp.connections.idle
// - hikaricp.connections.pending
// - hikaricp.connections.timeout
Testing Connection Pools
Load Testing
import concurrent.futures
import time
from database_pool import DatabasePool
def query_user(user_id):
start = time.time()
try:
with db_pool.get_connection() as conn:
with conn.cursor() as cursor:
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
cursor.fetchone()
return time.time() - start, None
except Exception as e:
return time.time() - start, str(e)
# Simulate 100 concurrent requests
db_pool = DatabasePool(minconn=5, maxconn=20)
with concurrent.futures.ThreadPoolExecutor(max_workers=100) as executor:
futures = [executor.submit(query_user, i % 1000) for i in range(100)]
results = [f.result() for f in futures]
# Analyze results
times = [r[0] for r in results if r[1] is None]
errors = [r[1] for r in results if r[1] is not None]
print(f"Successful queries: {len(times)}")
print(f"Failed queries: {len(errors)}")
print(f"Average time: {sum(times) / len(times):.3f}s")
print(f"Max time: {max(times):.3f}s")
Resources
- HikariCP GitHub — Fast JDBC connection pool
- PgBouncer Documentation — Lightweight PostgreSQL connection pooler
- pgpool-II Documentation — PostgreSQL connection pooler with load balancing
- PostgreSQL Connection Pooling — Official PostgreSQL connection settings
- Azure: Connection Pooling Best Practices — Cloud-specific guidance
Comments