PostgreSQL: The Complete Guide to the World's Most Advanced Open Source Database

Introduction

PostgreSQL, often called “Postgres,” is the world’s most advanced open-source relational database管理系统. With over 35 years of development, PostgreSQL has evolved from a university research project into a robust, feature-rich database that powers applications ranging from small startups to enterprise giants like Uber, Netflix, and Instagram.

In 2026, PostgreSQL continues to lead the database landscape with support for JSON, vectors, full-text search, geospatial data, and enterprise-grade features. This comprehensive guide covers everything you need to get started with PostgreSQL.

What is PostgreSQL?

PostgreSQL is an object-relational database management system (ORDBMS) that emphasizes extensibility and SQL compliance. Unlike simple key-value stores, PostgreSQL provides:

ACID Compliance: Atomicity, Consistency, Isolation, Durability
Rich Data Types: From basic integers to JSON, arrays, and custom types
Advanced Indexing: Multiple index types for different use cases
Extensibility: User-defined functions, types, and operators
Concurrency: Multi-Version Concurrency Control (MVCC)
Replication: Both physical and logical replication

PostgreSQL vs MySQL

Feature	PostgreSQL	MySQL
Data Types	More (JSON, array, range, etc.)	Fewer
Index Types	B-tree, Hash, GiST, GIN, BRIN, etc.	B-tree, Hash, Full-text
MVCC	Full support	Partial (InnoDB)
Subqueries	Full support	Limited
Stored Procedures	Multiple languages	MySQL only
Foreign Keys	Full support	Full support
JSON	Native JSON + JSONB	JSON support
GIS	PostGIS (advanced)	Basic

Installation

Docker Installation (Recommended)

# Start PostgreSQL container
docker run --name postgres \
  -e POSTGRES_PASSWORD=mysecretpassword \
  -e POSTGRES_DB=mydb \
  -v postgres_data:/var/lib/postgresql/data \
  -p 5432:5432 \
  -d postgres:17

# Connect to PostgreSQL
docker exec -it postgres psql -U postgres -d mydb

Linux Installation

# Ubuntu/Debian
sudo apt update
sudo apt install postgresql postgresql-contrib

# Start PostgreSQL
sudo systemctl start postgresql
sudo systemctl enable postgresql

# Connect as postgres user
sudo -u postgres psql

macOS Installation

# Using Homebrew
brew install postgresql@17
brew services start postgresql@17

# Connect
psql -U $(whoami) -d postgres

Windows Installation

Download the installer from https://www.postgresql.org/download/windows/ or use Chocolatey:

choco install postgresql

Basic SQL Operations

Creating Databases and Tables

-- Create database
CREATE DATABASE myapp;

-- Connect to database
\c myapp

-- Create table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    is_active BOOLEAN DEFAULT TRUE
);

-- Create orders table with foreign key
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    total_amount DECIMAL(10, 2) NOT NULL,
    status VARCHAR(20) DEFAULT 'pending',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CRUD Operations

-- INSERT: Create new records
INSERT INTO users (username, email, password_hash)
VALUES ('john_doe', '[email protected]', 'hashed_password');

-- Insert multiple rows
INSERT INTO users (username, email, password_hash)
VALUES 
    ('alice', '[email protected]', 'hash1'),
    ('bob', '[email protected]', 'hash2');

-- INSERT...RETURNING
INSERT INTO users (username, email, password_hash)
VALUES ('new_user', '[email protected]', 'hash')
RETURNING id, username;

-- SELECT: Read data
-- Get all users
SELECT * FROM users;

-- Get specific columns
SELECT username, email FROM users;

-- WHERE clause
SELECT * FROM users WHERE is_active = TRUE;

-- ORDER BY
SELECT * FROM users ORDER BY created_at DESC;

-- LIMIT and OFFSET
SELECT * FROM users LIMIT 10 OFFSET 20;

-- UPDATE: Update records
UPDATE users 
SET email = '[email protected]' 
WHERE username = 'john_doe';

-- Update multiple columns
UPDATE users 
SET email = '[email protected]', is_active = FALSE 
WHERE id = 1;

-- DELETE: Delete records
DELETE FROM users WHERE id = 1;

-- Delete with returning
DELETE FROM users WHERE is_active = FALSE RETURNING username, email;

Query Filtering and Operators

-- Comparison operators
SELECT * FROM users WHERE age >= 18;
SELECT * FROM products WHERE price < 100;

-- IN and NOT IN
SELECT * FROM users WHERE country IN ('USA', 'Canada', 'UK');
SELECT * FROM users WHERE country NOT IN ('China', 'Russia');

-- BETWEEN
SELECT * FROM orders WHERE total_amount BETWEEN 100 AND 500;

-- LIKE and ILIKE (case-insensitive)
SELECT * FROM users WHERE email LIKE '%@gmail.com';
SELECT * FROM users WHERE username ILIKE '%john%';

-- IS NULL and IS NOT NULL
SELECT * FROM users WHERE phone IS NOT NULL;

-- AND, OR, NOT
SELECT * FROM users WHERE is_active = TRUE AND country = 'USA';
SELECT * FROM products WHERE price < 10 OR price > 100;
SELECT * FROM users WHERE NOT is_active;

Data Types

PostgreSQL offers the most comprehensive set of data types among open-source databases.

Numeric Types

-- Integer types
SMALLINT      -- -32,768 to 32,767
INTEGER       -- -2,147,483,648 to 2,147,483,647
BIGINT        -- -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

-- Floating point
REAL          -- Single precision
DOUBLE PRECISION -- Double precision
DECIMAL(p, s) -- Exact numeric (p=precision, s=scale)
NUMERIC(p, s) -- Same as DECIMAL

-- Examples
CREATE TABLE financials (
    amount DECIMAL(15, 2),  -- 15 digits, 2 after decimal
    rate NUMERIC(5, 4),      -- 0.0000 to 999.99
    quantity INTEGER
);

Character Types

-- Character types
CHAR(n)        -- Fixed length, padded
VARCHAR(n)     -- Variable length, max n
TEXT           -- Unlimited length

-- Examples
CREATE TABLE profiles (
    code CHAR(2),      -- Always 2 characters
    name VARCHAR(100), -- Up to 100 characters
    bio TEXT           -- Unlimited
);

Date/Time Types

-- Date and time types
DATE           -- Date only (2026-03-05)
TIME           -- Time only (14:30:00)
TIMESTAMP      -- Date and time (2026-03-05 14:30:00)
TIMESTAMPTZ    -- Timezone-aware timestamp
INTERVAL       -- Time interval

-- Examples
CREATE TABLE events (
    event_date DATE,
    start_time TIME,
    created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    duration INTERVAL
);

-- Date/time functions
SELECT CURRENT_DATE;
SELECT CURRENT_TIMESTAMP;
SELECT NOW();
SELECT EXTRACT(YEAR FROM created_at) FROM orders;
SELECT DATE_TRUNC('month', created_at) FROM orders;

JSON Types

-- JSON and JSONB
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title VARCHAR(200),
    content JSON,
    metadata JSONB
);

-- Insert JSON
INSERT INTO documents (title, content, metadata)
VALUES (
    'Invoice',
    '{"items": [{"qty": 2, "price": 10}]}',
    '{"author": "John", "version": 1}'
);

-- Query JSON
SELECT content->'items' FROM documents;
SELECT content->>'items' FROM documents;  -- As text

-- JSONB operators
SELECT * FROM documents WHERE metadata @> '{"author": "John"}';
SELECT * FROM documents WHERE content ? 'items';

Array Types

-- Array types
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    tags TEXT[],
    prices DECIMAL(10, 2)[]
);

-- Insert arrays
INSERT INTO products (name, tags, prices)
VALUES (
    'Laptop',
    ARRAY['electronics', 'computer', 'sale'],
    ARRAY[999.99, 1299.99]
);

-- Query arrays
SELECT * FROM products WHERE 'electronics' = ANY(tags);
SELECT * FROM products WHERE tags[1] = 'electronics';
SELECT array_length(tags, 1) FROM products;

Range Types

-- Built-in range types
-- int4range, int8range, numrange, tsrange, tstzrange, daterange

CREATE TABLE reservations (
    id SERIAL PRIMARY KEY,
    room_id INTEGER,
    date_range DATERANGE,
    price DECIMAL(10, 2)
);

-- Range operators
SELECT * FROM reservations 
WHERE daterange '[2026-03-01, 2026-03-10)' 
    && daterange '[2026-03-05, 2026-03-15)';

Constraints

Constraints enforce data integrity at the database level.

Primary Key

-- Single column primary key
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100)
);

-- Composite primary key
CREATE TABLE order_items (
    order_id INTEGER,
    product_id INTEGER,
    quantity INTEGER,
    PRIMARY KEY (order_id, product_id)
);

Foreign Key

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    total DECIMAL(10, 2)
);

-- With ON DELETE/UPDATE actions
CREATE TABLE order_items (
    order_id INTEGER REFERENCES orders(id) ON DELETE CASCADE,
    product_id INTEGER REFERENCES products(id) ON DELETE RESTRICT,
    quantity INTEGER
);

Unique Constraint

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(100) UNIQUE,
    phone VARCHAR(20)
);

-- Table-level unique constraint
CREATE TABLE user_sessions (
    id SERIAL PRIMARY KEY,
    user_id INTEGER,
    token VARCHAR(255),
    UNIQUE(user_id, token)
);

Check Constraint

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    price DECIMAL(10, 2) CHECK (price > 0),
    quantity INTEGER CHECK (quantity >= 0),
    discount DECIMAL(5, 2) CHECK (discount >= 0 AND discount <= 100)
);

Not Null Constraint

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL,
    age INTEGER CHECK (age >= 18)
);

Indexes

Indexes improve query performance but slow down writes.

B-Tree Index (Default)

-- Create B-tree index
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_products_price ON products(price DESC);

-- Unique index
CREATE UNIQUE INDEX idx_users_username ON users(username);

Partial Index

-- Index only active users
CREATE INDEX idx_active_users ON users(email) WHERE is_active = TRUE;

-- Index expensive orders
CREATE INDEX idx_large_orders ON orders(total_amount) WHERE total_amount > 1000;

Composite Index

-- Composite index (order matters)
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Covering index (include additional columns)
CREATE INDEX idx_orders_user_include ON orders(user_id) INCLUDE (total_amount, status);

Specialized Indexes

-- GIN index for JSON/arrays
CREATE INDEX idx_documents_content ON documents USING GIN(content);

-- GiST index for full-text search
CREATE INDEX idx_articles_content ON articles USING GIST(to_tsvector('english', content));

-- BRIN index for sequential data
CREATE INDEX idx_logs_created ON logs USING BRIN(created_at);

-- Hash index
CREATE INDEX idx_users_email_hash ON users USING HASH(email);

Joins

-- INNER JOIN
SELECT o.id, u.username, o.total_amount
FROM orders o
INNER JOIN users u ON o.user_id = u.id;

-- LEFT JOIN
SELECT u.username, o.id
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;

-- RIGHT JOIN
SELECT u.username, o.id
FROM users u
RIGHT JOIN orders o ON u.id = o.user_id;

-- FULL OUTER JOIN
SELECT u.username, o.id
FROM users u
FULL OUTER JOIN orders o ON u.id = o.user_id;

-- Multiple joins
SELECT o.id, u.username, p.name, oi.quantity
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id;

Aggregations and Grouping

-- COUNT, SUM, AVG, MIN, MAX
SELECT COUNT(*) FROM users;
SELECT SUM(total_amount) FROM orders;
SELECT AVG(price) FROM products;
SELECT MIN(created_at) FROM orders;
SELECT MAX(price) FROM products;

-- GROUP BY
SELECT user_id, COUNT(*) as order_count, SUM(total_amount) as total_spent
FROM orders
GROUP BY user_id;

-- HAVING (filter groups)
SELECT user_id, COUNT(*) as order_count
FROM orders
GROUP BY user_id
HAVING COUNT(*) > 5;

-- Multiple aggregations
SELECT 
    status,
    COUNT(*) as count,
    SUM(total_amount) as total,
    AVG(total_amount) as average,
    MIN(total_amount) as min_amount,
    MAX(total_amount) as max_amount
FROM orders
GROUP BY status;

Window Functions

-- ROW_NUMBER
SELECT 
    username,
    ROW_NUMBER() OVER (ORDER BY created_at) as row_num
FROM users;

-- RANK and DENSE_RANK
SELECT 
    product_name,
    sales,
    RANK() OVER (ORDER BY sales DESC) as rank,
    DENSE_RANK() OVER (ORDER BY sales DESC) as dense_rank
FROM products;

-- Running total
SELECT 
    created_at,
    amount,
    SUM(amount) OVER (ORDER BY created_at) as running_total
FROM transactions;

-- Partition by
SELECT 
    department,
    employee_name,
    salary,
    AVG(salary) OVER (PARTITION BY department) as dept_avg
FROM employees;

Subqueries and Common Table Expressions (CTEs)

-- Subquery in WHERE
SELECT * FROM products
WHERE price > (SELECT AVG(price) FROM products);

-- Subquery in FROM (derived table)
SELECT dept_avg.*
FROM (SELECT department, AVG(salary) as avg_salary FROM employees GROUP BY department) dept_avg
WHERE dept_avg.avg_salary > 50000;

-- EXISTS
SELECT * FROM users u
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);

-- CTE (Common Table Expression)
WITH active_orders AS (
    SELECT * FROM orders WHERE status = 'completed'
),
high_value_orders AS (
    SELECT * FROM active_orders WHERE total_amount > 1000
)
SELECT u.username, hvo.*
FROM high_value_orders hvo
JOIN users u ON hvo.user_id = u.id;

-- Recursive CTE
WITH RECURSIVE org_chart AS (
    -- Base case
    SELECT id, name, manager_id, 1 as level
    FROM employees WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive case
    SELECT e.id, e.name, e.manager_id, oc.level + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart;

Transactions

-- Basic transaction
BEGIN;
INSERT INTO accounts (id, balance) VALUES (1, 1000);
INSERT INTO accounts (id, balance) VALUES (2, 500);
COMMIT;

-- Rollback
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- If error occurs:
ROLLBACK;

-- Savepoints
BEGIN;
INSERT INTO orders (total_amount) VALUES (100);
SAVEPOINT order_1;
INSERT INTO order_items (order_id, product_id) VALUES (1, 1);
ROLLBACK TO SAVEPOINT order_1;
COMMIT;

Isolation Levels

-- Set isolation level
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Or at start
BEGIN ISOLATION LEVEL REPEATABLE READ;

Views

-- Simple view
CREATE VIEW active_users AS
SELECT id, username, email FROM users WHERE is_active = TRUE;

-- Updatable view
CREATE VIEW order_summary AS
SELECT 
    o.id,
    u.username,
    o.total_amount,
    o.status,
    o.created_at
FROM orders o
JOIN users u ON o.user_id = u.id;

-- Materialized view (stores data physically)
CREATE MATERIALIZED VIEW monthly_sales AS
SELECT 
    DATE_TRUNC('month', created_at) as month,
    COUNT(*) as order_count,
    SUM(total_amount) as total_sales
FROM orders
GROUP BY DATE_TRUNC('month', created_at);

-- Refresh materialized view
REFRESH MATERIALIZED VIEW monthly_sales;

Conclusion

PostgreSQL provides a powerful foundation for building any data-driven application. Its rich feature set, combined with excellent performance and reliability, makes it an ideal choice for modern development.

In the next article, we’ll explore PostgreSQL operations: backup strategies, replication, high availability, and monitoring.