Skip to main content
โšก Calmops

Database Design Patterns: Repository, Unit of Work, and Data Mapper

Introduction

Database design patterns provide proven solutions for organizing data access logic in applications. These patterns separate business logic from data access concerns, making code more testable, maintainable, and flexible. Whether you are building a simple CRUD application or a complex enterprise system, understanding these patterns will help you design robust data access layers.

The choice of database pattern affects how easily you can test your code, switch databases, and evolve your schema over time. Poorly designed data access layers become a maintenance burden, with business logic scattered across the codebase and database queries intertwined with application code.

This guide explores the major database design patterns: Repository pattern for abstracting data access, Unit of Work for managing transactions, Data Mapper for complex ORM scenarios, and Active Record for simpler applications.

Repository Pattern

Overview and Principles

The Repository pattern abstracts the data layer, providing a collection-like interface for accessing domain entities. Clients work with repositories as if they were in-memory collections, without knowing where the data comes from or how it is stored.

The key benefits of the Repository pattern include: testability (repositories can be mocked for unit tests), abstraction (clients don’t need to know about SQL, NoSQL, or other storage details), and flexibility (the underlying storage can change without affecting clients).

from abc import ABC, abstractmethod
from typing import List, Optional, Dict, Any
from dataclasses import dataclass
from datetime import datetime

@dataclass
class User:
    id: str
    email: str
    name: str
    created_at: datetime
    is_active: bool = True

class UserRepository(ABC):
    """Abstract repository interface for users."""
    
    @abstractmethod
    def find_by_id(self, user_id: str) -> Optional[User]:
        pass
    
    @abstractmethod
    def find_by_email(self, email: str) -> Optional[User]:
        pass
    
    @abstractmethod
    def find_all(self, limit: int = 100, offset: int = 0) -> List[User]:
        pass
    
    @abstractmethod
    def find_by_criteria(self, criteria: Dict[str, Any]) -> List[User]:
        pass
    
    @abstractmethod
    def save(self, user: User) -> User:
        pass
    
    @abstractmethod
    def update(self, user: User) -> User:
        pass
    
    @abstractmethod
    def delete(self, user_id: str) -> bool:
        pass
    
    @abstractmethod
    def count(self) -> int:
        pass

class SQLUserRepository(UserRepository):
    """SQL implementation of user repository."""
    
    def __init__(self, db_connection):
        self.db = db_connection
    
    def find_by_id(self, user_id: str) -> Optional[User]:
        row = self.db.query_one(
            "SELECT * FROM users WHERE id = ?", user_id
        )
        return self._row_to_user(row) if row else None
    
    def find_by_email(self, email: str) -> Optional[User]:
        row = self.db.query_one(
            "SELECT * FROM users WHERE email = ?", email
        )
        return self._row_to_user(row) if row else None
    
    def find_all(self, limit: int = 100, offset: int = 0) -> List[User]:
        rows = self.db.query_all(
            "SELECT * FROM users ORDER BY created_at DESC LIMIT ? OFFSET ?",
            limit, offset
        )
        return [self._row_to_user(row) for row in rows]
    
    def find_by_criteria(self, criteria: Dict[str, Any]) -> List[User]:
        where_clauses = []
        params = []
        
        for key, value in criteria.items():
            where_clauses.append(f"{key} = ?")
            params.append(value)
        
        where = " AND ".join(where_clauses) if where_clauses else "1=1"
        
        rows = self.db.query_all(
            f"SELECT * FROM users WHERE {where}", *params
        )
        return [self._row_to_user(row) for row in rows]
    
    def save(self, user: User) -> User:
        self.db.execute(
            """INSERT INTO users (id, email, name, created_at, is_active)
               VALUES (?, ?, ?, ?, ?)""",
            user.id, user.email, user.name, user.created_at, user.is_active
        )
        return user
    
    def update(self, user: User) -> User:
        self.db.execute(
            """UPDATE users SET email = ?, name = ?, is_active = ?
               WHERE id = ?""",
            user.email, user.name, user.is_active, user.id
        )
        return user
    
    def delete(self, user_id: str) -> bool:
        result = self.db.execute(
            "DELETE FROM users WHERE id = ?", user_id
        )
        return result > 0
    
    def count(self) -> int:
        return self.db.query_one("SELECT COUNT(*) FROM users")[0]
    
    def _row_to_user(self, row: tuple) -> User:
        return User(
            id=row[0],
            email=row[1],
            name=row[2],
            created_at=row[3],
            is_active=bool(row[4])
        )

class MockUserRepository(UserRepository):
    """In-memory mock repository for testing."""
    
    def __init__(self):
        self._users: Dict[str, User] = {}
    
    def find_by_id(self, user_id: str) -> Optional[User]:
        return self._users.get(user_id)
    
    def find_by_email(self, email: str) -> Optional[User]:
        for user in self._users.values():
            if user.email == email:
                return user
        return None
    
    def find_all(self, limit: int = 100, offset: int = 0) -> List[User]:
        return list(self._users.values())[offset:offset + limit]
    
    def find_by_criteria(self, criteria: Dict[str, Any]) -> List[User]:
        results = []
        for user in self._users.values():
            if all(getattr(user, k) == v for k, v in criteria.items()):
                results.append(user)
        return results
    
    def save(self, user: User) -> User:
        self._users[user.id] = user
        return user
    
    def update(self, user: User) -> User:
        if user.id in self._users:
            self._users[user.id] = user
        return user
    
    def delete(self, user_id: str) -> bool:
        if user_id in self._users:
            del self._users[user_id]
            return True
        return False
    
    def count(self) -> int:
        return len(self._users)

Generic Repository

A generic repository provides common operations for all entities, reducing boilerplate code.

from abc import ABC, abstractmethod
from typing import TypeVar, Generic, List, Optional, Dict, Any
from dataclasses import dataclass

T = TypeVar('T')

@dataclass
class Entity:
    id: str

class GenericRepository(ABC, Generic[T]):
    """Generic repository with common operations."""
    
    @abstractmethod
    def find_by_id(self, entity_id: str) -> Optional[T]:
        pass
    
    @abstractmethod
    def find_all(self, limit: int = 100, offset: int = 0) -> List[T]:
        pass
    
    @abstractmethod
    def find_by_criteria(self, criteria: Dict[str, Any]) -> List[T]:
        pass
    
    @abstractmethod
    def save(self, entity: T) -> T:
        pass
    
    @abstractmethod
    def update(self, entity: T) -> T:
        pass
    
    @abstractmethod
    def delete(self, entity_id: str) -> bool:
        pass
    
    @abstractmethod
    def count(self) -> int:
        pass

class SQLGenericRepository(GenericRepository[T]):
    """Generic SQL repository implementation."""
    
    def __init__(self, db_connection, entity_type: type, table_name: str):
        self.db = db_connection
        self.entity_type = entity_type
        self.table_name = table_name
    
    def find_by_id(self, entity_id: str) -> Optional[T]:
        row = self.db.query_one(
            f"SELECT * FROM {self.table_name} WHERE id = ?", entity_id
        )
        return self._row_to_entity(row) if row else None
    
    def find_all(self, limit: int = 100, offset: int = 0) -> List[T]:
        rows = self.db.query_all(
            f"SELECT * FROM {self.table_name} LIMIT ? OFFSET ?", limit, offset
        )
        return [self._row_to_entity(row) for row in rows]
    
    def find_by_criteria(self, criteria: Dict[str, Any]) -> List[T]:
        where_clauses = []
        params = []
        
        for key, value in criteria.items():
            where_clauses.append(f"{key} = ?")
            params.append(value)
        
        where = " AND ".join(where_clauses) if where_clauses else "1=1"
        
        rows = self.db.query_all(
            f"SELECT * FROM {self.table_name} WHERE {where}", *params
        )
        return [self._row_to_entity(row) for row in rows]
    
    def save(self, entity: T) -> T:
        # Implementation depends on entity structure
        pass
    
    def update(self, entity: T) -> T:
        pass
    
    def delete(self, entity_id: str) -> bool:
        result = self.db.execute(
            f"DELETE FROM {self.table_name} WHERE id = ?", entity_id
        )
        return result > 0
    
    def count(self) -> int:
        return self.db.query_one(f"SELECT COUNT(*) FROM {self.table_name}")[0]
    
    def _row_to_entity(self, row: tuple) -> T:
        # Override in subclass for specific entity mapping
        raise NotImplementedError

Unit of Work Pattern

Managing Transactions

The Unit of Work pattern tracks changes to objects and persists all changes atomically. It ensures that related operations either all succeed or all fail together, maintaining data integrity.

from abc import ABC, abstractmethod
from typing import Dict, Any
import threading

class IUnitOfWork(ABC):
    """Unit of Work interface."""
    
    @abstractmethod
    def begin(self) -> None:
        pass
    
    @abstractmethod
    def commit(self) -> None:
        pass
    
    @abstractmethod
    def rollback(self) -> None:
        pass
    
    @abstractmethod
    def repositories(self) -> Dict[str, Any]:
        pass

class SQLUnitOfWork(IUnitOfWork):
    """SQL implementation of Unit of Work."""
    
    def __init__(self, db_connection):
        self.db = db_connection
        self._transaction = None
        self._repositories = {}
    
    def begin(self) -> None:
        if self._transaction is None:
            self._transaction = self.db.transaction()
    
    def commit(self) -> None:
        if self._transaction:
            self._transaction.commit()
            self._transaction = None
    
    def rollback(self) -> None:
        if self._transaction:
            self._transaction.rollback()
            self._transaction = None
    
    def repositories(self) -> Dict[str, Any]:
        return self._repositories
    
    def __enter__(self):
        self.begin()
        return self
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type:
            self.rollback()
        else:
            self.commit()

class UnitOfWorkManager:
    """Manages Unit of Work instances with thread-local storage."""
    
    def __init__(self, db_connection):
        self.db = db_connection
        self._local = threading.local()
    
    def get_current(self) -> IUnitOfWork:
        if not hasattr(self._local, 'uow'):
            self._local.uow = SQLUnitOfWork(self.db)
        return self._local.uow
    
    def begin(self):
        return self.get_current().begin()
    
    def commit(self):
        return self.get_current().commit()
    
    def rollback(self):
        return self.get_current().rollback()

Using Unit of Work with Repositories

class UserService:
    """User service using Unit of Work."""
    
    def __init__(self, uow_manager: UnitOfWorkManager):
        self.uow_manager = uow_manager
    
    def create_user(self, email: str, name: str) -> User:
        uow = self.uow_manager.get_current()
        
        try:
            # Check if user exists
            user_repo = uow.repositories().get('users')
            if user_repo.find_by_email(email):
                raise ValueError("User already exists")
            
            # Create user
            user = User(
                id=str(uuid.uuid4()),
                email=email,
                name=name,
                created_at=datetime.utcnow()
            )
            user_repo.save(user)
            
            return user
        except Exception:
            uow.rollback()
            raise
    
    def deactivate_user(self, user_id: str) -> bool:
        uow = self.uow_manager.get_current()
        
        try:
            user_repo = uow.repositories().get('users')
            user = user_repo.find_by_id(user_id)
            
            if not user:
                return False
            
            user.is_active = False
            user_repo.update(user)
            
            return True
        except Exception:
            uow.rollback()
            raise
    
    def transfer_data(self, from_user_id: str, to_user_id: str):
        """Example of complex transaction across multiple operations."""
        uow = self.uow_manager.get_current()
        
        try:
            user_repo = uow.repositories().get('users')
            order_repo = uow.repositories().get('orders')
            
            from_user = user_repo.find_by_id(from_user_id)
            to_user = user_repo.find_by_id(to_user_id)
            
            if not from_user or not to_user:
                raise ValueError("User not found")
            
            # Transfer orders
            orders = order_repo.find_by_criteria({'user_id': from_user_id})
            for order in orders:
                order.user_id = to_user_id
                order_repo.update(order)
            
            # Deactivate source user
            from_user.is_active = False
            user_repo.update(from_user)
            
            uow.commit()
        except Exception:
            uow.rollback()
            raise

Data Mapper Pattern

Separating Domain from Database

The Data Mapper pattern moves data between domain objects and databases while keeping them independent. Unlike Active Record, domain objects have no knowledge of the database.

from abc import ABC, abstractmethod
from typing import List, Dict, Any

class UserMapper:
    """Maps between User domain objects and database rows."""
    
    def __init__(self, db_connection):
        self.db = db_connection
        self.table_name = "users"
    
    def insert(self, user: User) -> User:
        """Insert a new user."""
        self.db.execute(
            f"""INSERT INTO {self.table_name}
                (id, email, name, password_hash, created_at, updated_at, is_active)
                VALUES (?, ?, ?, ?, ?, ?, ?)""",
            user.id,
            user.email,
            user.name,
            user.password_hash,
            user.created_at,
            user.updated_at,
            user.is_active
        )
        return user
    
    def update(self, user: User) -> User:
        """Update an existing user."""
        self.db.execute(
            f"""UPDATE {self.table_name}
                SET email = ?, name = ?, password_hash = ?,
                    updated_at = ?, is_active = ?
                WHERE id = ?""",
            user.email,
            user.name,
            user.password_hash,
            user.updated_at,
            user.is_active,
            user.id
        )
        return user
    
    def delete(self, user_id: str) -> bool:
        """Soft delete a user."""
        result = self.db.execute(
            f"UPDATE {self.table_name} SET is_active = FALSE WHERE id = ?",
            user_id
        )
        return result > 0
    
    def find_by_id(self, user_id: str) -> User:
        """Find user by ID."""
        row = self.db.query_one(
            f"SELECT * FROM {self.table_name} WHERE id = ?", user_id
        )
        return self._map_row_to_user(row) if row else None
    
    def find_by_email(self, email: str) -> User:
        """Find user by email."""
        row = self.db.query_one(
            f"SELECT * FROM {self.table_name} WHERE email = ?", email
        )
        return self._map_row_to_user(row) if row else None
    
    def find_all(self, limit: int = 100, offset: int = 0) -> List[User]:
        """Find all users with pagination."""
        rows = self.db.query_all(
            f"SELECT * FROM {self.table_name} ORDER BY created_at DESC LIMIT ? OFFSET ?",
            limit, offset
        )
        return [self._map_row_to_user(row) for row in rows]
    
    def _map_row_to_user(self, row: tuple) -> User:
        """Map database row to User domain object."""
        return User(
            id=row[0],
            email=row[1],
            name=row[2],
            password_hash=row[3],
            created_at=row[4],
            updated_at=row[5],
            is_active=bool(row[6])
        )

Active Record Pattern

Simpler Approach for Smaller Applications

The Active Record pattern combines domain object and data access logic in the same class. Each object knows how to save, update, and delete itself from the database.

class ActiveRecordUser:
    """User with Active Record pattern."""
    
    def __init__(
        self,
        id: str = None,
        email: str = None,
        name: str = None,
        created_at: datetime = None,
        is_active: bool = True
    ):
        self.id = id or str(uuid.uuid4())
        self.email = email
        self.name = name
        self.created_at = created_at or datetime.utcnow()
        self.is_active = is_active
        self._db = None
    
    @classmethod
    def set_database(cls, db_connection):
        cls._db = db_connection
    
    def save(self) -> "ActiveRecordUser":
        """Save or update the user."""
        if self._db is None:
            raise RuntimeError("Database not configured")
        
        existing = self._find_by_id(self.id)
        
        if existing:
            self._update()
        else:
            self._insert()
        
        return self
    
    def _insert(self) -> None:
        self._db.execute(
            """INSERT INTO users (id, email, name, created_at, is_active)
               VALUES (?, ?, ?, ?, ?)""",
            self.id, self.email, self.name, self.created_at, self.is_active
        )
    
    def _update(self) -> None:
        self._db.execute(
            """UPDATE users SET email = ?, name = ?, is_active = ?
               WHERE id = ?""",
            self.email, self.name, self.is_active, self.id
        )
    
    def delete(self) -> bool:
        """Delete the user."""
        if self._db is None:
            raise RuntimeError("Database not configured")
        
        result = self._db.execute(
            "DELETE FROM users WHERE id = ?", self.id
        )
        return result > 0
    
    @classmethod
    def find_by_id(cls, user_id: str) -> "ActiveRecordUser":
        if cls._db is None:
            raise RuntimeError("Database not configured")
        
        row = cls._db.query_one(
            "SELECT * FROM users WHERE id = ?", user_id
        )
        return cls._from_row(row) if row else None
    
    @classmethod
    def find_by_email(cls, email: str) -> "ActiveRecordUser":
        if cls._db is None:
            raise RuntimeError("Database not configured")
        
        row = cls._db.query_one(
            "SELECT * FROM users WHERE email = ?", email
        )
        return cls._from_row(row) if row else None
    
    @classmethod
    def all(cls, limit: int = 100) -> List["ActiveRecordUser"]:
        if cls._db is None:
            raise RuntimeError("Database not configured")
        
        rows = cls._db.query_all(
            "SELECT * FROM users LIMIT ?", limit
        )
        return [cls._from_row(row) for row in rows]
    
    @classmethod
    def _from_row(cls, row: tuple) -> "ActiveRecordUser":
        return cls(
            id=row[0],
            email=row[1],
            name=row[2],
            created_at=row[3],
            is_active=bool(row[4])
        )

Conclusion

Database design patterns provide essential tools for building maintainable data access layers. The Repository pattern abstracts data storage, making code testable and flexible. The Unit of Work pattern ensures transactional integrity across operations. The Data Mapper pattern separates domain logic from persistence concerns. The Active Record pattern offers simplicity for smaller applications.

Choose the pattern that fits your application’s complexity and team experience. For complex enterprise applications, Repository + Unit of Work + Data Mapper provides the best separation of concerns. For simpler applications, Active Record can reduce boilerplate while still providing testability.

Resources

  • “Patterns of Enterprise Application Architecture” by Martin Fowler
  • “Domain-Driven Design” by Eric Evans
  • “Architecture Patterns with Python” by Harry Percival and Bob Gregory

Comments