Descriptors and Metaclasses in Python: Advanced OOP Mastery

Descriptors and metaclasses are among Python’s most powerful and misunderstood features. They’re the mechanisms that power frameworks like Django, SQLAlchemy, and Pydantic. Understanding them transforms you from someone who uses these frameworks to someone who can build similar abstractions.

This guide demystifies these advanced concepts by building from fundamentals to real-world applications.

Part 1: Descriptors

What Are Descriptors?

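A descriptor is an object that implements the descriptor protocol: a set of special methods that control how an attribute is read, assigned, or deleted. When you access an attribute on an instance, Python’s lookup machinery checks whether the matching attribute found on the instance’s class is a descriptor and, if so, calls its special methods instead of returning the object directly. A descriptor therefore only takes effect when it is stored as a class attribute, not in an instance’s __dict__.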

The descriptor protocol consists of three methods:

class Descriptor:
    def __get__(self, obj, objtype=None):
        """Called when the attribute is accessed"""
        pass
    
    def __set__(self, obj, value):
        """Called when the attribute is assigned"""
        pass
    
    def __delete__(self, obj):
        """Called when the attribute is deleted"""
        pass

An object is a descriptor if it implements at least one of these methods. If it implements __set__ or __delete__, it’s a data descriptor. If it only implements __get__, it’s a non-data descriptor.

Why Descriptors Exist

Descriptors solve a fundamental problem: how do you intercept access to one specific attribute? Without descriptors, you’d need to override __getattribute__ and __setattr__ on every class and filter by attribute name, which is inefficient and error-prone.

Descriptors are used throughout Python:

  • property is a descriptor
  • staticmethod and classmethod are descriptors
  • Methods are descriptors
  • ORMs such as Django and SQLAlchemy rely on descriptors for model attributes
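
The built-in cases in this list are easy to check interactively. The sketch below uses a small hypothetical helper, descriptor_kind (not part of the standard library), to classify a few objects by which protocol methods their type defines. The Sample class and its member names are assumptions made up for the demonstration.

```python
def descriptor_kind(obj):
    """Classify obj as 'data', 'non-data', or 'not a descriptor'.

    Hypothetical helper for illustration. It inspects type(obj) because
    Python looks up the protocol methods on the type, not the instance.
    """
    t = type(obj)
    has_get = hasattr(t, "__get__")
    has_set_or_delete = hasattr(t, "__set__") or hasattr(t, "__delete__")
    if has_set_or_delete:
        return "data"
    if has_get:
        return "non-data"
    return "not a descriptor"


class Sample:
    @property
    def prop(self):
        return 1

    @staticmethod
    def static():
        return 2

    @classmethod
    def klass(cls):
        return 3

    def method(self):
        return 4


# vars() returns the raw class attributes without triggering __get__
raw = vars(Sample)
print(descriptor_kind(raw["prop"]))    # data
print(descriptor_kind(raw["static"]))  # non-data
print(descriptor_kind(raw["klass"]))   # non-data
print(descriptor_kind(raw["method"]))  # non-data
print(descriptor_kind(42))             # not a descriptor
```

Reading the attributes through vars(Sample) matters here: Sample.prop or Sample.method would already have gone through __get__, so you would be inspecting the result rather than the descriptor.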

The Descriptor Protocol in Detail

Understanding __get__

The __get__ method is called when an attribute is accessed. It receives three parameters:

def __get__(self, obj, objtype=None):
    """
    obj: The instance through which the descriptor was accessed (None if accessed on class)
    objtype: The class through which the descriptor was accessed (the owner class or a subclass)
    """

class Descriptor:
    def __get__(self, obj, objtype=None):
        print(f"__get__ called: obj={obj}, objtype={objtype}")
        return "descriptor value"

class MyClass:
    attr = Descriptor()

# Accessing through instance
instance = MyClass()
print(instance.attr)  # __get__ called: obj=<MyClass object>, objtype=<class 'MyClass'>

# Accessing through class
print(MyClass.attr)   # __get__ called: obj=None, objtype=<class 'MyClass'>

Understanding __set__

The __set__ method is called when an attribute is assigned. It receives the instance and the value being assigned:

class ValidatedDescriptor:
    def __init__(self, name):
        self.name = name
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, None)
    
    def __set__(self, obj, value):
        if not isinstance(value, int):
            raise TypeError(f"{self.name} must be an integer")
        obj.__dict__[self.name] = value

class Person:
    age = ValidatedDescriptor('age')
    
    def __init__(self, name, age):
        self.name = name
        self.age = age  # Calls __set__

person = Person("Alice", 30)
print(person.age)  # Calls __get__

person.age = 31    # Valid
person.age = "thirty"  # Raises TypeError

Understanding __delete__

The __delete__ method is called when an attribute is deleted:

class DeletableDescriptor:
    def __init__(self, name):
        self.name = name
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)
    
    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
    
    def __delete__(self, obj):
        print(f"Deleting {self.name}")
        del obj.__dict__[self.name]

class Document:
    content = DeletableDescriptor('content')

doc = Document()
doc.content = "Hello"
print(doc.content)  # Hello
del doc.content     # Deleting content
print(doc.content)  # None

Data Descriptors vs Non-Data Descriptors

The distinction matters for attribute lookup order:

Data descriptors (implement __set__ or __delete__) take precedence over instance attributes.

Non-data descriptors (only implement __get__) are overridden by instance attributes.

class DataDescriptor:
    def __get__(self, obj, objtype=None):
        return "data descriptor"
    
    def __set__(self, obj, value):
        pass

class NonDataDescriptor:
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"

class Example:
    data_desc = DataDescriptor()
    non_data_desc = NonDataDescriptor()

example = Example()

# Data descriptor takes precedence
example.__dict__['data_desc'] = "instance value"
print(example.data_desc)  # Output: data descriptor

# Non-data descriptor is overridden
example.__dict__['non_data_desc'] = "instance value"
print(example.non_data_desc)  # Output: instance value
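
The precedence rules above can be written out as a simplified pure-Python model of instance attribute lookup. This is a sketch under stated assumptions, not CPython's actual implementation: it ignores __getattr__ and __slots__ and assumes the instance has a __dict__. The names find_in_mro and simplified_getattr are hypothetical.

```python
_MISSING = object()


def find_in_mro(cls, name):
    """Return the raw class attribute for name, or _MISSING."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def simplified_getattr(obj, name):
    """Approximate instance attribute lookup (ignores __getattr__, __slots__)."""
    cls = type(obj)
    cls_attr = find_in_mro(cls, name)
    attr_type = type(cls_attr)
    found = cls_attr is not _MISSING
    has_get = found and hasattr(attr_type, "__get__")
    is_data = found and (
        hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")
    )

    # 1. Data descriptors on the class win over everything else
    if has_get and is_data:
        return attr_type.__get__(cls_attr, obj, cls)
    # 2. Then the instance's own __dict__
    if name in vars(obj):
        return vars(obj)[name]
    # 3. Then non-data descriptors
    if has_get:
        return attr_type.__get__(cls_attr, obj, cls)
    # 4. Then plain class attributes
    if found:
        return cls_attr
    raise AttributeError(name)


class Data:
    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        pass


class NonData:
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class Demo:
    data = Data()
    non_data = NonData()
    plain = "class value"


demo = Demo()
# Write straight into the instance dict, bypassing any __set__
vars(demo).update(data="instance", non_data="instance", plain="instance")

for attr in ("data", "non_data", "plain"):
    print(attr, "->", simplified_getattr(demo, attr))
```

Only the data descriptor survives the instance-dict entry. The non-data descriptor and the plain class attribute are both shadowed.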

Practical Descriptor Examples

Example 1: Lazy Loading

class LazyProperty:
    """Load expensive data only when accessed"""
    def __init__(self, func):
        self.func = func
        self.name = func.__name__
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        
        # Compute value and cache it
        value = self.func(obj)
        # Store in instance dict to avoid calling __get__ again
        obj.__dict__[self.name] = value
        return value

class DataModel:
    def __init__(self, data_file):
        self.data_file = data_file
    
    @LazyProperty
    def data(self):
        """Load data from file only when accessed"""
        print(f"Loading data from {self.data_file}...")
        # Simulate expensive operation
        return {"records": 1000, "size": "10MB"}

# Usage
model = DataModel("data.csv")
print("Model created")
print(model.data)  # Loads data
print(model.data)  # Returns cached value (no print statement)
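
The standard library ships the same pattern as functools.cached_property (Python 3.8+), which also stores the computed value in the instance's __dict__. A minimal sketch, with a hypothetical Report class and a compute_calls counter added only to make the caching visible:

```python
from functools import cached_property


class Report:
    def __init__(self, rows):
        self.rows = rows
        self.compute_calls = 0  # counts how often the body actually runs

    @cached_property
    def total(self):
        self.compute_calls += 1
        return sum(self.rows)


report = Report([1, 2, 3])
print(report.total)              # 6
print(report.total)              # 6
print(report.compute_calls)      # 1
print("total" in vars(report))   # True

# Deleting the attribute drops the cached value from the instance dict
del report.total
print(report.total)              # 6
print(report.compute_calls)      # 2
```

For new code, reaching for cached_property first is usually the simpler choice. A hand-written descriptor is still useful when you need custom invalidation or storage.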

Example 2: Type Validation

class TypedProperty:
    """Enforce type checking on assignment"""
    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)
    
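    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            # expected_type may be a single type or a tuple of types;
            # a tuple has no __name__, so build the label explicitly
            types = self.expected_type
            if not isinstance(types, tuple):
                types = (types,)
            expected = " or ".join(t.__name__ for t in types)
            raise TypeError(
                f"{self.name} must be {expected}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value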

class Product:
    name = TypedProperty('name', str)
    price = TypedProperty('price', (int, float))
    quantity = TypedProperty('quantity', int)
    
    def __init__(self, name, price, quantity):
        self.name = name
        self.price = price
        self.quantity = quantity

# Usage
product = Product("Laptop", 999.99, 5)
product.price = 899.99  # Valid
product.quantity = "five"  # Raises TypeError

Example 3: Computed Properties with Caching

class CachedProperty:
    """Cache computed property value"""
    def __init__(self, func):
        self.func = func
        self.name = func.__name__
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        
        # Check cache
        cache_attr = f'_cache_{self.name}'
        if not hasattr(obj, cache_attr):
            # Compute and cache
            value = self.func(obj)
            setattr(obj, cache_attr, value)
        
        return getattr(obj, cache_attr)

class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height
    
    @CachedProperty
    def area(self):
        print("Computing area...")
        return self.width * self.height

# Usage
rect = Rectangle(5, 10)
print(rect.area)  # Computing area... 50
print(rect.area)  # 50 (cached, no print)

Example 4: Bound Methods (How Methods Work)

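Functions defined in a class body are non-data descriptors: accessing one through an instance returns a bound method. Here’s a simplified pure-Python model of that behavior: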

class MethodDescriptor:
    """Simplified version of how methods work"""
    def __init__(self, func):
        self.func = func
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self.func
        
        # Return a bound method
        def bound_method(*args, **kwargs):
            return self.func(obj, *args, **kwargs)
        
        return bound_method

class Example:
    def method(self):
        return "Hello"

# Plain functions already behave this way: accessing example.method
# calls the function's own __get__, which returns a bound method
example = Example()
print(example.method)  # <bound method>
print(example.method())  # Hello
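
You can observe this binding step directly by calling a function's __get__ by hand. The Greeter class below is a hypothetical example used only for this check.

```python
class Greeter:
    def greet(self):
        return "Hello"


greeter = Greeter()

# The raw function object stored on the class (no binding applied)
raw_function = vars(Greeter)["greet"]

# Calling __get__ manually does what greeter.greet does implicitly
bound = raw_function.__get__(greeter, Greeter)

print(bound())                          # Hello
print(bound.__self__ is greeter)        # True
print(bound.__func__ is raw_function)   # True
print(bound == greeter.greet)           # True
```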

Real-World Descriptor Application: ORM Field

Here’s a simplified sketch of the descriptor pattern that ORMs such as Django and SQLAlchemy build on for model fields:

class Field:
    """Base descriptor for ORM fields"""
    def __init__(self, column_name=None, required=True):
        self.column_name = column_name
        self.required = required
        self.name = None
    
    def __set_name__(self, owner, name):
        """Called when descriptor is assigned to a class attribute"""
        self.name = name
        if self.column_name is None:
            self.column_name = name
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)
    
    def __set__(self, obj, value):
        if self.required and value is None:
            raise ValueError(f"{self.name} is required")
        obj.__dict__[self.name] = value

class StringField(Field):
    def __set__(self, obj, value):
        if value is not None and not isinstance(value, str):
            raise TypeError(f"{self.name} must be a string")
        super().__set__(obj, value)

class IntegerField(Field):
    def __set__(self, obj, value):
        if value is not None and not isinstance(value, int):
            raise TypeError(f"{self.name} must be an integer")
        super().__set__(obj, value)

class Model:
    """Base class for ORM models"""
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

class User(Model):
    username = StringField(required=True)
    email = StringField(required=True)
    age = IntegerField(required=False)

# Usage
user = User(username="alice", email="[email protected]", age=30)
print(user.username)  # alice
user.age = "thirty"   # Raises TypeError

Part 2: Metaclasses

What Are Metaclasses?

A metaclass is a class whose instances are classes. In Python, type is the default metaclass: unless you specify otherwise, every class is an instance of type.

class MyClass:
    pass

print(type(MyClass))  # <class 'type'>
print(type(int))      # <class 'type'>
print(type(str))      # <class 'type'>

# Every class is an instance of type
print(isinstance(MyClass, type))  # True
print(isinstance(int, type))      # True

Understanding type

The type function has two uses:

  1. Get the type of an object: type(obj)
  2. Create a class dynamically: type(name, bases, dict)

# Creating a class dynamically
MyClass = type('MyClass', (), {'x': 10})
instance = MyClass()
print(instance.x)  # 10

# Equivalent to:
class MyClass:
    x = 10

The three arguments to type are:

  • name: The class name (string)
  • bases: Tuple of base classes
  • dict: Dictionary of class attributes and methods

def method(self):
    return "Hello"

MyClass = type('MyClass', (object,), {
    'x': 10,
    'method': method
})

instance = MyClass()
print(instance.x)       # 10
print(instance.method())  # Hello
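
Three-argument type() becomes useful when the class shape is only known at runtime. The sketch below builds simple record classes from a list of field names. The factory make_record_class and the fields attribute are hypothetical names chosen for this illustration.

```python
def make_record_class(class_name, field_names):
    """Build a simple class with the given fields using three-argument type()."""

    def __init__(self, **kwargs):
        # Missing keyword arguments default to None
        for field in field_names:
            setattr(self, field, kwargs.get(field))

    def __repr__(self):
        values = ", ".join(f"{f}={getattr(self, f)!r}" for f in field_names)
        return f"{class_name}({values})"

    namespace = {
        "__init__": __init__,
        "__repr__": __repr__,
        "fields": tuple(field_names),
    }
    return type(class_name, (object,), namespace)


Point = make_record_class("Point", ["x", "y"])
p = Point(x=1, y=2)

print(p)               # Point(x=1, y=2)
print(Point.__name__)  # Point
print(Point.fields)    # ('x', 'y')
print(type(Point))     # <class 'type'>
```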

Creating Custom Metaclasses

A custom metaclass is a class that inherits from type. It can override methods to customize class creation:

class Meta(type):
    def __new__(mcs, name, bases, namespace):
        """Called when a class is created"""
        print(f"Creating class {name}")
        return super().__new__(mcs, name, bases, namespace)
    
    def __init__(cls, name, bases, namespace):
        """Called after the class is created"""
        print(f"Initializing class {name}")
        super().__init__(name, bases, namespace)

class MyClass(metaclass=Meta):
    pass

# Output:
# Creating class MyClass
# Initializing class MyClass

Metaclass Methods

__new__ vs __init__

  • __new__ creates the class object
  • __init__ initializes the class object

class TracingMeta(type):
    def __new__(mcs, name, bases, namespace):
        print(f"__new__: Creating {name}")
        cls = super().__new__(mcs, name, bases, namespace)
        return cls
    
    def __init__(cls, name, bases, namespace):
        print(f"__init__: Initializing {name}")
        super().__init__(name, bases, namespace)

class Example(metaclass=TracingMeta):
    pass

# Output:
# __new__: Creating Example
# __init__: Initializing Example

__call__

A metaclass’s __call__ method runs whenever one of its classes is called to create an instance, which lets it intercept instantiation:

class SingletonMeta(type):
    _instances = {}
    
    def __call__(cls, *args, **kwargs):
        """Ensure only one instance exists"""
        if cls not in cls._instances:
            instance = super().__call__(*args, **kwargs)
            cls._instances[cls] = instance
        return cls._instances[cls]

class Database(metaclass=SingletonMeta):
    def __init__(self):
        print("Initializing database")
        self.connection = "connected"

# Usage
db1 = Database()  # Initializing database
db2 = Database()  # No output (returns same instance)
print(db1 is db2)  # True
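
Because calling a class goes through type(cls).__call__, the same hook can wrap any per-instantiation logic, not just singletons. As a minimal sketch, the hypothetical CountingMeta below counts how many instances each class has created.

```python
class CountingMeta(type):
    """Hypothetical metaclass that counts instances per class."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Each class gets its own counter when it is created
        cls.instances_created = 0

    def __call__(cls, *args, **kwargs):
        # super().__call__ runs the normal __new__ / __init__ sequence
        instance = super().__call__(*args, **kwargs)
        cls.instances_created += 1
        return instance


class Widget(metaclass=CountingMeta):
    def __init__(self, label):
        self.label = label


class Gadget(metaclass=CountingMeta):
    pass


Widget("a")
Widget("b")

print(Widget.instances_created)  # 2
print(Gadget.instances_created)  # 0
```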

Practical Metaclass Examples

Example 1: Enforcing Class Attributes

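class RequiredAttributesMeta(type):
    """Ensure classes define required attributes"""
    required_attrs = []  # Default when the class hierarchy declares none
    
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        # Skip check for base classes
        if bases:
            # Look up required_attrs on the new class so the list declared
            # on a base class such as Plugin is found through inheritance
            # (mcs.required_attrs would always be the empty default above)
            for attr in cls.required_attrs:
                if attr not in namespace:
                    raise TypeError(
                        f"Class {name} must define {attr}"
                    )
        return cls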

class Plugin(metaclass=RequiredAttributesMeta):
    required_attrs = ['name', 'version', 'execute']

class MyPlugin(Plugin):
    name = "My Plugin"
    version = "1.0"
    
    def execute(self):
        pass

# This would raise TypeError:
# class BadPlugin(Plugin):
#     name = "Bad Plugin"
#     # Missing version and execute
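
For this kind of subclass check, __init_subclass__ (Python 3.6+) is often a simpler alternative that needs no metaclass at all. The sketch below reuses the Plugin idea; the class names and the error message format are assumptions made for the illustration.

```python
class Plugin:
    """Base class that validates subclasses without a custom metaclass."""
    required_attrs = ("name", "version", "execute")

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # vars(cls) holds only what this subclass defines directly
        missing = [attr for attr in cls.required_attrs if attr not in vars(cls)]
        if missing:
            raise TypeError(
                f"Class {cls.__name__} must define {', '.join(missing)}"
            )


class GoodPlugin(Plugin):
    name = "Good Plugin"
    version = "1.0"

    def execute(self):
        return "ran"


error_message = None
try:
    class BadPlugin(Plugin):
        name = "Bad Plugin"
        # Missing version and execute
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # Class BadPlugin must define version, execute
```

The hook runs once per subclass at class-creation time, so the error surfaces at import rather than at first use, just as with the metaclass version.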

Example 2: Automatic Method Registration

class RegistryMeta(type):
    """Automatically register methods with specific decorator"""
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        cls._registry = {}
        
        for attr_name, attr_value in namespace.items():
            if hasattr(attr_value, '_registered'):
                cls._registry[attr_value._registered] = attr_value
        
        return cls

def register(key):
    """Decorator to register a method"""
    def decorator(func):
        func._registered = key
        return func
    return decorator

class CommandHandler(metaclass=RegistryMeta):
    @register('help')
    def handle_help(self):
        return "Help command"
    
    @register('status')
    def handle_status(self):
        return "Status command"

# Usage
handler = CommandHandler()
print(CommandHandler._registry)  # {'help': <function>, 'status': <function>}

Example 3: Automatic Property Creation

class AutoPropertyMeta(type):
    """Automatically create properties for private attributes"""
    def __new__(mcs, name, bases, namespace):
        # Find all private attributes and create properties
        new_namespace = {}
        
        for attr_name, attr_value in namespace.items():
            new_namespace[attr_name] = attr_value
            
            # Create property for private attributes
            if attr_name.startswith('_') and not attr_name.startswith('__'):
                public_name = attr_name[1:]  # Remove underscore
                
                # Create getter
                def make_getter(private_name):
                    def getter(self):
                        return getattr(self, private_name)
                    return getter
                
                # Create setter
                def make_setter(private_name):
                    def setter(self, value):
                        setattr(self, private_name, value)
                    return setter
                
                new_namespace[public_name] = property(
                    make_getter(attr_name),
                    make_setter(attr_name)
                )
        
        return super().__new__(mcs, name, bases, new_namespace)

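class Person(metaclass=AutoPropertyMeta):
    # Class-level declarations are required: the metaclass only sees names
    # defined in the class body, not attributes assigned inside __init__
    _name = None
    _age = None
    
    def __init__(self, name, age):
        self._name = name
        self._age = age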

# Usage
person = Person("Alice", 30)
print(person.name)  # Alice (via property)
person.age = 31     # Via property setter

Example 4: Validation Metaclass

class ValidatedMeta(type):
    """Validate class attributes against type hints"""
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        
        # Store type hints for validation
        cls._validators = {}
        if hasattr(cls, '__annotations__'):
            cls._validators = cls.__annotations__.copy()
        
        return cls
    
    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        
        # Validate attributes
        for attr_name, expected_type in cls._validators.items():
            if hasattr(instance, attr_name):
                value = getattr(instance, attr_name)
                if not isinstance(value, expected_type):
                    raise TypeError(
                        f"{attr_name} must be {expected_type.__name__}, "
                        f"got {type(value).__name__}"
                    )
        
        return instance

class ValidatedClass(metaclass=ValidatedMeta):
    name: str
    age: int
    
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Usage
obj = ValidatedClass("Alice", 30)  # Valid
obj = ValidatedClass("Bob", "thirty")  # Raises TypeError

Example 5: Singleton Pattern

class SingletonMeta(type):
    """Metaclass for singleton pattern"""
    _instances = {}
    
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Logger(metaclass=SingletonMeta):
    def __init__(self):
        self.logs = []
    
    def log(self, message):
        self.logs.append(message)

# Usage
logger1 = Logger()
logger2 = Logger()
print(logger1 is logger2)  # True

logger1.log("Message 1")
print(logger2.logs)  # ["Message 1"]
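
One consequence of this pattern is worth seeing in code: once the instance exists, later constructor arguments are silently ignored, because __init__ never runs again. A minimal sketch with a hypothetical Config class:

```python
class SingletonMeta(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Config(metaclass=SingletonMeta):
    def __init__(self, path):
        self.path = path


first = Config("settings.toml")
second = Config("other.toml")   # arguments are ignored on the second call

print(first is second)  # True
print(second.path)      # settings.toml
```

If differing arguments should be an error rather than a silent no-op, the metaclass's __call__ is the place to detect and reject them.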

Real-World Metaclass Application: ORM Model

Here’s a simplified sketch of the idea behind Django’s model metaclass, which collects field declarations when a model class is created:

class ModelMeta(type):
    """Metaclass for ORM models"""
    def __new__(mcs, name, bases, namespace):
        # Collect fields
        fields = {}
        for key, value in list(namespace.items()):
            if isinstance(value, Field):
                fields[key] = value
                value.name = key
        
        namespace['_fields'] = fields
        cls = super().__new__(mcs, name, bases, namespace)
        return cls

class Field:
    def __init__(self, field_type, required=True):
        self.field_type = field_type
        self.required = required
        self.name = None
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)
    
    def __set__(self, obj, value):
        if self.required and value is None:
            raise ValueError(f"{self.name} is required")
        if value is not None and not isinstance(value, self.field_type):
            raise TypeError(f"{self.name} must be {self.field_type.__name__}")
        obj.__dict__[self.name] = value

class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)
    
    def save(self):
        print(f"Saving {self.__class__.__name__}")
        for field_name, field in self._fields.items():
            value = getattr(self, field_name)
            print(f"  {field_name}: {value}")

class User(Model):
    username = Field(str, required=True)
    email = Field(str, required=True)
    age = Field(int, required=False)

# Usage
user = User(username="alice", email="[email protected]", age=30)
user.save()
# Output:
# Saving User
#   username: alice
#   email: [email protected]
#   age: 30

Combining Descriptors and Metaclasses

The most powerful applications combine both concepts:

class ValidatedField:
    """Descriptor for validated fields"""
    def __init__(self, field_type, required=True):
        self.field_type = field_type
        self.required = required
        self.name = None
    
    def __set_name__(self, owner, name):
        self.name = name
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)
    
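    def __set__(self, obj, value):
        if self.required and value is None:
            raise ValueError(f"{self.name} is required")
        if value is not None and not isinstance(value, self.field_type):
            # field_type may be a tuple such as (int, float), which has
            # no __name__, so build the label explicitly
            types = self.field_type
            if not isinstance(types, tuple):
                types = (types,)
            expected = " or ".join(t.__name__ for t in types)
            raise TypeError(
                f"{self.name} must be {expected}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value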

class ModelMeta(type):
    """Metaclass that collects fields and provides introspection"""
    def __new__(mcs, name, bases, namespace):
        fields = {}
        for key, value in namespace.items():
            if isinstance(value, ValidatedField):
                fields[key] = value
        
        namespace['_fields'] = fields
        cls = super().__new__(mcs, name, bases, namespace)
        return cls

class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)
    
    def to_dict(self):
        """Convert model to dictionary"""
        return {
            name: getattr(self, name)
            for name in self._fields
        }
    
    def validate(self):
        """Validate all fields"""
        for name, field in self._fields.items():
            value = getattr(self, name)
            if field.required and value is None:
                raise ValueError(f"{name} is required")

class Product(Model):
    name = ValidatedField(str, required=True)
    price = ValidatedField((int, float), required=True)
    description = ValidatedField(str, required=False)

# Usage
product = Product(name="Laptop", price=999.99, description="High-end laptop")
print(product.to_dict())
# Output: {'name': 'Laptop', 'price': 999.99, 'description': 'High-end laptop'}

product.validate()  # Passes
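
One limitation of collecting fields from the class body alone is that a subclass's _fields omits fields declared on its parents. The sketch below shows one way to merge them. InheritingModelMeta is a hypothetical variant, and the descriptor is a stripped-down stand-in with validation omitted to keep the example short.

```python
class ValidatedField:
    """Minimal stand-in for the descriptor above (validation omitted)."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value


class InheritingModelMeta(type):
    """Hypothetical variant of ModelMeta that also collects inherited fields."""

    def __new__(mcs, name, bases, namespace):
        fields = {}
        # Visit bases right to left so the leftmost base wins on name clashes
        for base in reversed(bases):
            fields.update(getattr(base, "_fields", {}))
        # Fields declared in this class body override inherited ones
        fields.update(
            {k: v for k, v in namespace.items() if isinstance(v, ValidatedField)}
        )
        namespace["_fields"] = fields
        return super().__new__(mcs, name, bases, namespace)


class Model(metaclass=InheritingModelMeta):
    pass


class Product(Model):
    name = ValidatedField()
    price = ValidatedField()


class DigitalProduct(Product):
    download_url = ValidatedField()


print(list(Product._fields))         # ['name', 'price']
print(list(DigitalProduct._fields))  # ['name', 'price', 'download_url']
```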

When to Use Descriptors and Metaclasses

Use Descriptors When:

  • You need to intercept attribute access
  • You want to validate data on assignment
  • You need computed properties
  • You’re implementing lazy loading
  • You’re building an ORM or validation framework

Use Metaclasses When:

  • You need to customize class creation
  • You want to enforce class structure
  • You need to register classes automatically
  • You’re implementing design patterns like Singleton
  • You’re building frameworks that need introspection

Avoid When:

  • A simple property would suffice
  • You’re trying to be clever without clear benefit
  • The code becomes harder to understand
  • There’s a simpler alternative

Common Pitfalls

Pitfall 1: Overcomplicating with Metaclasses

# ❌ Wrong: Using metaclass when not needed
class Meta(type):
    def __new__(mcs, name, bases, namespace):
        # Unnecessary complexity
        return super().__new__(mcs, name, bases, namespace)

class MyClass(metaclass=Meta):
    pass

# ✓ Correct: Use only when necessary
class MyClass:
    pass

Pitfall 2: Forgetting __set_name__

# ❌ Wrong: Descriptor doesn't know its name
class Descriptor:
    def __get__(self, obj, objtype=None):
        return obj.__dict__.get('???')  # What's the name?

# ✓ Correct: Use __set_name__
class Descriptor:
    def __set_name__(self, owner, name):
        self.name = name
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # Accessed on the class, not an instance
        return obj.__dict__.get(self.name)
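
A related trap: __set_name__ is only called automatically for descriptors present in the class body when the class is created. A descriptor attached afterwards has to be named by hand. The Named and Config classes below are hypothetical and exist only to show the difference.

```python
class Named:
    def __init__(self):
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value


class Config:
    host = Named()          # __set_name__ runs during class creation


print(Config.host.name)     # host

# Attached after the class exists: __set_name__ is NOT called automatically
Config.port = Named()
print(Config.port.name)     # None

# Call the hook manually to finish the setup
Config.port.__set_name__(Config, "port")
print(Config.port.name)     # port

cfg = Config()
cfg.port = 8080
print(cfg.port)             # 8080
```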

Pitfall 3: Metaclass Conflicts

# ❌ Wrong: Conflicting metaclasses
class Meta1(type):
    pass

class Meta2(type):
    pass

class Base(metaclass=Meta1):
    pass

# This raises TypeError: metaclass conflict
# class Derived(Base, metaclass=Meta2):
#     pass

# ✓ Correct: Create a unified metaclass
class UnifiedMeta(Meta1, Meta2):
    pass

class Derived(Base, metaclass=UnifiedMeta):
    pass
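
In practice this conflict most often shows up when mixing a custom metaclass with abstract base classes, since those use abc.ABCMeta. The sketch below triggers the error and then resolves it the same way. RegistryMeta and CombinedMeta are hypothetical names used only for the illustration.

```python
from abc import ABCMeta


class RegistryMeta(type):
    """Hypothetical custom metaclass used only for this illustration."""


class Base(metaclass=RegistryMeta):
    pass


class Interface(metaclass=ABCMeta):
    pass


conflict_error = None
try:
    # RegistryMeta and ABCMeta are unrelated, so Python cannot pick one
    class Broken(Base, Interface):
        pass
except TypeError as exc:
    conflict_error = exc

print(type(conflict_error).__name__)  # TypeError


# A metaclass deriving from both resolves the conflict
class CombinedMeta(RegistryMeta, ABCMeta):
    pass


class Working(Base, Interface, metaclass=CombinedMeta):
    pass


print(type(Working).__name__)  # CombinedMeta
```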

Conclusion

Descriptors and metaclasses are powerful tools that enable Python’s most sophisticated frameworks:

Descriptors control attribute access through the descriptor protocol. They’re used for validation, lazy loading, computed properties, and implementing framework features like ORM fields.

Metaclasses customize class creation and behavior. They’re used for enforcing class structure, automatic registration, design patterns, and framework introspection.

Understanding these concepts transforms your ability to:

  • Read and understand framework code
  • Build your own abstractions and frameworks
  • Debug complex attribute access issues
  • Implement sophisticated design patterns

Start with descriptors: they’re more commonly needed and easier to understand. Use metaclasses only when you need to customize class creation itself. Remember: with great power comes great responsibility. Use these features judiciously, and always prioritize code clarity over cleverness.
