
The Art of Python Descriptors: How Django ORM Actually Works Under the Hood

February 13, 2026
2,150 words
11 min read
·python·advanced-python·django·design-patterns

Most Python developers use descriptors every single day without knowing it. Every time you slap a @property on a method, every time you call a method on an object, every time Django magically turns user.email into a database query -- descriptors are doing the work. They are the invisible plumbing underneath almost everything interesting in the language.

I had one of those moments a few years back where I was debugging a Django model and stepped into the source code of CharField. I expected to find some complicated metaclass sorcery. Instead, I found a handful of objects implementing __get__. That's it. The entire ORM field system is built on a protocol that fits on an index card. Once I understood it, half of Python's "magic" stopped being magic.

Let me show you what I mean.

The Protocol Behind Every @property

Here's the deal. A descriptor is any object that defines at least one of these methods:

__get__(self, obj, objtype=None)   # called on attribute access
__set__(self, obj, value)          # called on attribute assignment
__delete__(self, obj)              # called on attribute deletion

That's the whole descriptor protocol. Three methods. If your object has any of them and lives as a class attribute, Python will call those methods instead of doing normal attribute lookup. This is not some obscure corner of the language. This is the mechanism that makes @property, @staticmethod, @classmethod, and even regular method calls work.
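Before touching property, it helps to watch the protocol fire. Here's a throwaway data descriptor -- Verbose and Account are made-up names for illustration, not library code:

```python
class Verbose:
    """Tiny data descriptor: intercepts both reads and writes."""

    def __set_name__(self, owner, name):
        self.name = name  # Python tells us which class attribute we were bound to

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        print(f"__get__ called for {self.name!r}")
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        print(f"__set__ called for {self.name!r}")
        obj.__dict__[self.name] = value


class Account:
    balance = Verbose()  # must live on the class, not the instance


acct = Account()
acct.balance = 100   # routed through Verbose.__set__
print(acct.balance)  # routed through Verbose.__get__, prints 100
```

Assigning Verbose() to an instance attribute instead of a class attribute would do nothing special -- the protocol only kicks in for class attributes.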

Let's start with the thing everyone already uses. When you write:

class Circle:
    def __init__(self, radius):
        self._radius = radius

    @property
    def area(self):
        return 3.14159 * self._radius ** 2

The @property decorator creates a descriptor object. It's not doing anything clever. Here's what property looks like if you implement it from scratch in pure Python:

class Property:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __set_name__(self, owner, name):
        self.__name__ = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)

That's essentially the CPython implementation translated to Python. The __get__ method just calls self.fget(obj) -- i.e., it calls the function you decorated with @property, passing the instance as the first argument. The __set__ method calls self.fset(obj, value). No magic, just indirection.

The subtle thing is if obj is None: return self. When you access a descriptor through the class (Circle.area instead of circle.area), obj is None, and the descriptor returns itself. This is why Circle.area gives you the property object and circle.area gives you the computed value.

One more hidden gem: __set_name__. Added in Python 3.6, this gets called automatically when the class is created, telling the descriptor what name it was assigned to. Before this existed, you had to pass the name manually or do metaclass tricks. Now the descriptor just knows.
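Both behaviors are easy to verify in a REPL. This snippet redefines the Circle class from above so it runs standalone, plus a minimal made-up Named descriptor to show __set_name__ in action:

```python
class Circle:
    def __init__(self, radius):
        self._radius = radius

    @property
    def area(self):
        return 3.14159 * self._radius ** 2


print(type(Circle.area))   # class access: obj was None, we got the property object back
print(Circle(2).area)      # instance access: obj was the instance, we got 12.56636


class Named:
    def __set_name__(self, owner, name):
        self.name = name   # called once, at class creation time


class Holder:
    x = Named()            # __set_name__ fires here with name='x'


print(Holder.x.name)       # 'x' -- the descriptor learned its own name
```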

Data vs Non-Data Descriptors: The Lookup Chain

This is where things get genuinely interesting, and where most people's understanding breaks down.

Python splits descriptors into two categories. A data descriptor defines __set__ or __delete__ (in addition to __get__). A non-data descriptor only defines __get__. The difference determines where your descriptor sits in Python's attribute lookup chain.

Here's the actual priority order when you access obj.name:

  1. Data descriptors on the class (highest priority)
  2. Instance __dict__ entries
  3. Non-data descriptors on the class
  4. Plain class variables
  5. __getattr__() if defined (lowest priority)

This is not hand-waving. It is essentially what object.__getattribute__ does -- here's the pure-Python equivalent, adapted from the official Descriptor HowTo guide:

def object_getattribute(obj, name):
    null = object()
    objtype = type(obj)
    cls_var = find_name_in_mro(objtype, name, null)
    descr_get = getattr(type(cls_var), '__get__', null)

    if descr_get is not null:
        if (hasattr(type(cls_var), '__set__')
            or hasattr(type(cls_var), '__delete__')):
            # Data descriptor -- highest priority
            return descr_get(cls_var, obj, objtype)

    if hasattr(obj, '__dict__') and name in vars(obj):
        # Instance dictionary -- second priority
        return vars(obj)[name]

    if descr_get is not null:
        # Non-data descriptor -- third priority
        return descr_get(cls_var, obj, objtype)

    if cls_var is not null:
        return cls_var

    raise AttributeError(name)

Read that carefully. Data descriptors win over the instance __dict__. Non-data descriptors lose to it. This single distinction is the key to understanding why certain patterns work.

Here's a concrete example. Functions in Python are non-data descriptors -- they define __get__ but not __set__. That means you can shadow a method by assigning to the instance dict:

class Dog:
    def speak(self):
        return "woof"

rex = Dog()
rex.speak = lambda: "meow"  # shadows the method
rex.speak()  # "meow" -- instance dict wins over non-data descriptor

But @property creates a data descriptor (it has __set__), so you cannot shadow it:

class Dog:
    @property
    def speak(self):
        return "woof"

rex = Dog()
rex.speak = "meow"  # AttributeError! Data descriptor wins.

This is not arbitrary. It's a deeply intentional design. Data descriptors enforce controlled access. Non-data descriptors allow caching. And this distinction is exactly what makes Django's ORM fields possible.

Reverse-Engineering Django ORM Fields

Let's look at how Django uses this. When you write a Django model:

from django.db import models

class Book(models.Model):
    title = models.CharField(max_length=255)
    pages = models.IntegerField()

What actually happens during class creation? Django's metaclass (ModelBase) iterates over the class attributes. For each one that has a contribute_to_class method -- which all Field instances do -- it calls that method. Inside contribute_to_class, the Field stores itself in the model's _meta options object and then installs a DeferredAttribute descriptor on the class.

So after class creation, Book.title is no longer a CharField instance. It's a DeferredAttribute -- a descriptor. The original CharField is tucked away in Book._meta.fields for metadata access.
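That class-creation dance is easier to see in miniature. The sketch below imitates the mechanism with made-up names (MiniModelBase, MiniField, MiniDeferredAttribute) -- it is not Django's code, just the same shape:

```python
class MiniDeferredAttribute:
    """Non-data descriptor standing in for the field on the class."""

    def __init__(self, field):
        self.field = field

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        return instance.__dict__[self.field.name]


class MiniField:
    def contribute_to_class(self, cls, name):
        self.name = name
        cls._meta_fields[name] = self                     # metadata tucked away
        setattr(cls, name, MiniDeferredAttribute(self))   # descriptor replaces the field


class MiniModelBase(type):
    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, ns)
        cls._meta_fields = {}
        for attr, value in list(ns.items()):
            if hasattr(value, 'contribute_to_class'):     # fields opt in via this hook
                value.contribute_to_class(cls, attr)
        return cls


class Book(metaclass=MiniModelBase):
    title = MiniField()


print(type(Book.title).__name__)     # MiniDeferredAttribute, not MiniField
print(Book._meta_fields['title'])    # the original MiniField lives in the metadata dict

b = Book()
b.title = "Dune"                     # plain write to __dict__ -- no __set__ in the way
print(b.title)                       # Dune -- instance dict wins over the non-data descriptor
```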

Here's the actual DeferredAttribute from Django's source:

class DeferredAttribute:
    def __init__(self, field):
        self.field = field

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        data = instance.__dict__
        field_name = self.field.attname
        if field_name not in data:
            val = self._check_parent_chain(instance)
            if val is None:
                instance.refresh_from_db(fields=[field_name])
            else:
                data[field_name] = val
        return data[field_name]

Notice what's happening here. DeferredAttribute only implements __get__. It is a non-data descriptor. This is the critical design decision. Because it's a non-data descriptor, instance dictionary entries take priority over it.

When Django creates a Book instance from a query result, it writes the field values directly into instance.__dict__. Now when you access book.title, Python checks: is there a data descriptor? No, DeferredAttribute only has __get__. Is there an instance dict entry? Yes. Return that. The descriptor's __get__ is never even called. Zero overhead for normal attribute access.

But when a field is deferred (via .defer() or .only()), Django skips putting it in __dict__. Now when you access the attribute, the instance dict lookup misses, Python falls through to the non-data descriptor, and DeferredAttribute.__get__ fires. It calls refresh_from_db(), fetches the value, stores it in __dict__ for next time, and returns it.

This is beautiful engineering. By choosing a non-data descriptor, Django gets:

  • Zero overhead on normal access -- the descriptor is bypassed entirely
  • Lazy loading for deferred fields -- the descriptor catches the miss
  • Simple assignment -- book.title = "New Title" just writes to __dict__, no __set__ needed
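The whole eager-vs-deferred flow fits in a short simulation. LazyAttribute and load_from_db below are hypothetical stand-ins for DeferredAttribute and the database round-trip, not Django APIs:

```python
class LazyAttribute:
    """Non-data descriptor: only fires when the instance dict misses."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        value = instance.load_from_db(self.name)   # simulate the DB round-trip
        instance.__dict__[self.name] = value       # cache so we never fire again
        return value


class Row:
    title = LazyAttribute()

    def __init__(self, **preloaded):
        self.__dict__.update(preloaded)   # like Django filling __dict__ from a query row
        self.db_hits = 0

    def load_from_db(self, name):
        self.db_hits += 1
        return f"loaded:{name}"


eager = Row(title="in memory")
print(eager.title, eager.db_hits)        # in memory 0 -- descriptor never fired

deferred = Row()                          # title left out, as .defer() would do
print(deferred.title, deferred.db_hits)  # loaded:title 1 -- descriptor fired once
print(deferred.title, deferred.db_hits)  # still 1 -- now cached in __dict__
```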

Building Your Own ORM Field from Scratch

Let's put this all together and build a mini ORM field system. This is the best way to internalize descriptors -- build something real with them.

class Field:
    """Base field descriptor that handles type coercion and validation."""

    def __init__(self, field_type, default=None, required=True):
        self.field_type = field_type
        self.default = default
        self.required = required
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name
        if not hasattr(owner, '_fields'):
            owner._fields = {}
        owner._fields[name] = self

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        val = obj.__dict__.get(self.name)
        if val is None and self.name not in obj.__dict__:
            if self.default is not None:
                return self.default() if callable(self.default) else self.default
            if self.required:
                raise AttributeError(
                    f"Field '{self.name}' has no value and no default"
                )
            return None
        return val

    def __set__(self, obj, value):
        if value is not None:
            try:
                value = self.field_type(value)
            except (TypeError, ValueError) as e:
                raise TypeError(
                    f"Cannot assign {type(value).__name__} to "
                    f"{self.field_type.__name__} field '{self.name}': {e}"
                )
        elif self.required:
            raise ValueError(f"Field '{self.name}' cannot be None")
        obj.__dict__[self.name] = value


class CharField(Field):
    def __init__(self, max_length=255, **kwargs):
        super().__init__(str, **kwargs)
        self.max_length = max_length

    def __set__(self, obj, value):
        # Validate before storing, so a failed write leaves the old value intact
        if value is not None and len(str(value)) > self.max_length:
            raise ValueError(
                f"'{self.name}' exceeds max_length of {self.max_length}"
            )
        super().__set__(obj, value)


class IntegerField(Field):
    def __init__(self, min_value=None, max_value=None, **kwargs):
        super().__init__(int, **kwargs)
        self.min_value = min_value
        self.max_value = max_value

    def __set__(self, obj, value):
        # Coerce and validate before storing, so a failed write leaves the old value intact
        if value is not None:
            value = self.field_type(value)
            if self.min_value is not None and value < self.min_value:
                raise ValueError(f"'{self.name}' must be >= {self.min_value}")
            if self.max_value is not None and value > self.max_value:
                raise ValueError(f"'{self.name}' must be <= {self.max_value}")
        super().__set__(obj, value)


class Model:
    def __init__(self, **kwargs):
        for name, field in self.__class__._fields.items():
            if name in kwargs:
                setattr(self, name, kwargs[name])

    def __repr__(self):
        fields = ', '.join(
            f'{name}={getattr(self, name, "<?>")}'
            for name in self.__class__._fields
        )
        return f"{self.__class__.__name__}({fields})"


# --- Usage ---
class Book(Model):
    title = CharField(max_length=100)
    pages = IntegerField(min_value=1, max_value=10000)
    rating = IntegerField(min_value=1, max_value=5, required=False, default=None)


book = Book(title="Attention Is All You Need", pages=15)
print(book)  # Book(title=Attention Is All You Need, pages=15, rating=None)

book.pages = 16       # works fine
book.pages = "256"    # coerced to int via int("256")

try:
    book.title = "A" * 200  # ValueError: 'title' exceeds max_length of 100
except ValueError as e:
    print(e)

Notice the design decisions. Our Field is a data descriptor (has both __get__ and __set__). This is different from Django's DeferredAttribute, and it's a deliberate choice: we want validation on every write, so we need __set__ to intercept assignments. Django prioritizes performance for the common case and uses a non-data descriptor because it doesn't need write interception for basic fields.

Descriptor Patterns You Should Know

Beyond the basics, there are a few descriptor patterns that come up repeatedly in production Python code.

Lazy/Cached Properties. This is the classic non-data descriptor trick. Compute a value once, then store it in the instance dict so the descriptor is never called again:

class cached_property:
    """Non-data descriptor that replaces itself with computed value."""

    def __init__(self, func):
        self.func = func
        self.attrname = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        val = self.func(obj)
        obj.__dict__[self.attrname] = val  # shadows the descriptor
        return val


class Dataset:
    def __init__(self, path):
        self.path = path

    @cached_property
    def data(self):
        print("Loading dataset...")  # only prints once
        with open(self.path) as f:   # close the file instead of leaking it
            return f.read()

Python 3.8 added functools.cached_property which does exactly this. The trick works because a non-data descriptor loses to instance dict entries. After the first access, the computed value sits in __dict__ and the descriptor is bypassed forever. Elegantly lazy.

Type-Checked Attributes. Reusable validation that keeps your model classes clean:

class Typed:
    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"Expected {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value


class Config:
    host = Typed(str)
    port = Typed(int)
    debug = Typed(bool)

This is the same basic idea that validation libraries like attrs and pydantic elaborate on, with many more features bolted on top.

When Descriptors Beat Alternatives

So when should you actually reach for a custom descriptor?

Use descriptors when you need reusable attribute logic across multiple classes. If you find yourself writing the same @property getter/setter pattern on five different classes, that's a descriptor waiting to happen. Write it once, use it as a class variable everywhere.

Use descriptors when you need to intercept attribute access transparently. The beauty of descriptors is that client code looks exactly like normal attribute access. book.title works whether there's a descriptor, a dict entry, or a class variable behind it. The protocol is invisible to callers.

Use non-data descriptors for caching and lazy evaluation. The instance dict shadowing trick is one of the cleanest patterns in Python. No cache invalidation logic, no wrapper overhead after first access.

Don't use descriptors for simple validation on a single class. If you just need to validate one attribute on one class, @property is fine. Descriptors shine when the logic is reusable.

The descriptor protocol is one of those things that, once you see it, you start seeing everywhere. Functions becoming bound methods? Descriptor. @staticmethod skipping self? Descriptor. @classmethod passing the class? Descriptor. SQLAlchemy column access? Descriptor. Django model fields? Descriptor. Python's entire attribute access system is built on this three-method protocol.
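The method-binding case is easy to check by calling the protocol by hand -- plain functions really do implement __get__:

```python
def greet(self):
    return f"hi, {self.name}"


class Person:
    pass


p = Person()
p.name = "Ada"

# This is exactly what attribute lookup does when a function lives on a class:
bound = greet.__get__(p, Person)   # function.__get__ produces a bound method
print(bound())                     # hi, Ada
print(type(bound).__name__)        # method
```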

It's a small API with enormous leverage, and that's my favorite kind of abstraction -- the kind where a simple contract produces complex, useful behavior. The same reason attention mechanisms are interesting, really. A small operation, applied systematically, gives you something much greater than the sum of its parts.

Finis


© 2026 Umutcan Edizaslan. All rights reserved.
