Most Python developers use Python descriptors every single day without knowing it. Every time you slap a @property on a method, every time you call a method on an object, every time Django magically turns user.email into a database query -- descriptors are doing the work. They are the invisible plumbing underneath almost everything interesting in the language.
I had one of those moments a few years back where I was debugging a Django model and stepped into the source code of CharField. I expected to find some complicated metaclass sorcery. Instead, I found a handful of objects implementing __get__. That's it. The entire ORM field system is built on a protocol that fits on an index card. Once I understood it, half of Python's "magic" stopped being magic.
Let me show you what I mean.
The Protocol Behind Every @property
Here's the deal. A descriptor is any object that defines at least one of these methods:
__get__(self, obj, objtype=None)    # called on attribute access
__set__(self, obj, value)           # called on attribute assignment
__delete__(self, obj)               # called on attribute deletion
That's the whole descriptor protocol. Three methods. If your object has any of them and lives as a class attribute, Python will call those methods instead of doing normal attribute lookup. This is not some obscure corner of the language. This is the mechanism that makes @property, @staticmethod, @classmethod, and even regular method calls work.
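To watch the protocol fire, here's a toy descriptor (a made-up Verbose class, purely illustrative) that announces every access and assignment:

```python
class Verbose:
    """Toy data descriptor that logs attribute access and assignment."""

    def __set_name__(self, owner, name):
        self.name = name  # Python tells us the attribute name at class creation

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        print(f"__get__ called for {self.name!r}")
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        print(f"__set__ called for {self.name!r}")
        obj.__dict__[self.name] = value


class Point:
    x = Verbose()   # lives as a class attribute -- that part is essential


p = Point()
p.x = 10   # prints: __set__ called for 'x'
p.x        # prints: __get__ called for 'x'
```

Note that the descriptor must sit on the class, not the instance; Python only consults the descriptor protocol during class-level lookup.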
Let's start with the thing everyone already uses. When you write:
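Say, a Circle whose area is computed on demand (a minimal sketch; the same Circle turns up again later in this section):

```python
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def area(self):
        """Computed on every access -- no stored value."""
        return math.pi * self.radius ** 2
```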
The @property decorator creates a descriptor object. It's not doing anything clever. Here's what property looks like if you implement it from scratch in pure Python:
class Property:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __set_name__(self, owner, name):
        self.__name__ = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)

    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)
That's essentially the CPython implementation translated to Python. The __get__ method just calls self.fget(obj) -- i.e., it calls the function you decorated with @property, passing the instance as the first argument. The __set__ method calls self.fset(obj, value). No magic, just indirection.
The subtle thing is if obj is None: return self. When you access a descriptor through the class (Circle.area instead of circle.area), obj is None, and the descriptor returns itself. This is why Circle.area gives you the property object and circle.area gives you the computed value.
One more hidden gem: __set_name__. Added in Python 3.6, this gets called automatically when the class is created, telling the descriptor what name it was assigned to. Before this existed, you had to pass the name manually or do metaclass tricks. Now the descriptor just knows.
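A quick way to watch __set_name__ fire (a throwaway Named descriptor, purely illustrative):

```python
class Named:
    """Toy descriptor that learns its own attribute name via __set_name__."""

    def __set_name__(self, owner, name):
        print(f"Bound to {owner.__name__}.{name}")  # fires when the class object is created
        self.name = name

    def __get__(self, obj, objtype=None):
        return self if obj is None else obj.__dict__.get(self.name)


class Settings:
    timeout = Named()   # prints "Bound to Settings.timeout" at class creation
```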
Data vs Non-Data Descriptors: The Lookup Chain
This is where things get genuinely interesting, and where most people's understanding breaks down.
Python splits descriptors into two categories. A data descriptor defines __set__ or __delete__ (usually alongside __get__). A non-data descriptor defines only __get__. The difference determines where your descriptor sits in Python's attribute lookup chain.
Here's the actual priority order when you access obj.name:
Data descriptors on the class (highest priority)
Instance __dict__ entries
Non-data descriptors on the class
Plain class variables
__getattr__() if defined (lowest priority)
This is not an abstraction. This is literally what object.__getattribute__ does. Here's the pure Python equivalent:
def find_name_in_mro(objtype, name, default):
    # Scan the MRO for a class-level entry (emulates _PyType_Lookup)
    for base in objtype.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return default

def object_getattribute(obj, name):
    null = object()
    objtype = type(obj)
    cls_var = find_name_in_mro(objtype, name, null)
    descr_get = getattr(type(cls_var), '__get__', null)
    if descr_get is not null:
        if (hasattr(type(cls_var), '__set__')
                or hasattr(type(cls_var), '__delete__')):
            # Data descriptor -- highest priority
            return descr_get(cls_var, obj, objtype)
    if hasattr(obj, '__dict__') and name in vars(obj):
        # Instance dictionary -- second priority
        return vars(obj)[name]
    if descr_get is not null:
        # Non-data descriptor -- third priority
        return descr_get(cls_var, obj, objtype)
    if cls_var is not null:
        # Plain class variable -- fourth priority
        return cls_var
    raise AttributeError(name)
Read that carefully. Data descriptors win over the instance __dict__. Non-data descriptors lose to it. This single distinction is the key to understanding why certain patterns work.
Here's a concrete example. Functions in Python are non-data descriptors -- they define __get__ but not __set__. That means you can shadow a method by assigning to the instance dict:
class Dog:
    def speak(self):
        return "woof"

rex = Dog()
rex.speak = lambda: "meow"  # shadows the method
rex.speak()                 # "meow" -- instance dict wins over non-data descriptor
But @property creates a data descriptor (it has __set__), so you cannot shadow it:
class Dog:
    @property
    def speak(self):
        return "woof"

rex = Dog()
rex.speak = "meow"  # AttributeError! Data descriptor wins.
This is not arbitrary. It's a deeply intentional design. Data descriptors enforce controlled access. Non-data descriptors allow caching. And this distinction is exactly what makes Django's ORM fields possible.
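You can verify the priority order directly: smuggle a value into the instance dict with vars(), bypassing __set__ entirely, and lookup still hits the data descriptor first.

```python
class Dog:
    @property
    def speak(self):
        return "woof"


rex = Dog()
vars(rex)['speak'] = 'meow'  # write straight into the instance dict, no __set__
print(rex.speak)             # "woof" -- the data descriptor still wins the lookup
print(vars(rex)['speak'])    # "meow" is in there, it's just never consulted
```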
Reverse-Engineering Django ORM Fields
Let's look at how Django uses this. When you write a Django model:
from django.db import models

class Book(models.Model):
    title = models.CharField(max_length=255)
    pages = models.IntegerField()
What actually happens during class creation? Django's metaclass (ModelBase) iterates over the class attributes. For each one that has a contribute_to_class method -- which all Field instances do -- it calls that method. Inside contribute_to_class, the Field stores itself in the model's _meta options object and then installs a DeferredAttribute descriptor on the class.
So after class creation, Book.title is no longer a CharField instance. It's a DeferredAttribute -- a descriptor. The original CharField is tucked away in Book._meta.fields for metadata access.
Here's the actual DeferredAttribute from Django's source:
class DeferredAttribute:
    def __init__(self, field):
        self.field = field

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        data = instance.__dict__
        field_name = self.field.attname
        if field_name not in data:
            val = self._check_parent_chain(instance)
            if val is None:
                instance.refresh_from_db(fields=[field_name])
            else:
                data[field_name] = val
        return data[field_name]
Notice what's happening here. DeferredAttribute only implements __get__. It is a non-data descriptor. This is the critical design decision. Because it's a non-data descriptor, instance dictionary entries take priority over it.
When Django creates a Book instance from a query result, it writes the field values directly into instance.__dict__. Now when you access book.title, Python checks: is there a data descriptor? No, DeferredAttribute only has __get__. Is there an instance dict entry? Yes. Return that. The descriptor's __get__ is never even called. Zero overhead for normal attribute access.
But when a field is deferred (via .defer() or .only()), Django skips putting it in __dict__. Now when you access the attribute, the instance dict lookup misses, Python falls through to the non-data descriptor, and DeferredAttribute.__get__ fires. It calls refresh_from_db(), fetches the value, stores it in __dict__ for next time, and returns it.
This is beautiful engineering. By choosing a non-data descriptor, Django gets:
Zero overhead on normal access -- the descriptor is bypassed entirely
Lazy loading for deferred fields -- the descriptor catches the miss
Simple assignment -- book.title = "New Title" just writes to __dict__, no __set__ needed
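The same pattern works without Django. Here's a minimal sketch -- a hypothetical LazyAttribute (my name, not Django's) that loads on the first miss and then steps out of the way:

```python
class LazyAttribute:
    """Non-data descriptor: consulted only when the name is absent from __dict__."""

    def __init__(self, loader):
        self.loader = loader

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Cache miss: load once, park the value in the instance dict.
        value = obj.__dict__[self.name] = self.loader(obj)
        return value


calls = []

class Row:
    payload = LazyAttribute(lambda self: calls.append('load') or 'heavy data')


eager = Row()
eager.__dict__['payload'] = 'preloaded'  # like a normal (non-deferred) field
eager.payload                            # instance dict hit: loader never runs

deferred = Row()
deferred.payload   # miss -> loader runs once, result cached in __dict__
deferred.payload   # dict hit: descriptor bypassed from now on
# calls == ['load']: exactly one load, and only for the deferred row
```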
Building Your Own ORM Field from Scratch
Let's put this all together and build a mini ORM field system. This is the best way to internalize descriptors -- build something real with them.
class Field:
    """Base field descriptor that handles type coercion and validation."""

    def __init__(self, field_type, default=None, required=True):
        self.field_type = field_type
        self.default = default
        self.required = required
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name
        if '_fields' not in vars(owner):  # don't mutate an inherited registry
            owner._fields = {}
        owner._fields[name] = self

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.name not in obj.__dict__:
            if self.default is not None:
                return self.default() if callable(self.default) else self.default
            if self.required:
                raise AttributeError(
                    f"Field '{self.name}' has no value and no default"
                )
            return None
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        if value is not None:
            try:
                value = self.field_type(value)
            except (TypeError, ValueError) as e:
                raise TypeError(
                    f"Cannot assign {type(value).__name__} to "
                    f"{self.field_type.__name__} field '{self.name}': {e}"
                )
            self.validate(value)  # subclass hook: after coercion, before storage
        elif self.required:
            raise ValueError(f"Field '{self.name}' cannot be None")
        obj.__dict__[self.name] = value

    def validate(self, value):
        """Override in subclasses to reject a value before it is stored."""


class CharField(Field):
    def __init__(self, max_length=255, **kwargs):
        super().__init__(str, **kwargs)
        self.max_length = max_length

    def validate(self, value):
        if len(value) > self.max_length:
            raise ValueError(
                f"'{self.name}' exceeds max_length of {self.max_length}"
            )


class IntegerField(Field):
    def __init__(self, min_value=None, max_value=None, **kwargs):
        super().__init__(int, **kwargs)
        self.min_value = min_value
        self.max_value = max_value

    def validate(self, value):
        if self.min_value is not None and value < self.min_value:
            raise ValueError(f"'{self.name}' must be >= {self.min_value}")
        if self.max_value is not None and value > self.max_value:
            raise ValueError(f"'{self.name}' must be <= {self.max_value}")


class Model:
    def __init__(self, **kwargs):
        for name, field in self.__class__._fields.items():
            if name in kwargs:
                setattr(self, name, kwargs[name])

    def __repr__(self):
        fields = ', '.join(
            f'{name}={getattr(self, name, "<?>")}'
            for name in self.__class__._fields
        )
        return f"{self.__class__.__name__}({fields})"
# --- Usage ---
class Book(Model):
    title = CharField(max_length=100)
    pages = IntegerField(min_value=1, max_value=10000)
    rating = IntegerField(min_value=1, max_value=5, required=False, default=None)

book = Book(title="Attention Is All You Need", pages=15)
print(book)         # Book(title=Attention Is All You Need, pages=15, rating=None)

book.pages = 16     # works fine
book.pages = "256"  # coerced to int via int("256")

try:
    book.title = "A" * 200  # ValueError: 'title' exceeds max_length of 100
except ValueError as e:
    print(e)
Notice the design decisions. Our Field is a data descriptor (has both __get__ and __set__). This is different from Django's DeferredAttribute, and it's a deliberate choice: we want validation on every write, so we need __set__ to intercept assignments. Django prioritizes performance for the common case and uses a non-data descriptor because it doesn't need write interception for basic fields.
Descriptor Patterns You Should Know
Beyond the basics, there are a few descriptor patterns that come up repeatedly in production Python code.
Lazy/Cached Properties. This is the classic non-data descriptor trick. Compute a value once, then store it in the instance dict so the descriptor is never called again:
class cached_property:
    """Non-data descriptor that replaces itself with the computed value."""

    def __init__(self, func):
        self.func = func
        self.attrname = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        val = self.func(obj)
        obj.__dict__[self.attrname] = val  # shadows the descriptor
        return val


class Dataset:
    def __init__(self, path):
        self.path = path

    @cached_property
    def data(self):
        print("Loading dataset...")  # only prints once
        return open(self.path).read()
Python 3.8 added functools.cached_property, which does essentially this. The trick works because a non-data descriptor loses to instance dict entries. After the first access, the computed value sits in __dict__ and the descriptor is bypassed forever. Elegantly lazy.
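The stdlib version drops in the same way. A quick sketch using functools.cached_property:

```python
from functools import cached_property

class Dataset:
    def __init__(self, values):
        self.values = values

    @cached_property
    def total(self):
        print("computing...")   # runs once per instance
        return sum(self.values)


ds = Dataset([1, 2, 3])
print(ds.total)   # prints "computing..." then 6
print(ds.total)   # cached: just 6, no recompute
del ds.total      # clearing the cache is a plain attribute deletion
```

Because cached_property has no __delete__, the del falls through to the instance dict, which is exactly where the cached value lives.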
Type-Checked Attributes. Reusable validation that keeps your model classes clean:
class Typed:
    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"Expected {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value


class Config:
    host = Typed(str)
    port = Typed(int)
    debug = Typed(bool)
Libraries like attrs and pydantic arrive at this kind of per-attribute validation with more machinery and far more features, but the descriptor version above captures the essential idea.
When Descriptors Beat Alternatives
So when should you actually reach for a custom descriptor?
Use descriptors when you need reusable attribute logic across multiple classes. If you find yourself writing the same @property getter/setter pattern on five different classes, that's a descriptor waiting to happen. Write it once, use it as a class variable everywhere.
Use descriptors when you need to intercept attribute access transparently. The beauty of descriptors is that client code looks exactly like normal attribute access. book.title works whether there's a descriptor, a dict entry, or a class variable behind it. The protocol is invisible to callers.
Use non-data descriptors for caching and lazy evaluation. The instance dict shadowing trick is one of the cleanest patterns in Python. No cache invalidation logic, no wrapper overhead after first access.
Don't use descriptors for simple validation on a single class. If you just need to validate one attribute on one class, @property is fine. Descriptors shine when the logic is reusable.
The descriptor protocol is one of those things that, once you see it, you start seeing everywhere. Functions becoming bound methods? Descriptor. @staticmethod skipping self? Descriptor. @classmethod passing the class? Descriptor. SQLAlchemy column access? Descriptor. Django model fields? Descriptor. Python's entire attribute access system is built on this three-method protocol.
It's a small API with enormous leverage, and that's my favorite kind of abstraction -- the kind where a simple contract produces complex, useful behavior. The same reason attention mechanisms are interesting, really. A small operation, applied systematically, gives you something much greater than the sum of its parts.