We started our project with raw SQL queries across several microservices, executed directly with psycopg. They were grouped by topic and well organized for the team to easily understand and collaborate.
The intention was to keep the codebase lean and under control.
But as the project progressed and we built more services, we noticed that almost 50% of our queries were duplicated across services.
That's when we decided to abstract the database layer into an ORM.
We had a few principles we followed for good coding hygiene:
We were using serverless infra for the cloud, and fast boot time was an important requirement.
Our servers were hosted in Fly.io machines that promise millisecond boot-up time from 0 to 1. If our app was also fast enough to boot up, we could have a really cost-efficient cloud architecture.
This motivated us to keep our codebase clean and lean and write only what's necessary. We've preferred custom code over external plugins so that we keep only what we need.
SQLAlchemy 2.x is ~50+ dependencies deep (it's massive).
It can be tempting to write a very robust, advanced solution using the latest libraries out there, which looks cool at first. But as the project grows, the codebase can become overwhelming and hard to understand, especially for new recruits.
So the principle was to write the bare minimum that covers the core use case, addresses the edge cases, and is easily testable.
Bringing in an ORM can contradict this principle.
For example, compare this advanced relationship mapping written with SQLAlchemy vs raw SQL:
```python
# SQLAlchemy - Complex relationship mapping
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    company_id = Column(Integer, ForeignKey('companies.id'))
    company = relationship("Company", back_populates="users")
    posts = relationship("Post", back_populates="author",
                         lazy="dynamic", cascade="all, delete-orphan")

# Query with eager loading
users = session.query(User)\
    .options(joinedload(User.company))\
    .options(selectinload(User.posts))\
    .filter(User.company_id == company_id)\
    .all()
```
```sql
-- Raw SQL - Same query, crystal clear
SELECT u.*, c.company_name
FROM users u
LEFT JOIN companies c ON u.company_id = c.id
WHERE u.company_id = 'comp_123';

-- Fetch posts separately if needed
SELECT * FROM posts WHERE user_id = ANY($1);
```
Any backend developer can understand the raw SQL in under two minutes, whereas the SQLAlchemy code takes real cognitive effort and time.
To prevent this, we could just ban the use of complex joins in the codebase. But that would contradict our first point.
To keep the data contract between services and the user, we've extensively used Pydantic, which offers really decent IDE support for type checking.
So clean interoperability between the table models and Pydantic models was important.
SQLAlchemy didn't natively support Pydantic, and the workaround felt hacky.
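For context, the standard Pydantic v2 workaround is to maintain a parallel schema with `from_attributes=True` and validate ORM rows into it. Here's a minimal sketch of that pattern (the model names are illustrative; a plain class stands in for a SQLAlchemy mapped object, since `from_attributes` only needs attribute access):

```python
from pydantic import BaseModel, ConfigDict

# The parallel Pydantic schema you end up maintaining alongside the ORM model
class UserSchema(BaseModel):
    model_config = ConfigDict(from_attributes=True)  # was orm_mode in Pydantic v1
    user_id: int
    email: str

# Stand-in for a SQLAlchemy mapped object; any attribute-bearing object works
class UserRow:
    def __init__(self, user_id: int, email: str):
        self.user_id = user_id
        self.email = email

schema = UserSchema.model_validate(UserRow(1, "a@example.com"))
```

Every table ends up defined twice, once for the ORM and once for Pydantic, which is exactly the duplication that felt off.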
So we tried an alternative called SQLModel, which is a wrapper around SQLAlchemy with better Pydantic support.
It seemed like the best available choice for us, and we used it for a while until we hit these walls:
- `with Session(engine) as session:` boilerplate for every query
- `Optional[int]` primary keys that confuse type checkers

There were other alternatives out there, like TortoiseORM and Peewee.
In a hyper-fast shipping environment, these decisions can consume a lot of time and energy that doesn't move the business forward (at least for the stakeholders), and frequent migrations would be a nightmare to carry out while shipping features. So we couldn't risk another experiment with a different package.
And the next best move in front of us was to ditch everyone and take full control.
And that was the best decision we made.
I researched the SQLAlchemy codebase to understand how these systems are built. Surprisingly, it's pretty straightforward until you bring in complex query relationships.
The entire ORM lives in a single sub-1,000-line file. Here's how it works:
Instead of fighting with Pydantic compatibility, we embraced it. Each column is defined using Pydantic's Field() with custom metadata:
```python
class User(BaseTableModel):
    user_id: str = Column(primary_key=True)
    email: str = Column(unique=True, index=True)
    company_id: str = Column(
        foreign_key_table="company",
        foreign_key_column="company_id",
        on_delete=OnDeleteFkEnum.CASCADE
    )
    created_at: datetime = Column(
        is_timezone_aware=True,
        db_default="CURRENT_TIMESTAMP"
    )
```
The Column() function is just a wrapper around Pydantic's Field() that attaches a ColumnMetadata object. This gives us plain, fully typed Pydantic models, with no `Optional[int]` primary-key confusion.

We use Python's type system to generate PostgreSQL DDL:
```python
@classmethod
def generate_ddl_query(cls) -> str:
    """Introspect Pydantic fields to build CREATE TABLE"""
    for name, field_type in cls.__annotations__.items():
        metadata = get_column_metadata(name)
        db_type = cls.get_db_type(field_type, metadata)
        # Build: user_id TEXT PRIMARY KEY NOT NULL
        # Or:    tags TEXT[]      (for List[str])
        # Or:    metadata JSONB   (for Dict)
        ...
```
The magic: get_db_type() handles Python → PostgreSQL mapping:
- `List[str]` → `TEXT[]`
- `List[Dict]` → `JSONB`
- `datetime` → `TIMESTAMP` or `TIMESTAMPTZ` (based on metadata)

One class definition produces the entire schema. No separate migration files.
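A stdlib-only sketch of what that mapping can look like. The function name comes from the article; the exact dispatch logic here is my guess:

```python
from datetime import datetime
from typing import Any, Dict, List, get_args, get_origin

def get_db_type(annotation: Any, timezone_aware: bool = False) -> str:
    """Map a Python type annotation to a PostgreSQL column type."""
    scalars = {str: "TEXT", int: "INTEGER", float: "DOUBLE PRECISION", bool: "BOOLEAN"}
    if annotation in scalars:
        return scalars[annotation]
    if annotation is datetime:
        return "TIMESTAMPTZ" if timezone_aware else "TIMESTAMP"
    origin = get_origin(annotation)
    if origin is dict or annotation is dict:
        return "JSONB"
    if origin is list:
        (inner,) = get_args(annotation)
        # List[str] -> TEXT[]; List[Dict] and other nested shapes -> JSONB
        return "TEXT[]" if inner is str else "JSONB"
    raise TypeError(f"No PostgreSQL mapping for {annotation!r}")
```

typing.get_origin() and typing.get_args() do all the heavy lifting, so no third-party introspection is needed.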
For reads, we generate parameterized SQL and map results back to Pydantic models:
```python
users = User.select_many(
    and_condition_columns=["company_id", "is_active"],
    and_condition_value=["comp_123", True],
    order_by_columns=["created_at"],
    limit=10
)
# Returns: List[User] - fully typed, validated Pydantic objects
```
Internally this:

- builds `SELECT * FROM user WHERE company_id = %s AND is_active = %s ORDER BY created_at LIMIT 10`
- runs it with the bound parameters and maps each row back via `User(**dict(row))` - Pydantic handles validation

For writes, models update themselves:
```python
user = User.select_one(
    and_condition_columns=["user_id"],
    and_condition_value=["u_123"]
)
user.email = "new@example.com"
user.update()  # Generates: UPDATE user SET email = %s WHERE user_id = %s
```
No session management. No `with Session(engine) as session:` ceremony. Just call the method.
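To make the "SQL string builder" point concrete, here's a hedged sketch of the kind of builders select_many() and update() can sit on top of. The parameter names follow the article's API; the internals are assumptions:

```python
from typing import Optional, Sequence, Tuple

def build_select_query(
    table: str,
    and_condition_columns: Sequence[str],
    and_condition_value: Sequence[object],
    order_by_columns: Optional[Sequence[str]] = None,
    limit: Optional[int] = None,
) -> Tuple[str, list]:
    """Build a parameterized SELECT in psycopg's %s placeholder style."""
    sql = f"SELECT * FROM {table}"
    if and_condition_columns:
        sql += " WHERE " + " AND ".join(f"{c} = %s" for c in and_condition_columns)
    if order_by_columns:
        sql += " ORDER BY " + ", ".join(order_by_columns)
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql, list(and_condition_value)

def build_update_query(table: str, pk_column: str, changed: dict) -> Tuple[str, list]:
    """Build the UPDATE a model.update() call would emit for changed fields."""
    sets = ", ".join(f"{c} = %s" for c in changed if c != pk_column)
    params = [v for c, v in changed.items() if c != pk_column] + [changed[pk_column]]
    return f"UPDATE {table} SET {sets} WHERE {pk_column} = %s", params
```

Everything else is handing the (sql, params) pair to psycopg and feeding rows back into Pydantic.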
For complex WHERE clauses, there's a custom_condition_query escape hatch. You can see the full implementation here - it's surprisingly readable, because there's no magic, just straightforward Python introspection and string building.
Building our own ORM taught us that ORMs aren't magic - they're just SQL string builders + object mapping. Most apps don't need advanced relationship mapping or dialect abstraction.
If you're in a similar situation, ask yourself: "Do I need an ORM's complexity, or just its convenience?"
Sometimes the answer is neither. Sometimes 900 lines of focused code beats 50 dependencies.