Introduction

Usually with SQLAlchemy ORM, you define your model classes and let SQLAlchemy create the tables for you, either directly or through a migration tool such as Alembic. The attributes of your model class are used to create the table and its columns with appropriate data types. But what if the tables already exist and you want to use SQLAlchemy ORM? I ran into this situation: the tables had been created by a Django application, and another application needed to read data from them for analysis. I didn’t want to add Django as a dependency to my analysis package just to use the Django ORM.

Solution

After a bit of research I found two approaches:

  1. Define a model class with appropriate columns for each table

    This approach works but is a bit tedious. I had already implemented the ORM mapping in Django and didn’t want to do it again in SQLAlchemy. Also, if the table structure changes, I have to remember to update it in both places. But it has its benefits. First, we get intellisense while developing. Second, we don’t need to define every column if we only work with a subset of them. Note that if you plan to write to the database, then you’ll need to define all non-nullable columns. A minimal sketch of this approach appears right after this list.

  2. Let SQLAlchemy create the classes automatically by inspecting the tables

    The second option is to let SQLAlchemy figure out the table structure automatically. An extension called automap can generate the mapped classes and relationships for you. The downside of this approach is that we don’t get any intellisense on class attributes during development. That didn’t matter much for me because I only needed to work with a few tables whose structure I knew well, but working with many tables with many columns could be tedious and error prone. The second code block below shows how to generate the model classes automatically.
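
For completeness, here is a minimal sketch of the first approach: a hand-written model class mapped onto an existing table. The column names and types here are assumptions for illustration only; they have to match whatever the Django application actually created.

import sqlalchemy
from sqlalchemy import Column, Integer, Numeric, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

from config import POSTGRES_URI

Base = declarative_base()

class Product(Base):
    # name of the existing table created by Django
    __tablename__ = 'products'

    # hypothetical columns -- define only the ones you need for reading,
    # but every non-nullable column if you intend to insert rows
    id = Column(Integer, primary_key=True)
    title = Column(String, nullable=False)
    price = Column(Numeric, nullable=False)

engine = sqlalchemy.create_engine(POSTGRES_URI, pool_pre_ping=True)
session = sessionmaker(bind=engine)()

expensive = session.query(Product).filter(Product.price > 100).all()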

import sqlalchemy
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import sessionmaker

from config import POSTGRES_URI

engine = sqlalchemy.create_engine(POSTGRES_URI, pool_pre_ping=True)
Session = sessionmaker(bind=engine)

# these two lines perform the "database reflection" to analyze tables and relationships
# (on SQLAlchemy 2.0+, use Base.prepare(autoload_with=engine) instead of reflect=True)
Base = automap_base()
Base.prepare(engine, reflect=True)

# there are many tables in the database but I only want `products` and
# `categories`, so I can leave the others out
Product = Base.classes.products
Category = Base.classes.categories

# for debugging and passing the query results around
# I usually add an as_dict method on the classes
def as_dict(obj):
    # copy __dict__ so we don't mutate the instance's own state,
    # then drop SQLAlchemy's internal bookkeeping attribute
    data = dict(obj.__dict__)
    data.pop('_sa_instance_state', None)
    return data

# attach the `as_dict` function to the classes as a method
for c in [Product, Category]:
    c.as_dict = as_dict

Now we can create a session and use the mapped classes Product and Category just like any other SQLAlchemy model classes.

session = Session()

# query full Product entities so that the as_dict() method is available on each result
q = session.query(Product)
q = q.filter(Product.price > 100)

# e.g. return this as a REST API response
products = [r.as_dict() for r in q]
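
Automap also reflects foreign keys into relationship attributes, as mentioned above. Assuming, purely for illustration, that products has a foreign-key column pointing at categories, automap's default naming convention gives each Product a scalar categories attribute and each Category a products_collection attribute; the exact names depend on your schema and can be customized.

# assuming a products.category_id -> categories.id foreign key exists;
# the attribute names below follow automap's default naming convention
category = session.query(Category).first()
for product in category.products_collection:
    print(product.title)

product = session.query(Product).first()
print(product.categories.as_dict())  # the related Category row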

For more information about the automap extension, see the Automap documentation.
