Intro
Recursive query is a powerful feature that allows us to query hierarchical data. Let’s consider an example to store product taxonomy for an e-commerce database. Let’s assume we have a table called categories
that contains columns id, name, level, parent_id
, where level
indicates the depth of this cateogry and parent_id
is a reference to another entry in categories
table that is a parent of this entry (self-referencing).
Below is an example data.
id | name | level | parent_id |
---|---|---|---|
128 | Clothing And Accessories | 0 | |
89 | Boys | 1 | 128 |
94 | Girls | 1 | 128 |
70 | Men | 1 | 128 |
60 | Women | 1 | 128 |
103 | Boys Clothing | 2 | 89 |
59 | Men Clothing | 2 | 70 |
114 | Men Footwear | 2 | 70 |
119 | Women Footwear | 2 | 60 |
337 | Boys Clothing Accessories | 2 | 89 |
100 | Boys Footwear | 2 | 89 |
376 | Girls Clothing Accessories | 2 | 94 |
399 | Girls Footwear | 2 | 94 |
We have Clothing And Accessories as our top-level category which has no parent i.e parent_id
is NULL
and its level
is 0.
Let’s say we wanted to find out all categories under Clothing and Accessories. Getting direct descendents is easy. We just filter on parent_id
1
SELECT * FROM categories WHERE parent_id = 128
But we want to get everything under the given category. Without recursive query we would have to write this logic in our code but thankfully we can do it using SQL query only.
Recursive SQL
Let’s look at the query to fetch all children
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
with recursive cte as (
select
id, name, level, parent_id
from
categories c
where
c.id = 128
-- initial selector
union
select
c2.id, c2.name, c2.level, c2.parent_id
from
categories c2
join cte on
c2.parent_id = cte.id
-- recursive part
)
select
*
from
cte;
A recursive query consists of two parts. First part where we select the initial rows(s) and the second part where we select more rows based on the rows from the upper part. Also note that the with
statement has recursive
keyword as well.
In the first part we selected a category with id 128 i.e. our Clothing and Accessories category. For sake of explanation, let’s “assign” this result to a temporary table called cte
.
Now in the second part, select rows form categories table by joining with cte
where categories in cte
table is parent of these categories. This filter is given by the join condition c2.parent_id = cte.id
. This join will match the categories [Boys, Girls, Men, Women]
. Again for simplicity, let’s assume all these 4 categories are accumulated in cte
table.
Upto now we have our “main” category as well as direct children of this category. Since this is a recursive query, it is not going to end just now. It will again join the categories
table with cte
. This time, the query is basically saying give me categories which are children of categories in cte
table i.e. children of [Boys, Girls, Men, Women]
. This process is repeated until no results are returned.
Recursive SQLAlchemy
Writing this query in SQLAlchemy is also pretty straightforward. We first define a CTE with recursive=True
for the top portion of the query. We then define the bottom part of the query by joining it with the top part and finally applying a union
function.
1
2
3
4
5
6
7
8
9
topq = sess.query(Category)
topq = topq.filter(Category.id == 128)
topq = topq.cte('cte', recursive=True)
bottomq = sess.query(Category)
bottomq = bottomq.join(topq, Category.parent_id == topq.c.id)
recursive_q = topq.union(bottomq)
q = sess.query(recursive_q)
I hope this post helped you in understanding recursive query. Please share this if you enjoyed it.
Comments