Intro

Recursive query is a powerful feature that allows us to query hierarchical data. Let’s consider an example to store product taxonomy for an e-commerce database. Let’s assume we have a table called categories that contains columns id, name, level, parent_id, where level indicates the depth of this cateogry and parent_id is a reference to another entry in categories table that is a parent of this entry (self-referencing).

Below is an example data.

id name level parent_id
128 Clothing And Accessories 0  
89 Boys 1 128
94 Girls 1 128
70 Men 1 128
60 Women 1 128
103 Boys Clothing 2 89
59 Men Clothing 2 70
114 Men Footwear 2 70
119 Women Footwear 2 60
337 Boys Clothing Accessories 2 89
100 Boys Footwear 2 89
376 Girls Clothing Accessories 2 94
399 Girls Footwear 2 94

We have Clothing And Accessories as our top-level category which has no parent i.e parent_id is NULL and its level is 0.

Let’s say we wanted to find out all categories under Clothing and Accessories. Getting direct descendents is easy. We just filter on parent_id

1
SELECT * FROM categories WHERE parent_id = 128

But we want to get everything under the given category. Without recursive query we would have to write this logic in our code but thankfully we can do it using SQL query only.

Recursive SQL

Let’s look at the query to fetch all children

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
with recursive cte as (
select
	id, name, level, parent_id
from
	categories c
where
	c.id = 128
	-- initial selector
union
select
	c2.id, c2.name, c2.level, c2.parent_id
from
	categories c2
join cte on
	c2.parent_id = cte.id
	-- recursive part
)
select
	*
from
	cte;

A recursive query consists of two parts. First part where we select the initial rows(s) and the second part where we select more rows based on the rows from the upper part. Also note that the with statement has recursive keyword as well.

In the first part we selected a category with id 128 i.e. our Clothing and Accessories category. For sake of explanation, let’s “assign” this result to a temporary table called cte.

Now in the second part, select rows form categories table by joining with cte where categories in cte table is parent of these categories. This filter is given by the join condition c2.parent_id = cte.id. This join will match the categories [Boys, Girls, Men, Women]. Again for simplicity, let’s assume all these 4 categories are accumulated in cte table.

Upto now we have our “main” category as well as direct children of this category. Since this is a recursive query, it is not going to end just now. It will again join the categories table with cte. This time, the query is basically saying give me categories which are children of categories in cte table i.e. children of [Boys, Girls, Men, Women]. This process is repeated until no results are returned.

Recursive SQLAlchemy

Writing this query in SQLAlchemy is also pretty straightforward. We first define a CTE with recursive=True for the top portion of the query. We then define the bottom part of the query by joining it with the top part and finally applying a union function.

1
2
3
4
5
6
7
8
9
topq = sess.query(Category)
topq = topq.filter(Category.id == 128)
topq = topq.cte('cte', recursive=True)

bottomq = sess.query(Category)
bottomq = bottomq.join(topq, Category.parent_id == topq.c.id)

recursive_q = topq.union(bottomq)
q = sess.query(recursive_q)

I hope this post helped you in understanding recursive query. Please share this if you enjoyed it.

Comments