We begin with a description of recursive CTEs, a SQL pattern that will be used afterwards.

**What is a CTE ?**

A CTE (Common table expression) is a view local to a query : a substitute for a view when the general use of a view is not required. There is no definition stored in metadata.

The structure of CTE is :

with name_of_CTE as ( Body of CTE ) SQL statement using the CTE (called the outside query)

An example with the classical table employees mentioning salary and department : in the body of CTE, we select the highest salary of the department and in the calling query, we select all the columns of those employees.

with find_max_salary(max_salary) as ( select department, max(salary) as max_salary from employees group by department ) select e.* from employees e inner join find_max_salary f on e.department = f.department where e.salary = f.max_salary;

**What is recursion ?**

Recursion is a method where the solution to a problem depends on solutions to smaller instances of the same problem.

An example :

Factorial(n) = 1 if n = 1 = n * Factorial(n - 1) if n > 1

In the definition of Factorial, we find Factorial again but with a smaller argument. The first value (for n = 1) is called the base case.

Let’s apply the definition for n = 4 :

Factorial(4) = 4 * Factorial(3) = 4 * 3 * Factorial(2) = 4 * 3 * 2 * Factorial(1) (Base case) = 4 * 3 * 2 * 1 = 24

**What is a recursive CTE ?**

A recursive CTE is one in which an initial CTE is repeatedly executed to return subsets of data until the complete result set is obtained.

-- -- structure of recursive CTE -- with recursive t(n) as ( select 1 -- non-recursive term : the base case union all select n + 1 from t -- recursive term where t < x -- halting condition ) select n from t;

Here some examples :

-- -- a counter -- with recursive source (counter) as ( select 1 union all select counter + 1 from source where counter < 5 ) select * from source;

gives

counter ------- 1 2 3 4 5

-- -- sum of integers until 64 -- with recursive sigma(n) as ( values 1 union all select n + 1 from sigma where n < 64 ) select sum(n) from sigma;

gives

sigma ----- 2080

-- -- factorials until 10 -- with recursive factorial(counter, product) AS ( select 1, 1 union all select counter + 1, product * (counter + 1) from factorial where counter < 10 ) select counter, product from factorial;

gives

counter product ------- ------- 1 1 2 2 3 6 4 24 5 120 6 720 7 5040 8 40320 9 362880 10 3628800

**The tree, a recursive data structure**

A tree t consists of a value v and a list of other trees :

t[Root] : v [t[1], ..., t[k]] where t[i] can be empty

We want to implement the following tree giving the logical connection between some concepts of relational databases :

Root Relational theory First-order predicate logic Syntax Rules of inference Deductive systems Semantics Three-valued logic Normal forms Functional dependencies Multivalued dependencies 1NF 2NF 3NF BCNF 4NF 5NF DKNF SQL Types and domains PKs & FKs Relations Operators Restriction Projection Joins Cross join Inner join Left outer join Right outer join Full outer join Union Intersection Difference Divide Aggregations Constraints Views

We create a table topics with a reference to itself in order to allow recursion. This implementation is called the Adjacency List Model. We will explore other models like the Nested Sets model, the Nested Intervals Model and the Materialized Path Model like the PostgreSQL extension ltree in other posts.

id_topic_up is a foreign key to id_topic. It means that the topic having this id_topic_up is a subtopic of the topic having this id_topic.

drop table if exists topics; create table topics ( id_topic integer not null primary key ,id_topic_up integer references topics (id_topic) ,topic_name varchar(32) ,check(id_topic != id_topic_up) -- avoid cycle ); insert into topics values ( 1, NULL, 'Root'); -- subtopics of root insert into topics values ( 2, 1, 'Relational theory'); insert into topics values ( 3, 1, 'SQL'); -- -- level 2 -- -- subtopics of Relational theory insert into topics values ( 4, 2, 'First-order predicate logic'); insert into topics values ( 5, 2, 'Three-valued logic'); insert into topics values ( 6, 2, 'Normal forms'); -- subtopics of SQL insert into topics values ( 7, 3, 'Types and domains'); insert into topics values ( 8, 3, 'PKs & FKs'); insert into topics values ( 9, 3, 'Relations'); insert into topics values (10, 3, 'Operators'); insert into topics values (11, 3, 'Constraints'); insert into topics values (12, 3, 'Views'); -- -- level 3 -- -- subtopics of First-order predicate logic insert into topics values (13, 4, 'Syntax'); insert into topics values (14, 4, 'Rules of inference'); insert into topics values (15, 4, 'Deductive systems'); insert into topics values (16, 4, 'Semantics'); -- subtopics of Normal forms insert into topics values (17, 6, 'Functional dependencies'); insert into topics values (18, 6, 'Multivalued dependencies'); insert into topics values (19, 6, '1NF'); insert into topics values (20, 6, '2NF'); insert into topics values (21, 6, '3NF'); insert into topics values (22, 6, 'BCNF'); insert into topics values (23, 6, '4NF'); insert into topics values (24, 6, '5NF'); insert into topics values (25, 6, 'DKNF'); -- subtopics of Operators insert into topics values (26, 10, 'Restriction'); insert into topics values (27, 10, 'Projection'); insert into topics values (28, 10, 'Joins'); insert into topics values (29, 10, 'Union'); insert into topics values (30, 10, 'Intersection'); insert into topics values (31, 10, 'Difference'); insert into topics values (32, 10, 'Divide'); insert into topics values (33, 10, 'Aggregations'); -- -- level 4 -- -- subtopics of of Joins insert into topics values (34, 28, 'Cross join'); insert into topics values (35, 28, 'Inner join'); insert into topics values (36, 28, 'Left outer join'); insert into topics values (37, 28, 'Right outer join'); insert into topics values (38, 28, 'Full outer join');

Now, some recursive queries on this tree. As usual in recursive CTEs, the first SELECT of the body gives the base case and the second SELECT populates the result set.

The following query implements a breadth-first search :

-- -- give the list breadth-first /-separated -- with recursive tree(data, id_topic, level, full_path) as ( select topic_name ,id_topic ,0 ,'' from topics where id_topic_up is null union all select topic_name ,topics.id_topic ,tree.level + 1 ,tree.full_path || ' / ' || topics.topic_name from topics inner join tree on tree.id_topic = topics.id_topic_up ) select full_path from tree where level != 0 order by level, id_topic;

The base case begins with level = 0 and an empty full path of topics. The recursive term (the select after union all) increments the level and adds a topic_name in the full path of topics. The result of this query is :

/ Relational theory / SQL / Relational theory / First-order predicate logic / Relational theory / Three-valued logic / Relational theory / Normal forms / SQL / Types and domains / SQL / PKs & FKs / SQL / Relations / SQL / Operators / SQL / Constraints / SQL / Views / Relational theory / First-order predicate logic / Syntax / Relational theory / First-order predicate logic / Rules of inference / Relational theory / First-order predicate logic / Deductive systems / Relational theory / First-order predicate logic / Semantics / Relational theory / Normal forms / Functional dependencies / Relational theory / Normal forms / Multivalued dependencies / Relational theory / Normal forms / 1NF / Relational theory / Normal forms / 2NF / Relational theory / Normal forms / 3NF / Relational theory / Normal forms / BCNF / Relational theory / Normal forms / 4NF / Relational theory / Normal forms / 5NF / Relational theory / Normal forms / DKNF / SQL / Operators / Restriction / SQL / Operators / Projection / SQL / Operators / Joins / SQL / Operators / Union / SQL / Operators / Intersection / SQL / Operators / Difference / SQL / Operators / Divide / SQL / Operators / Aggregations / SQL / Operators / Joins / Cross join / SQL / Operators / Joins / Inner join / SQL / Operators / Joins / Left outer join / SQL / Operators / Joins / Right outer join / SQL / Operators / Joins / Full outer join

**Generation of a Table of contents**

We need an implementation of the depth-first search. We have to modify slightly the breadth-first query by adding a vector of the descendants which will be used as a sort key. We initialize this vector with

```
ARRAY[]::INTEGER[]
```

and we recursively populate it with

`sort_key || topics.id_topic`

-- -- give the list depth-first /-separated -- with recursive tree(data, id_topic, level, full_path, sort_key) as ( select topic_name ,id_topic ,0 ,'' ,array[]::integer[] from topics where id_topic_up is null union all select topics.topic_name ,topics.id_topic ,level + 1 ,tree.full_path || ' / ' || topics.topic_name ,sort_key || topics.id_topic from tree inner join topics on topics.id_topic_up = tree.id_topic ) select level, repeat(' ', level - 1) || data from tree where level != 0 order by sort_key;

The sort_key has the following values :

{1,2} {1,2,4} {1,2,4,13} {1,2,4,14} {1,2,4,15} {1,2,4,16} {1,2,5} {1,2,6} {1,2,6,17} {1,2,6,18} {1,2,6,19} {1,2,6,20} {1,2,6,21} {1,2,6,22} {1,2,6,23} {1,2,6,24} {1,2,6,25} {1,3} {1,3,7} {1,3,8} {1,3,9} {1,3,10} {1,3,10,26} {1,3,10,27} {1,3,10,28} {1,3,10,28,34} {1,3,10,28,35} {1,3,10,28,36} {1,3,10,28,37} {1,3,10,28,38} {1,3,10,29} {1,3,10,30} {1,3,10,31} {1,3,10,32} {1,3,10,33}

The query gives :

Relational theory First-order predicate logic Syntax Rules of inference Deductive systems Semantics Three-valued logic Normal forms Functional dependencies Multivalued dependencies 1NF 2NF 3NF BCNF 4NF 5NF DKNF SQL Types and domains PKs & FKs Relations Operators Restriction Projection Joins Cross join Inner join Left outer join Right outer join Full outer join Union Intersection Difference Divide Aggregations Constraints Views

It will be nicer with hierarchical numbering of the topics.

-- -- generate Table of contents with hierarchical numbering of topics -- with recursive numbered_topics( id_topic, id_topic_up, topic_name, topics_sequence) as ( select id_topic ,id_topic_up ,topic_name ,row_number() over (partition by id_topic_up order by id_topic) as topics_sequence from topics ), hierarchy(id_topic, numbering) as ( select id_topic ,cast(topics_sequence as varchar(128)) from numbered_topics where id_topic_up = 1 union all select numbered_topics.id_topic ,cast(case when hierarchy.numbering = '' then(cast(numbered_topics.topics_sequence as varchar(128))) else(hierarchy.numbering || '.' || cast(numbered_topics.topics_sequence as varchar(128))) end as varchar(128)) from numbered_topics inner join hierarchy on numbered_topics.id_topic_up = hierarchy.id_topic ) select numbering , repeat(' ', length(numbering)) || topic_name from hierarchy join topics p, topics.id_topic = hierarchy.id_topic order by hierarchy.numbering;

We see in this query that we can write several CTEs in one WITH statement by separating them with a “,”.

The CTE numbered_topics, which is not recursive, gives the number of the topics belonging to the same category which is equal to row_number(), the position of a topic in the category, because we do a partition on a category order by a topic.

The CTE hierarchy does a breadth-first scan of the tree and populates the string numbering.

The outside select makes a join with the table topics to get the names of the topics.

We order the result of the select with this string numbering to produce a depth-first list of topics.

1 Relational theory 1.1 First-order predicate logic 1.1.1 Syntax 1.1.2 Rules of inference 1.1.3 Deductive systems 1.1.4 Semantics 1.2 Three-valued logic 1.3 Normal forms 1.3.1 Functional dependencies 1.3.2 Multivalued dependencies 1.3.3 1NF 1.3.4 2NF 1.3.5 3NF 1.3.6 BCNF 1.3.7 4NF 1.3.8 5NF 1.3.9 DKNF 2 SQL 2.1 Types and domains 2.2 PKs & FKs 2.3 Relations 2.4 Operators 2.4.1 Restriction 2.4.2 Projection 2.4.3 Joins 2.4.3.1 Cross join 2.4.3.2 Inner join 2.4.3.3 Left outer join 2.4.3.4 Right outer join 2.4.3.5 Full outer join 2.4.4 Union 2.4.5 Intersection 2.4.6 Difference 2.4.7 Divide 2.4.8 Aggregations 2.5 Constraints 2.6 Views

**Delete of a subtree**

As usual, we use a recursive CTE. We encapsulate this CTE in a function in order to call it with the ID of the root of the subtree. We use a vector of the ancestots that will be used for the delete in the outside query. We initialize this vector with

```
array[]::integer[]
```

and we recursively populate it with

```
tree.sequence_topics_id || topics.id_topic_up
```

create or replace function delete_subtree(subtree_root integer) returns void as $$ declare subtree_root alias for $1; begin with recursive tree as ( select -- initialization id_topic ,array[]::integer[] as sequence_topics_up from topics where id_topic_up is null union all select -- generation of the ancestors list topics.id_topic ,tree.sequence_topics_up || topics.id_topic_up from topics, tree where topics.id_topic_up = tree.id_topic ) delete from topics where id_topic in ( select id_topic from tree where subtree_root = any(tree.sequence_topics_up)) -- delete ancestors or id_topic = subtree_root; -- delete root of the subtree end; $$ language 'plpgsql';

and we call this function for the id of the topic ‘Joins’ for example :

select delete_subtree(28); select * from topics;

id_topic id_topic topic_name -------- -------- ---------- 1 0 Root 2 1 Relational theory 3 1 SQL 4 2 First-order predicate logic 5 2 Three-valued logic 6 2 Normal forms 7 3 Types and domains 8 3 PKs & FKs 9 3 Relations 10 3 Operators 11 3 Constraints 12 3 Views 13 4 Syntax 14 4 Rules of inference 15 4 Deductive systems 16 4 Semantics 17 6 Functional dependencies 18 6 Multivalued dependencies 19 6 1NF 20 6 2NF 21 6 3NF 22 6 BCNF 23 6 4NF 24 6 5NF 25 6 DKNF 26 10 Restriction 27 10 Projection 29 10 Union 30 10 Intersection 31 10 Difference 32 10 Divide 33 10 Aggregations

To add a new node or a new subtree is easy. The values of the PK do not have a semantic value. Only the PK/FK relations make sense.

**Some utilities to inspect the tree**

-- -- find leaves (topics without subtopics) -- select id_topic, topic_name from topics where id_topic not in (select id_topic_up from topics where id_topic_up is not null) order by id_topic;

id_topic topic_name -------- ---------- 5 Three-valued logic 7 Types and domains 8 PKs & FKs 9 Relations 11 Constraints 12 Views 13 Syntax 14 Rules of inference 15 Deductive systems 16 Semantics 17 Restriction 18 Projection 20 Union 21 Intersection 22 Difference 23 Divide 24 Aggregations 25 Operators 26 Functional dependencies 27 Multivalued dependencies 28 1NF 29 2NF 30 3NF 31 BCNF 32 4NF 33 5NF 34 DKNF 35 Cross join 36 Inner join 37 Left outer join 38 Right outer join 39 Full outer join

To find the nodes of the tree, the elements having children, we remove the NOT of the preceding query :

-- -- find nodes (topics having subtopics) -- select id_topic, topic_name from topics where id_topic in (select id_topic_up from topics where id_topic_up is not null) order by id_topic;

id_topic topic_name -------- ---------- 1 ALL 2 Relational theory 3 SQL 4 First-order predicate logic 6 Normal forms 10 Operators 19 Joins

-- -- isolated nodes ? -- select id_topic from topics where id_topic_up is null and id_topic not in ( select id_topic_up from topics where id_topic is not null);

returns no value.

It is just an exercise. If you want to build trees, it is better to use the ltree package.