Question

Recursive functions with Common Table Expressions

  • 12 February 2015
  • 1 reply
  • 751 views

Userlevel 3

Say your data is stored in a relational store using child to parent relationship (e.g. one row per unique pair of child and its direct parent). Imagine now that this is a generic N-ary Tree and it has indefinite number of levels. An example of such a tree can be observed in the category hierarchy of Foursquare.


Traversing this tree presents a problem. What if you need to match a child with its top-most parent? Or what if you need to match some category with all of its children (direct and non-direct)?


Below is a solution that returns the top parent for every category.


WITH RECURSIVE parent_category(parent_foursquare_id, parent_id, orig_foursquare_id ) AS (
SELECT
parent_foursquare_id
, foursquare_id AS parent_id
, foursquare_id AS orig_foursquare_id
FROM categories
UNION ALL
SELECT
-- next iteration's id
c.parent_foursquare_id
-- this iteration's id
, c.foursquare_id
-- carry forward the original category id to connect indirect parent/child nodes in one row
, pc.orig_foursquare_id
FROM
parent_category pc
, categories c
-- recursive step
WHERE c.foursquare_id = pc.parent_foursquare_id
)

SELECT
parent_id AS top_most_parent_id
, orig_foursquare_id AS category_id
FROM parent_category
WHERE
-- no more parents for this top_most_parent
-- without this line, category_id will be matched with all of its direct and indirect parents
parent_foursquare_id IS NULL
GROUP BY 1, 2

1 reply

Userlevel 3

Just a note on this. Many databases - not just postgres derived - will support this use of CTEs, and the recursive keyword is not always required. For example, this can be done with SqlServer too.


RedShift is currently an important exception. This does not work for RedShift.


Oracle can do similar things using the non-standard CONNECT BY … START WITH syntax.

Reply