Remix.run Logo
cryptonector 11 hours ago

SQL recursive queries are just loops. The CTE is a UNION or UNION ALL query (well, never use UNION ALL unless you really want to loop infinitely!) where one side is the query that seeds the CTE and the other is executed and its outputs added to the CTE, and the last part is repeated until the CTE stops growing. There's just one more detail: the "recursive" side of the UNION query needs to do a JOIN to the CTE itself (thus the "recursion"), but this doesn't change the loop nature of the thing.

That's it. It's just a simple loop.

(Yes, recursion is looping. But in this case it's very clear that a stack is not needed, that the recursion is "tail recursion" if know what that is.)

The hard part lies in getting the JOIN in the "recursive" side of the UNION query right. Here's a transitive closure query:

  WITH RECURSIVE closure AS (
    SELECT parent, child FROM relationships
    UNION
    SELECT c.parent AS ancestor, r.child AS descendant
    FROM closure c
    JOIN relationships r ON c.child = r.parent
  )
  SELECT * FROM closure;
Here `SELECT parent, child FROM relationships` is the seed, and the rest (`SELECT c.parent, r.child ...`) is the recursive part of the query that gets repeated until the whole set of results from the seed and all the repetitions of the recursive part stops growing. The recursive part simply says to add to the `closure` all the children of children of tuples in the closure, but with the parents from the closure as the parents. So if this were human parent/child relationships then initially you might have your parents' relationships to you, and also yours to your children, and this will find that your parents are ancestors to your children.
CGamesPlay 10 hours ago | parent [-]

> However, if the example had used UNION instead of UNION ALL, then SQLite would have had to keep around all previously generated content in order to check for duplicates. For this reason, programmers should strive to use UNION ALL instead of UNION when feasible.

cryptonector 9 hours ago | parent [-]

Of course, if you have loop detection in your business logic when writing to the database, then you can safely use UNION ALL always.

EDIT: Removed a bunch of stuff that was partly based on a faulty memory of how UNION works vs UNION ALL. Indeed, UNION ALL means the RDBMS does not need to keep around the whole result set.

CGamesPlay 9 hours ago | parent [-]

Just to be clear, the reason I mentioned this in response to your original post is that UNION ALL doesn't have anything to do with the query recursing infinitely: it just allows the database engine to not keep previous results around. If you have a query that recurses infinitely, with UNION it will definitely exhaust the memory of the database engine, and with UNION ALL it will never stop returning results, which may exhaust the memory of the database client (unless it discards the results).