How does a Recursive CTE know to only process records added during the last iteration (and not looking at all of them)?

Consider this [v good] example from Using recursive CTE to generate a range of data

WITH RECURSIVE date_range AS (
  SELECT '2023-01-01'::timestamp  AS date
  UNION ALL
  SELECT date + interval '1 day'
  FROM date_range
  WHERE date < '2023-12-31'
)
SELECT *
FROM date_range;

I can understand how to create this query, but do not understand the underlying logic/rules executed by the database engine. Specifically: for the select date + 'interval 1 day' from date_range: how does it know to only “consider” the very last added record [instead of taking all the previously added records and adding 1 to each of them]?

To illustrate more clearly: consider when we are at 2023-01-03 in the recursion. In that case we already presumably have 2023-01-01, 2023-01-02, and 2023-01-03 in the table. Why does the second query shown below not “find” three records and add 1 to all of those 3 records?

  SELECT date + interval '1 day'
  FROM date_range
  WHERE date < '2023-12-31'

Instead what is happening is the engine only “finds” the very last record. 2023-01-03 and adds one to that to generate the single new record 2023-01-04? Is that just something I take on “faith” for recursive views?

consider when we are at 2023-01-03 in the recursion. In that case we already presumably have 2023-01-01, 2023-01-02, and 2023-01-03 in the table. Why does the second query shown below not “find” three records and add 1 to all of those 3 records

Because there are 3 different tables (tuplestores) involved. Results keep getting accumulated and that’s where the 2023-01-01, 2023-01-02, and 2023-01-03 live – the query doesn’t target that. Instead it targets work_table which gets wiped and replaced with contents of intermediate_table from previous iteration – at that point it only holds 2023-01-03:

iteration	work_table	intermediate_table	results
1	(‘2023-01-01’)_{non-recursive term}	(‘2023-01-01’+interval ‘1 day’)	(‘2023-01-01’), (‘2023-01-02’)
2	(‘2023-01-02’)_{previous values from intermediate}	(‘2023-01-02’+interval ‘1 day’)	(‘2023-01-01’), (‘2023-01-02’), (‘2023-01-03’)
3	(‘2023-01-03’)_{previous values from intermediate}	(‘2023-01-03’+interval ‘1 day’)	(‘2023-01-01’), (‘2023-01-02’), (‘2023-01-03’), (‘2023-01-04’)

Behaviour above illustrates the description from the doc:

Evaluate the non-recursive term. (…) Include all remaining rows in the result of the recursive query, and also place them in a temporary working table.

So long as the working table is not empty, repeat these steps:

a) Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference. (…) Include all remaining rows in the result of the recursive query, and also place them in a temporary intermediate table.

b) Replace the contents of the working table with the contents of the intermediate table, then empty the intermediate table

The same description in src/backend/executor/nodeRecursiveunion.c responsible for the feature, sounds a bit more cryptic:

 * 1. evaluate non recursive term and assign the result to RT    //ok
 *
 * 2. execute recursive terms                                    //ok
 *
 * 2.1 WT := RT                                                  //ok
 * 2.2 while WT is not empty repeat 2.3 to 2.6. if WT is empty returns RT
 * 2.3 replace the name of recursive term with WT
 * 2.4 evaluate the recursive term and store into WT
 * 2.5 append WT to RT
 * 2.6 go back to 2.2

But the code really does what it said on the tin, how it said it. Here’s the 2.2, 2.3, 2.6 part, even matching the names from the doc:

    /* 2. Execute recursive term */
    for (;;)
    {
        slot = ExecProcNode(innerPlan);
        if (TupIsNull(slot))
        {
            /* Done if there's nothing in the intermediate table */
            if (node->intermediate_empty)
                break;

            /* done with old working table ... */
            tuplestore_end(node->working_table);

            /* intermediate table becomes working table */
            node->working_table = node->intermediate_table;

            /* create new empty intermediate table */
            node->intermediate_table = tuplestore_begin_heap(false, false,
                                                             work_mem);
            node->intermediate_empty = true;

            /* reset the recursive term */
            innerPlan->chgParam = bms_add_member(innerPlan->chgParam,
                                                 plan->wtParam);

            /* and continue fetching from recursive term */
            continue;
        }

If you think about it, the “taking all the previously added records and adding 1 to each of them” thing is exactly what it does. All added previously, as in previous iteration, not all processed since the beginning of operation. In your case, it started with a single row, so in each iteration all previously added records amount to 1. If you started with 2, you’d get 2 at a time, for as long as both match the WHERE condition.

From the link noted by @nbk postgresql Recursive Queries: the key step is this:

Replace the contents of the working table with the contents of the intermediate table, then empty the intermediate table.

So for the query mentioned in the question the intermediate table would apparently be only the newly generated result [and not any others previously generated by the recursion].

The recursion means calling a peace of code from the same peace of code. The most important thing regarding recursions is to define how and when to escape from it. It’s done here with the WHERE clause. Make sure, if you use any kind of recursions, to define a safe exit first to avoid indefinite loops.

a recursive cte is one peace of code that generates a set of rows. It has two parts that are unioned to give a complete resultset. But it is still one code declaring a cte. The delaration starts with:

WITH RECURSIVE date_range AS (

first part (the fixed part) in your example doesn’t change the results through recursion steps – it gives you a starting point with just that 1 row

SELECT '2023-01-01'::timestamp  AS date
/*
date
-------------------
2023-01-01 00:00:00

the second part uses union to add another row and checks is it the right time for safe escape from the recursion

UNION ALL
SELECT date + interval '1 day'
  FROM date_range
  WHERE date < '2023-01-04'    -- safe escape when date is < '2023-01-04'
/* 1st pass result
date
-------------------
2023-01-01 00:00:00     -- row from the first part ( read no. 2. )
2023-01-02 00:00:00     -- this is the value of date column at the moment

If it is not time to exit the recursion the code as a whole calls itself to do the union part again and check if it is time to escape

/* 2nd pass
date
-------------------
2023-01-01 00:00:00     -- row from the first part ( read no. 2. )
2023-01-02 00:00:00     -- a row from the first pass union
2023-01-03 00:00:00     -- this is the value of date column at the moment

that repeats until the escape condition is true – false is – get out of this right now ….

Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa Dịch vụ tổ chức sự kiện 5 sao Thông tin về chúng tôi Dịch vụ sinh nhật bé trai Dịch vụ sinh nhật bé gái Sự kiện trọn gói Các tiết mục giải trí Dịch vụ bổ trợ Tiệc cưới sang trọng Dịch vụ khai trương Tư vấn tổ chức sự kiện Hình ảnh sự kiện Cập nhật tin tức Liên hệ ngay Thuê chú hề chuyên nghiệp Tiệc tất niên cho công ty Trang trí tiệc cuối năm Tiệc tất niên độc đáo Sinh nhật bé Hải Đăng Sinh nhật đáng yêu bé Khánh Vân Sinh nhật sang trọng Bích Ngân Tiệc sinh nhật bé Thanh Trang Dịch vụ ông già Noel Xiếc thú vui nhộn Biểu diễn xiếc quay đĩa Dịch vụ tổ chức tiệc uy tín Khám phá dịch vụ của chúng tôi Tiệc sinh nhật cho bé trai Trang trí tiệc cho bé gái Gói sự kiện chuyên nghiệp Chương trình giải trí hấp dẫn Dịch vụ hỗ trợ sự kiện Trang trí tiệc cưới đẹp Khởi đầu thành công với khai trương Chuyên gia tư vấn sự kiện Xem ảnh các sự kiện đẹp Tin mới về sự kiện Kết nối với đội ngũ chuyên gia Chú hề vui nhộn cho tiệc sinh nhật Ý tưởng tiệc cuối năm Tất niên độc đáo Trang trí tiệc hiện đại Tổ chức sinh nhật cho Hải Đăng Sinh nhật độc quyền Khánh Vân Phong cách tiệc Bích Ngân Trang trí tiệc bé Thanh Trang Thuê dịch vụ ông già Noel chuyên nghiệp Xem xiếc khỉ đặc sắc Xiếc quay đĩa thú vị

Filed under: Kiến thức lập trình - @ 23:18

Thẻ: sqlpostgresqlrecursion

Thiết kế website giá rẻ

Danh mục

How does a Recursive CTE know to only process records added during the last iteration (and not looking at all of them)?