Summary
#21912 introduced a SQL behavior change: an existing recursive CTE SLT had to be amended from 0 AS level to SUM(0) AS level to pass.
A follow-up fix is required because recursive CTE physical alignment must preserve the CTE's declared/static schema, including nullability, rather than widening the recursive CTE output schema and forcing SQL rewrites.
Background
Issue #21910 is about preventing recursive CTE batches from leaking recursive-term field names when the recursive term is planned independently from the anchor/static term.
#21912 adds a reusable execution-layer schema helper and wires recursive CTE execution to avoid ad hoc schema rebinding. During review/testing, another problem appeared:
- The anchor/static term can declare a non-null field, e.g.
0 AS level.
- The recursive term can compute the same column through an expression that is nullable, e.g.
MIN(rs.level) + 1.
- If
RecursiveQueryExec declares a widened schema using static_nullable || recursive_nullable, downstream planning sees a physical schema that no longer matches the logical/static CTE schema.
- Changing SQL from
0 AS level to SUM(0) AS level makes the logical anchor nullable too, but only hides the mismatch.
Evidence
The SQL behavior change is visible in the amended SLT test:
diff --git a/datafusion/sqllogictest/test_files/cte.slt b/datafusion/sqllogictest/test_files/cte.slt
index e9c1c0245..d13e0d4f0 100644
--- a/datafusion/sqllogictest/test_files/cte.slt
+++ b/datafusion/sqllogictest/test_files/cte.slt
@@ -699,7 +699,7 @@ WITH RECURSIVE region_sales AS (
SELECT
s.salesperson_id AS salesperson_id,
SUM(s.sale_amount) AS amount,
- 0 as level
+ SUM(0) as level
FROM
sales s
GROUP BY
Reverting this SLT query change from:
to the original:
fails with:
Physical input schema should be the same as the one converted from logical input schema.
Differences:
- field nullability at index 2 [level]: (physical) true vs (logical) false.
Summary
#21912 introduced a SQL behavior change: an existing recursive CTE SLT had to be amended from
0 AS leveltoSUM(0) AS levelto pass.A follow-up fix is required because recursive CTE physical alignment must preserve the CTE's declared/static schema, including nullability, rather than widening the recursive CTE output schema and forcing SQL rewrites.
Background
Issue #21910 is about preventing recursive CTE batches from leaking recursive-term field names when the recursive term is planned independently from the anchor/static term.
#21912 adds a reusable execution-layer schema helper and wires recursive CTE execution to avoid ad hoc schema rebinding. During review/testing, another problem appeared:
0 AS level.MIN(rs.level) + 1.RecursiveQueryExecdeclares a widened schema usingstatic_nullable || recursive_nullable, downstream planning sees a physical schema that no longer matches the logical/static CTE schema.0 AS leveltoSUM(0) AS levelmakes the logical anchor nullable too, but only hides the mismatch.Evidence
The SQL behavior change is visible in the amended SLT test:
Reverting this SLT query change from:
to the original:
fails with: