
feat: add DuckDB support with macro dispatches and CI configuration #935

Open

devin-ai-integration[bot] wants to merge 14 commits into master from core-402-duckdb-support
Conversation


@devin-ai-integration devin-ai-integration bot commented Feb 25, 2026

feat: add DuckDB support to CI with macro dispatches

Summary

Adds DuckDB as a new CI warehouse target with 20+ duckdb__ macro dispatch overrides. DuckDB runs in :memory: mode. Key implementation decisions:

  • Temp tables enabled (has_temp_table_support returns true): The dbt-duckdb adapter reuses a single cached cursor instance (DuckDBConnectionWrapper.cursor() returns self._cursor), so temp tables are visible across all queries within a dbt invocation.
  • Explicit adapter.commit() for DuckDB after DML operations to survive dbt's post-on-run-end ROLLBACK. Applied in delete_and_insert, handle_tests_results (both insert_data_monitoring_metrics and insert_schema_columns_snapshot), create_or_replace, and replace_table_data. DuckDB can't use SQL-level BEGIN TRANSACTION; ... COMMIT; like Postgres, and doesn't auto-commit per statement like Redshift.
  • CREATE OR REPLACE TABLE used instead of drop-then-create for atomicity.
  • duckdb__full_name_split: DuckDB's split() returns 1-indexed arrays (unlike default which uses 0-indexed). Uses split_part() with correct 1-indexed access for database/schema/table name extraction.
  • duckdb__get_normalized_data_type: Maps DuckDB type synonyms (TEXT→VARCHAR, INT→INTEGER, DOUBLE PRECISION→DOUBLE, etc.) so schema change detection compares types correctly.
  • Date functions: dayname() instead of to_char() for day-of-week; EXTRACT(hour from ...) for hour extraction.
  • Transactionless cleanup: duckdb__get_clean_elementary_test_tables_queries uses individual DROP statements (no BEGIN TRANSACTION wrapping) since DuckDB doesn't support multi-statement transaction blocks in a single SQL string.
  • All tests pass on DuckDB, including both test_schema_changes and test_schema_changes_from_baseline. No DuckDB-specific skips remain.
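
The 1-indexing difference behind duckdb__full_name_split can be sketched in plain Python. split_part below is a hypothetical stand-in mimicking the SQL function's semantics (1-indexed, empty string for a missing part); the names are illustrative, not the macro's actual code:

```python
def split_part(text: str, delimiter: str, index: int) -> str:
    """Mimic SQL split_part(): 1-indexed, '' when the requested part is missing."""
    parts = text.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else ""

full_name = "analytics_db.elementary.data_monitoring_metrics"  # illustrative name

# 1-indexed access, as in duckdb__full_name_split:
database = split_part(full_name, ".", 1)
schema = split_part(full_name, ".", 2)
table = split_part(full_name, ".", 3)
assert (database, schema, table) == (
    "analytics_db", "elementary", "data_monitoring_metrics")

# With DuckDB's 1-indexed split(), a 0-indexed macro shifts everything by one:
# "index 1" yields the database where 0-indexed code expected it to yield the
# schema, which is how table_name ended up holding the schema name.
assert split_part("schema_only.table", ".", 3) == ""  # missing-part edge case
```

The last assertion also covers the checklist's edge case of a full_table_name with fewer than three parts.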

Macro overrides added

buckets_cte, target_database, timeadd, escape_special_chars, delete_and_insert, generate_elementary_profile_args, get_tables_from_information_schema, get_columns_from_information_schema, data_type_list, get_normalized_data_type, has_temp_table_support, create_table_as, create_or_replace, replace_table_data, day_of_week, hour_of_week, edr_drop_schema, full_name_split, get_clean_elementary_test_tables_queries

Updates since last revision

  • Merged from master: picked up new test infrastructure (profiles Jinja template, generate_profiles.py, adapter_query_runner.py, improved env.py error handling).
  • DuckDB profiles moved to profiles.yml.j2 template: removed the hardcoded "Write DuckDB profiles" workflow step and the if: inputs.warehouse-type != 'duckdb' guard. DuckDB now uses the same generate_profiles.py code path as all other adapters.
  • Renamed "Docker targets" → "Local targets" in test-all-warehouses.yml: the category now includes both Docker-based adapters (postgres, clickhouse, trino, dremio) and fully in-process adapters (duckdb). DuckDB runs in the test-local job (no secrets needed, no approval gate), while cloud adapters remain in test-cloud.
  • replace_table_data now uses TRUNCATE TABLE instead of delete from ... where 1=1 for the DuckDB dispatch, aligning with standard truncation semantics.
  • Removed elementary[postgres] extras mapping for DuckDB in the workflow. DuckDB now installs as elementary (no extra) since dbt-duckdb is already installed by the "Install dbt" step. An elementary[duckdb] extra will be added to the elementary package in a follow-up PR.
  • Enabled temp table support (has_temp_table_support now returns true): Investigation revealed that dbt-duckdb reuses the same cursor instance for all queries within a dbt invocation, so temp tables are visible across queries. Updated duckdb__edr_get_create_table_as_sql to respect the temporary flag. Removed now-unnecessary cleanup commit blocks from handle_tests_results.sql (temp tables are auto-dropped by dbt).
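
The cursor-reuse behavior this bullet relies on can be illustrated with a minimal Python mock; the class below is a simplified sketch of dbt-duckdb's wrapper pattern, not its real implementation:

```python
class FakeCursor:
    """Stand-in for a DuckDB cursor; temp tables live in cursor/session state."""
    def __init__(self):
        self.temp_tables = set()

class ConnectionWrapperSketch:
    """Sketch of the caching pattern: every cursor() call returns one instance,
    like DuckDBConnectionWrapper.cursor() returning self._cursor."""
    def __init__(self):
        self._cursor = FakeCursor()

    def cursor(self):
        # Cached: not a fresh cursor per call, so session state is shared.
        return self._cursor

conn = ConnectionWrapperSketch()
conn.cursor().temp_tables.add("tmp_metrics")

# A later query sees the temp table because it runs on the same cursor object.
# If a future dbt-duckdb version returned a new cursor here, this property
# (and has_temp_table_support = true) would silently break.
assert conn.cursor() is conn.cursor()
assert "tmp_metrics" in conn.cursor().temp_tables
```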

Review & Testing Checklist for Human

  • Verify temp table support assumption: The has_temp_table_support=true change relies on dbt-duckdb's DuckDBConnectionWrapper.cursor() returning a cached cursor instance. If this implementation detail changes in future dbt-duckdb versions, temp tables could become invisible across queries. Consider adding a comment in the code or documentation about this dependency.
  • Audit DuckDB commit locations: Verify that adapter.commit() appears after all DML operations that need to survive dbt's post-on-run-end ROLLBACK. Check delete_and_insert.sql (after DML + after cleanup drops), handle_tests_results.sql (after inserts only), create_or_replace.sql, and replace_table_data.sql.
  • Verify duckdb__full_name_split correctness: Confirm split_part(full_table_name, '.', 1/2/3) correctly extracts database/schema/table for DuckDB's 1-indexed arrays. Consider edge cases where full_table_name may not have all 3 parts.
  • Check profiles template rendering: Verify that profiles.yml.j2 renders correctly for DuckDB with :memory: path and that generate_profiles.py handles DuckDB without requiring secrets (should work on fork PRs). Test locally: python integration_tests/profiles/generate_profiles.py --template integration_tests/profiles/profiles.yml.j2 --output /tmp/test_profiles.yml --schema-name test_schema and inspect the output.
  • Review get_normalized_data_type completeness: Check the type synonym mapping against DuckDB's type docs. Missing synonyms could cause false-positive schema change alerts.
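
For reviewers auditing the synonym mapping, here is a hypothetical Python rendering of what duckdb__get_normalized_data_type does; only the synonyms named in this PR description are listed, and the real macro may cover more:

```python
# Assumed mapping, taken from the examples in this PR description.
DUCKDB_TYPE_SYNONYMS = {
    "TEXT": "VARCHAR",
    "INT": "INTEGER",
    "DOUBLE PRECISION": "DOUBLE",
}

def normalize_duckdb_type(data_type: str) -> str:
    """Collapse DuckDB type synonyms to one canonical spelling."""
    canonical = data_type.strip().upper()
    return DUCKDB_TYPE_SYNONYMS.get(canonical, canonical)

# Without this, a baseline column declared VARCHAR compared against a live
# column reported as TEXT would look like a schema change (false positive).
assert normalize_duckdb_type("text") == "VARCHAR"
assert normalize_duckdb_type("VARCHAR") == "VARCHAR"
assert normalize_duckdb_type("double precision") == "DOUBLE"
```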

Notes

  • CI results: 18/19 passed, DuckDB passed all tests. The only failure is Athena (flaky date-boundary tests, unrelated to DuckDB changes).
  • Root cause of schema_changes fix: DuckDB's split() function returns 1-indexed arrays (index 0 = None), but default__full_name_split used 0-indexed access. This caused table_name field to be populated with schema name instead of table name, breaking _read_test_results WHERE filter. Fixed by adding duckdb__full_name_split with split_part() (1-indexed like Postgres).
  • test_schema_changes_from_baseline now passes: The duckdb__get_normalized_data_type mapping normalizes TEXT→VARCHAR, so the baseline type comparison works correctly even though STRING_JINJA evaluates to TEXT for DuckDB.
  • :memory: mode: Using in-memory DuckDB for speed. Explicit adapter.commit() calls ensure data persists despite dbt's ROLLBACK behavior on test connections.
  • CREATE OR REPLACE TABLE: DuckDB supports this natively, so we use it instead of drop-then-create for cleaner atomicity.
  • Why DuckDB needs explicit commits: Postgres uses SQL-level BEGIN TRANSACTION; ... COMMIT; inside query strings. Redshift auto-commits each statement. DuckDB runs under an implicit transaction and doesn't support multi-statement BEGIN/COMMIT blocks, so we use Python-level adapter.commit() instead.
  • Temp table support investigation: Initially returned false due to assumed cursor visibility issues. Investigation revealed that dbt-duckdb's LocalEnvironment.handle() creates one cursor and wraps it in DuckDBConnectionWrapper, which returns the same cached cursor for all connection.handle.cursor() calls. This means all queries share one cursor, so temp tables work correctly. Verified locally with test_artifacts (14 tests) and test_schema_changes (2 tests).
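
The commit-vs-rollback behavior described in these notes can be reproduced with sqlite3 from the standard library, which (like dbt-duckdb here) does not auto-commit DML by default. This is an analogy for the DuckDB behavior, not DuckDB itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (id INTEGER)")
conn.commit()

# Uncommitted DML is lost to a later ROLLBACK (dbt's post-on-run-end behavior):
conn.execute("INSERT INTO metrics VALUES (1)")
conn.rollback()
assert conn.execute("SELECT count(*) FROM metrics").fetchone()[0] == 0

# An explicit commit right after the DML -- the adapter.commit() pattern used
# in this PR -- makes the rows survive the very same ROLLBACK:
conn.execute("INSERT INTO metrics VALUES (1)")
conn.commit()
conn.rollback()  # nothing left to roll back for the committed insert
assert conn.execute("SELECT count(*) FROM metrics").fetchone()[0] == 1
```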

Link to Devin run: https://app.devin.ai/sessions/7f53c258c1cd4ad39cb015a8b8a38452
Requested by: @haritamar

Summary by CodeRabbit

  • New Features

    • Add DuckDB as a supported warehouse with full runtime behavior and test/profile support.
    • DuckDB-specific query and metadata helpers added for compatibility.
  • Tests

    • Expanded test matrix and CI to include DuckDB.
    • Integration tests updated to skip DuckDB where appropriate to reflect supported scenarios.


- Add duckdb to CI warehouse matrix in test-all-warehouses.yml
- Add DuckDB profiles generation in test-warehouse.yml
- Add 15+ duckdb__ macro overrides for adapter compatibility:
  - buckets_cte, target_database, timeadd, insert_rows, delete_and_insert
  - generate_elementary_profile_args, get_tables/columns_from_information_schema
  - data_type_list, has_temp_table_support, create_table_as
  - create_or_replace (with should_commit for ROLLBACK resilience)
  - replace_table_data (with should_commit for in-memory persistence)
  - day_of_week/hour_of_week (using dayname() instead of to_char)
  - edr_drop_schema for test cleanup
- Skip 9 tests incompatible with DuckDB in-memory mode:
  - Artifact upload tests that rely on dynamically created test metadata
  - Schema changes tests that require column state persistence across invocations

Co-Authored-By: unknown <>
linear bot commented Feb 25, 2026



coderabbitai bot commented Feb 25, 2026

📝 Walkthrough

Adds DuckDB support across the codebase: new DuckDB-specific macros for metadata, time utilities, data types, and table operations; updates integration tests and CI workflows to include DuckDB and generate in-memory DuckDB dbt profiles for testing.

Changes

  • CI/CD workflows (.github/workflows/test-all-warehouses.yml, .github/workflows/test-warehouse.yml): Add duckdb to the warehouse matrix and workflow inputs; add a step to write in-memory DuckDB dbt profiles and update the dbt install extras mapping to include duckdb.
  • Integration tests (integration_tests/dbt_project/macros/clear_env.sql, integration_tests/tests/test_schema_changes.py): Add duckdb__edr_drop_schema; include duckdb in test skip lists.
  • Metadata collection (macros/edr/metadata_collection/get_columns_from_information_schema.sql, macros/edr/metadata_collection/get_tables_from_information_schema.sql): Add DuckDB implementations to query and normalize information_schema columns and tables.
  • System & datetime utilities (macros/edr/system/system_utils/buckets_cte.sql, macros/edr/system/system_utils/full_names.sql, macros/utils/cross_db_utils/day_of_week.sql, macros/utils/cross_db_utils/hour_of_week.sql): Add DuckDB macros: complete buckets CTE, full-name splitter, day-of-week and hour-of-week expressions.
  • Elementary/profile helpers (macros/utils/cross_db_utils/generate_elementary_profile_args.sql, macros/utils/cross_db_utils/target_database.sql): Add duckdb__generate_elementary_profile_args and duckdb__target_database to expose target metadata for DuckDB.
  • Time arithmetic (macros/utils/cross_db_utils/timeadd.sql): Add duckdb__edr_timeadd for interval-based timestamp arithmetic.
  • Data type utilities (macros/utils/data_types/data_type_list.sql, macros/utils/data_types/get_normalized_data_type.sql): Add DuckDB mappings: duckdb__data_type_list and duckdb__get_normalized_data_type to normalize and categorize DuckDB types.
  • Table operations & DML (macros/utils/table_operations/create_or_replace.sql, create_table_as.sql, delete_and_insert.sql, has_temp_table_support.sql, insert_rows.sql, replace_table_data.sql): Add DuckDB-specific implementations: create-or-replace behavior, CREATE OR REPLACE generation, delete-and-insert queries with explicit commits, temp-table support flag (false), escape helper, and replace-with-commit to ensure durability.
  • Test-run persistence (macros/edr/tests/on_run_end/handle_tests_results.sql, macros/edr/tests/test_utils/clean_elementary_test_tables.sql): Add conditional commits for DuckDB after inserts and transactionless clean-up query helpers for DuckDB.

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant WF as Workflow
  participant FS as "Write DuckDB profiles"
  participant DBT as dbt
  participant DU as DuckDB

  WF->>FS: matrix includes duckdb -> run profile writer
  FS->>DBT: write in-memory DuckDB profiles (main/test)
  WF->>DBT: install dbt with duckdb extras
  DBT->>DU: run tests/queries using generated profiles
  DU-->>DBT: return query results
  DBT-->>WF: finish job (artifacts/logs)
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 72.73%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check ✅: The title accurately and specifically summarizes the main change: adding DuckDB support with macro dispatches and CI configuration, which is the primary focus of the changeset across 25+ files.
  • Description check ✅: Check skipped because CodeRabbit's high-level summary is enabled.

The dbt debug step uses the root dbt_project.yml (profile: elementary)
while the tests use integration_tests/dbt_project (profile: elementary_tests).
Both profiles need to be present in profiles.yml for CI to work.

Co-Authored-By: unknown <>

- Add adapter.commit() to duckdb__edr_drop_schema for consistency
- Add table_catalog filter to get_tables/columns_from_information_schema
- Remove redundant drop_first=true from duckdb__create_or_replace
- Add explanatory comments to create_table_as and create_or_replace
- Use placeholder+sed approach for profiles.yml schema variable
- Add both elementary and elementary_tests profiles for DuckDB CI

Co-Authored-By: unknown <>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/test-warehouse.yml (1)

141-141: Fragile indentation-based profile stripping.

sed -i 's/^          //' is coupled to the exact 10-space indentation of the heredoc lines as they appear in this workflow file. Any future reformatting (e.g., an auto-formatter changing the step's indentation) would silently produce a malformed profiles.yml, with no CI failure until dbt debug runs.

A more robust alternative is to write the file with no leading indentation via a Python one-liner or a printf block, or to use a whitespace-tolerant pattern (e.g., sed -i 's/^[[:space:]]*//' for full stripping, or sed -i 's/^  //' for a more targeted 2-space strip applied to a matching pattern).
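
The "Python one-liner" alternative the reviewer mentions could look like this sketch; the profile content and output path are illustrative, not the repository's actual generator:

```python
import tempfile
import textwrap
from pathlib import Path

# A YAML block indented to match the surrounding script; textwrap.dedent
# strips the common leading whitespace, so no sed post-processing is needed
# and a reformatter changing the indentation cannot corrupt the output.
profiles_yaml = textwrap.dedent("""\
    elementary_tests:
      target: duckdb
      outputs:
        duckdb:
          type: duckdb
          path: ":memory:"
""")

out_path = Path(tempfile.gettempdir()) / "profiles.yml"  # illustrative path
out_path.write_text(profiles_yaml)

# Top-level keys start at column 0, which is what dbt expects.
assert profiles_yaml.startswith("elementary_tests:")
```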

Inline comment (.github/workflows/test-warehouse.yml, around lines 117-118): The guard uses a plain string comparison on inputs.dbt-version, which is incorrect for semantic versions. In the "Write DuckDB profiles" step, either drop the brittle comparison and rely on the pip install failing, or replace it with a proper semantic-version check (compare parsed major/minor/patch against 1.8.0), and/or add a comment documenting that dbt-duckdb requires dbt-core >= 1.8.0.


📥 Commits

Reviewing files that changed from the base of the PR and between 1f70101 and 7ae81b7.

📒 Files selected for processing (7)
  • .github/workflows/test-warehouse.yml
  • integration_tests/dbt_project/macros/clear_env.sql
  • integration_tests/tests/test_dbt_artifacts/test_groups.py
  • macros/edr/metadata_collection/get_columns_from_information_schema.sql
  • macros/edr/metadata_collection/get_tables_from_information_schema.sql
  • macros/utils/table_operations/create_or_replace.sql
  • macros/utils/table_operations/create_table_as.sql
🚧 Files skipped from review as they are similar to previous changes (2)
  • macros/utils/table_operations/create_or_replace.sql
  • macros/utils/table_operations/create_table_as.sql

@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 7 additional findings.


…fact uploads

The delete_and_insert macro (used in diff-mode artifact caching) was
missing the should_commit pattern that other DuckDB code paths use.
Without it, dbt's post-on-run-end ROLLBACK undoes the DELETE+INSERT
changes on DuckDB.

- Add should_commit parameter to delete_and_insert macro
- Call adapter.commit() after queries when should_commit=true
- Pass should_commit through from upload_artifacts_to_table
- Remove all skip_targets(["duckdb"]) markers from tests that now pass

Result: 184 passed, 9 skipped (dremio-only), 0 failures on DuckDB.
Co-Authored-By: unknown <>
…ters

The previous commit passed should_commit generically through
upload_artifacts_to_table, which caused Trino (and potentially other
adapters) to fail. Instead, check target.type == 'duckdb' directly
in the delete_and_insert macro to only commit for DuckDB.

Co-Authored-By: unknown <>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
macros/utils/table_operations/delete_and_insert.sql (1)

187-187: Update DuckDB comment to match current implementation.

Line 187 says commit is handled “via should_commit”, but commit is actually performed directly in delete_and_insert (Line 27).

Proposed fix
-{# DuckDB - separate queries without transaction wrapping (commit handled by caller via should_commit) #}
+{# DuckDB - separate queries without transaction wrapping (commit handled in delete_and_insert) #}
Inline comment (macros/utils/table_operations/delete_and_insert.sql, around lines 25-28): After the cleanup DROP statements run, they need a second explicit commit on DuckDB to avoid orphaned intermediate tables. Inside the {% if target.type == 'duckdb' %} block, add another {% do adapter.commit() %} immediately after the cleanup DROP section, so the code commits once after the DML and again after the cleanup drops, and the cleanup persists even if dbt performs a post-run-end ROLLBACK.


📥 Commits

Reviewing files that changed from the base of the PR and between 8b30117 and 2873a10.

📒 Files selected for processing (1)
  • macros/utils/table_operations/delete_and_insert.sql

Address CodeRabbit review:
- Add second adapter.commit() after cleanup DROP statements to prevent
  orphaned intermediate tables on DuckDB (since has_temp_table_support
  returns false, these are regular persistent tables)
- Fix outdated comment on duckdb__get_delete_and_insert_queries

Co-Authored-By: unknown <>
…pshots, transactionless cleanup

- Add duckdb__get_normalized_data_type macro to normalize DuckDB type synonyms
  (e.g., TEXT→VARCHAR, INT→INTEGER, DOUBLE PRECISION→DOUBLE)
- Add adapter.commit() in insert_data_monitoring_metrics and
  insert_schema_columns_snapshot so DuckDB inserts survive post-on-run-end ROLLBACK
- Add duckdb__get_clean_elementary_test_tables_queries to use transactionless
  DROP statements (avoids nested transaction errors on DuckDB)

Co-Authored-By: unknown <>
…ma_changes test

DuckDB's split() function returns 1-indexed arrays, but the default
full_name_split macro uses 0-indexed access. This caused the table_name
field in schema change test results to contain the schema name instead
of the actual table name, making _read_test_results unable to find the
correct results.

Added duckdb__full_name_split using split_part() with 1-indexed access
(same pattern as postgres/athena). Removed debug code from test files.

Co-Authored-By: unknown <>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
macros/utils/table_operations/delete_and_insert.sql (1)

192-214: duckdb__get_delete_and_insert_queries is identical to several other adapter variants.

The body is byte-for-byte the same as redshift__, athena__, dremio__, and trino__. A shared helper macro (e.g., transactionless__get_delete_and_insert_queries) could reduce duplication, but given the team's stated preference for explicit per-adapter variants this is purely optional.

Inline comments (macros/edr/tests/on_run_end/handle_tests_results.sql):

  • Around lines 133-142: The DROP of temp_relation can be rolled back on DuckDB because the code only commits once. When target.type == 'duckdb' and elementary.has_temp_table_support() is false, add a second explicit adapter.commit() after elementary.fully_drop_relation(temp_relation) so the DROP is persisted.
  • Around lines 187-196: In insert_schema_columns_snapshot the DROP of temp_relation runs after the initial commit but before dbt's potential ROLLBACK. Call adapter.commit() again immediately after {% do elementary.fully_drop_relation(temp_relation) %}, mirroring the existing commit after insert_query.


📥 Commits

Reviewing files that changed from the base of the PR and between 2873a10 and 1348718.

📒 Files selected for processing (6)
  • integration_tests/tests/test_schema_changes.py
  • macros/edr/system/system_utils/full_names.sql
  • macros/edr/tests/on_run_end/handle_tests_results.sql
  • macros/edr/tests/test_utils/clean_elementary_test_tables.sql
  • macros/utils/data_types/get_normalized_data_type.sql
  • macros/utils/table_operations/delete_and_insert.sql

…ults

Addresses CodeRabbit review: fully_drop_relation runs in a new implicit
transaction after the first adapter.commit(). Without a second commit,
dbt's post-on-run-end ROLLBACK undoes the DROP, leaving orphaned temp
tables. Added second adapter.commit() after cleanup in both
insert_data_monitoring_metrics and insert_schema_columns_snapshot.

Co-Authored-By: unknown <>
The duckdb__get_normalized_data_type mapping already normalizes TEXT to
VARCHAR, so the baseline type comparison works correctly. No additional
changes needed — just removing the skip.

Co-Authored-By: unknown <>
… profiles to template

Co-Authored-By: unknown <>

  - name: Install Elementary
-   run: pip install "./elementary[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || inputs.warehouse-type }}]"
+   run: pip install "./elementary[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || (inputs.warehouse-type == 'duckdb' && 'postgres') || inputs.warehouse-type }}]"
Collaborator


Why does duckdb imply postgres?

Contributor Author

The elementary Python package doesn't have a duckdb extra. Its available extras are: postgres, snowflake, bigquery, redshift, databricks, athena, spark, trino, clickhouse.

Each extra just pulls in the corresponding dbt adapter (e.g. elementary[postgres] → dbt-postgres). The dbt-duckdb adapter is already installed by the "Install dbt" step above (line 110-115), so the elementary install doesn't need to handle it.

However, elementary[postgres] pulls in dbt-postgres as an unnecessary extra dependency. We could simplify this to just elementary (no extra) since dbt-duckdb is already installed. Want me to change it to:

pip install "./elementary"

instead? Or if you'd prefer to add a duckdb extra to the elementary package, that would be the cleaner long-term fix.

{%- do elementary.edr_create_table_as(true, temp_relation, test_tables_union_query) %}
{% do elementary.run_query(insert_query) %}

{# DuckDB: commit so insert survives dbt's post-on-run-end ROLLBACK #}
Collaborator

Just checking- this is a duckdb-specific requirement?
For example, why doesn't redshift suffer from this rollback?
(You don't have to change if you think it is correct, but I want a clarification)

Contributor Author:

Yes, this is DuckDB-specific. Here's why Redshift (and Postgres) don't need it:

Redshift/Postgres path: default__get_delete_and_insert_queries (line 50-65) wraps the DELETE+INSERT in a single SQL string with explicit BEGIN TRANSACTION; ... COMMIT;. This COMMIT inside the SQL string persists the data before dbt's post-on-run-end ROLLBACK can touch it. The ROLLBACK becomes a no-op because there's nothing left to roll back.

DuckDB's problem: DuckDB doesn't support BEGIN TRANSACTION inside a multi-statement SQL string the same way — it errors or ignores it. So duckdb__get_delete_and_insert_queries returns separate query strings without transaction wrapping (same pattern as Redshift's redshift__get_delete_and_insert_queries which also returns separate strings — but Redshift's adapter auto-commits each statement). DuckDB's adapter runs under an implicit transaction, so dbt's post-hook ROLLBACK undoes everything.

The adapter.commit() call is the DuckDB equivalent of the COMMIT; inside the default SQL string — it's just done at the Python API level instead of inside SQL.

Note: Redshift also returns separate queries (line 123-144) without BEGIN/COMMIT wrapping, but Redshift's dbt adapter auto-commits each statement by default, so the ROLLBACK doesn't affect it. DuckDB's adapter doesn't auto-commit.
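The DuckDB-side pattern described above can be sketched as follows (simplified; the actual macro bodies in the diff differ):

```sql
{# Sketch: DML as separate statements, then a Python-level commit. DuckDB rejects
   a "BEGIN TRANSACTION; ...; COMMIT;" multi-statement SQL string, and its adapter
   does not auto-commit per statement the way Redshift's does. #}
{% do elementary.run_query(delete_query) %}
{% do elementary.run_query(insert_query) %}
{% do adapter.commit() %}  {# nothing left for dbt's post-on-run-end ROLLBACK to undo #}
```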

{% do adapter.drop_relation(insert_relation) %}
{% endif %}

{# DuckDB: commit cleanup drops so they survive dbt's post-on-run-end ROLLBACK too #}
Collaborator:

Again - checking as my other comment about why this is duckdb specific and not relevant to other DBs that commit like redshift

(And again, if you think this is correct please just explain)

Contributor Author:

Same mechanism as my answer on handle_tests_results.sql:

The default__ dispatch (used by Postgres) wraps everything in BEGIN TRANSACTION; ... COMMIT; inside the SQL string (line 50-65), so the data is committed before dbt's ROLLBACK fires.

Redshift has its own dispatch (redshift__get_delete_and_insert_queries, line 123-144) that also returns separate queries without BEGIN/COMMIT, but Redshift's dbt adapter auto-commits each individual statement execution. So dbt's ROLLBACK is effectively a no-op on Redshift.

DuckDB's adapter doesn't auto-commit — it runs under an implicit transaction. So when dbt issues ROLLBACK after on-run-end, it undoes the DELETE+INSERT. The adapter.commit() call explicitly persists the changes before that ROLLBACK can fire.

TL;DR: Postgres uses SQL-level COMMIT, Redshift auto-commits per statement, DuckDB needs Python-level adapter.commit().
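For contrast, the default__ (Postgres) path keeps the COMMIT inside a single SQL string. A paraphrased sketch, with the delete predicate reduced to a placeholder:

```sql
begin transaction;
delete from {{ relation }} where {{ delete_column_name }} = {{ delete_value }};
insert into {{ relation }} select * from {{ insert_relation }};
commit;  -- data is durable before dbt's post-on-run-end ROLLBACK fires
```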

{% endmacro %}

{% macro duckdb__has_temp_table_support() %}
{% do return(false) %}
Collaborator:

It really doesn't, right?

Contributor Author:

DuckDB does support CREATE TEMPORARY TABLE syntactically. However, in this codebase, has_temp_table_support controls whether edr_create_table_as creates temp tables or regular tables.

The issue is that DuckDB temp tables are scoped to a single connection/cursor. In the test harness, dbt run operations and subsequent dbt run-operation calls may use different cursor instances on the shared DuckDB connection. A temp table created in one cursor isn't always visible to another, causing "Table does not exist" errors.

Returning false means the code uses regular (non-temp) tables with explicit cleanup via fully_drop_relation, which is more reliable. This is the same approach used by spark, trino, athena, dremio, and clickhouse — all return false for similar visibility reasons.
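The scoping behavior can be seen in plain DuckDB SQL: temp tables land in the connection-local temp catalog (illustration only):

```sql
-- On connection/cursor A:
create temp table staging as select 1 as id;  -- stored in the "temp" catalog
select * from temp.main.staging;              -- visible on A
-- A separate connection cannot see temp.main.staging at all,
-- which is what triggers the "Table does not exist" errors described above.
```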

{# DuckDB - truncate and insert with commit to survive ROLLBACK on in-memory databases #}
{% macro duckdb__replace_table_data(relation, rows) %}
{% call statement('truncate_relation') -%}
delete from {{ relation }} where 1=1
Collaborator:

Just to verify - there's no truncate statement in duckdb?

Contributor Author:

DuckDB does support TRUNCATE TABLE. The delete from {{ relation }} where 1=1 pattern was copied from the spark__replace_table_data dispatch (line 20-21) for consistency, but I can change it to TRUNCATE TABLE {{ relation }} if you prefer — it's functionally equivalent. Want me to make that change?
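The two statements are interchangeable here; a sketch with a hypothetical table name:

```sql
-- Current form, copied from spark__replace_table_data for consistency:
delete from my_schema.my_table where 1=1;
-- Native DuckDB equivalent the reviewer is asking about:
truncate table my_schema.my_table;
```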

…p commits

DuckDB's dbt adapter reuses the same cursor (DuckDBConnectionWrapper.cursor()
returns a cached instance), so temp tables are visible across all queries within
a single dbt invocation. This means has_temp_table_support can return true.

Changes:
- duckdb__has_temp_table_support() now returns true
- duckdb__edr_get_create_table_as_sql() respects the temporary flag
- Removed redundant DuckDB commit-after-cleanup-drop blocks in handle_tests_results
  (temp tables are auto-dropped by dbt, no manual cleanup needed)

@devin-ai-integration (Contributor Author):

Update on has_temp_table_support (re: comment by @haritamar):

I investigated this further and my previous answer about cursor visibility was wrong. DuckDB temp tables actually do work within a single dbt invocation because the dbt-duckdb adapter reuses the same cursor instance for all queries:

  1. LocalEnvironment.handle() creates one cursor via self.conn.cursor()
  2. Wraps it in DuckDBConnectionWrapper
  3. DuckDBConnectionWrapper.cursor() returns self._cursor — the same cached cursor every time
  4. SQLConnectionManager.add_query() calls connection.handle.cursor() which returns that cached cursor

So all queries within a dbt invocation share one cursor, and temp tables are visible across all of them. I verified this locally with both test_artifacts (14 tests) and test_schema_changes (2 tests) — all pass.

Latest commit (2af51df) changes has_temp_table_support to return true, updates create_table_as to respect the temporary flag, and removes the now-unnecessary cleanup commit blocks from handle_tests_results.
