
feat: add Microsoft Fabric DWH integration (ELE-5282)#962

Merged
haritamar merged 39 commits into master from devin/ELE-5282-1772640713
Mar 8, 2026

Conversation


devin-ai-integration bot commented Mar 4, 2026

feat: add Microsoft Fabric DWH integration (ELE-5282)

Summary

Adds Microsoft Fabric Data Warehouse as a supported adapter in the elementary dbt package. Fabric uses T-SQL syntax via the dbt-fabric adapter maintained by Microsoft. Since dbt-sqlserver declares dependencies=["fabric"], all fabric__ macros are auto-dispatched for both adapters (resolution order: sqlserver__ → fabric__ → default__).

Macro overrides added (~25 fabric__ macros):

  • Date/time functions: datediff, date_trunc, time_trunc, to_char, timeadd, dateadd, current_timestamp
  • Data types: type_bool → bit, type_string → varchar(4096), type_timestamp → datetime2, data_type_list, get_normalized_data_type, try_cast_column_to_timestamp
  • Table operations: complete_buckets_cte (recursive CTE with option(maxrecursion 10000)), escape_special_chars, create_temp_table (view-based CTE workaround), edr_make_temp_relation (handles dict model nodes)
  • Cross-DB utils: target_database, incremental_strategy → merge, generate_elementary_profile_args, day_of_week, hour_of_day, hour_of_week
  • Nested CTE workarounds: query_test_result_rows, get_failed_row_count_calc_query (use temp tables since T-SQL doesn't support nested CTEs)
  • Schema changes: get_columns_snapshot_query, get_columns_changes_query (unnest CTEs, || → dbt.concat(), TEXT normalization)
  • Test materialization: Force Fabric/sqlserver targets to use temp table path to avoid CTE-in-subquery errors
  • INSERT VALUES boolean rendering: render_value now uses edr_boolean_literal() for all adapters; T-SQL renders cast(1/0 as bit) instead of bare True/False
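
The boolean-literal rendering above can be pictured with a minimal sketch. The macro names (edr_boolean_literal, fabric__edr_boolean_literal) appear in this PR, but the bodies here are assumed for illustration:

```sql
{# Dispatched entry point; resolves per adapter via dbt's dispatch mechanism. #}
{% macro edr_boolean_literal(value) %}
    {{ return(adapter.dispatch('edr_boolean_literal', 'elementary')(value)) }}
{% endmacro %}

{# T-SQL has no true/false literals; render a bit-typed expression instead. #}
{% macro fabric__edr_boolean_literal(value) %}
    {%- if value -%} cast(1 as bit) {%- else -%} cast(0 as bit) {%- endif -%}
{% endmacro %}

{% macro default__edr_boolean_literal(value) %}
    {%- if value -%} true {%- else -%} false {%- endif -%}
{% endmacro %}
```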

New cross-adapter helper macros:

  • edr_concat(fields) — wraps dbt.concat() with varchar cast on Fabric (replaces old 2-arg edr_concat and edr_dbt_concat)
  • edr_condition_as_boolean(condition, default_false=true) — renders case when <cond> then <TRUE> else <FALSE> end using edr_boolean_literal(). Eliminates repeated boolean CASE patterns across codebase.
  • edr_is_true(expr) / edr_is_false(expr) — dispatched boolean comparisons (= 1/= 0 for T-SQL bit type, = true/= false for standard SQL)
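
A hedged sketch of the helpers' shape, based only on the signatures listed above (the actual bodies in the PR may differ):

```sql
{# Renders a condition as a portable boolean expression. When default_false is
   true, non-matching rows get an explicit FALSE instead of NULL. #}
{% macro edr_condition_as_boolean(condition, default_false=true) %}
    case
        when {{ condition }} then {{ elementary.edr_boolean_literal(true) }}
        {%- if default_false %} else {{ elementary.edr_boolean_literal(false) }} {%- endif %}
    end
{% endmacro %}

{# Dispatched comparisons: T-SQL bit columns are compared to 1/0,
   standard SQL booleans to true/false. #}
{% macro fabric__edr_is_true(expr) %} {{ expr }} = 1 {% endmacro %}
{% macro default__edr_is_true(expr) %} {{ expr }} = true {% endmacro %}
```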

Only 2 sqlserver__ macros exist (everything else auto-dispatches from fabric__):

  • sqlserver__target_database — needed by the CLI
  • sqlserver__generate_elementary_profile_args — different connection params (SQL auth vs Azure AD)
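
The dispatch chain that makes this possible can be sketched as follows. Because dbt-sqlserver declares dependencies = ["fabric"], dbt tries sqlserver__, then fabric__, then default__ for each macro; edr_dateadd is used here only as an illustrative example:

```sql
{# Entry point: dbt resolves sqlserver__edr_dateadd first (if defined),
   then fabric__edr_dateadd, then default__edr_dateadd. #}
{% macro edr_dateadd(date_part, number, date_expression) %}
    {{ return(adapter.dispatch('edr_dateadd', 'elementary')(date_part, number, date_expression)) }}
{% endmacro %}

{# One fabric__ implementation serves both adapters; no sqlserver__ copy needed. #}
{% macro fabric__edr_dateadd(date_part, number, date_expression) %}
    dateadd({{ date_part }}, {{ number }}, {{ date_expression }})
{% endmacro %}
```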

Infrastructure:

  • Docker Compose for SQL Server 2022 (local testing, target named sqlserver)
  • CI workflows updated (test-warehouse.yml, test-all-warehouses.yml)
  • ODBC Driver 18 installation in CI
  • Password Elementary123! hardcoded directly in docker-compose-sqlserver.yml and profiles.yml.j2 (matching how postgres/clickhouse/dremio handle local credentials — no env var indirection)
  • timeout-minutes: 60 on the test job to prevent hung CI runs (e.g. dead dbt-fusion workers)
  • Integration test SQL templates: LIMIT 1 → TOP 1 for T-SQL compatibility via SelectLimit helper class
  • Boolean literals: true/false → edr_boolean_literal() macro across alert/anomaly models
  • Positional GROUP BY → explicit column lists (T-SQL doesn't support GROUP BY 1,2,3)
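
For context, the LIMIT/TOP rewrite handles a basic dialect difference (my_table is a placeholder name):

```sql
-- Standard dialects:
select * from my_table limit 1;

-- T-SQL (SQL Server / Fabric) has no LIMIT clause; the helper emits TOP instead:
select top 1 * from my_table;
```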

Changes addressing PR review:

  • Refactored T-SQL checks to dispatched macros for better maintainability:
    • get_columns_snapshot_query: Replaced inline target.type checks with edr_boolean_literal()
    • get_columns_changes_query: Extracted to fabric__/default__get_column_changes_from_baseline_cur dispatch
    • column_monitoring_query: Extracted GROUP BY to fabric__/default__column_monitoring_group_by dispatch
    • insert_as_select: Converted to fabric__/default__insert_as_select dispatch
    • get_anomaly_query and get_anomaly_query_for_dimension_anomalies: Both now dispatched
    • schema_changes_query_group_by: New dispatched macro for GROUP BY in get_columns_changes_query_generic; the fabric__ variant uses explicit 6-column GROUP BY (source columns from all_column_changes) instead of positional 9-column reference
    • String concatenation: Replaced raw concat() with {{ dbt.concat([...]) }} for cross-DB compatibility (Redshift's concat() only accepts 2 args; dbt.concat() dispatches per adapter)
  • Renamed CI target fabric → sqlserver: More accurate since we test against SQL Server Docker, not real Fabric. Kept all macros as fabric__ (shared T-SQL implementation).
  • Simplified DROP VIEW IF EXISTS in create_temp_table.sql (SQL Server 2016+ support).
  • Global GROUP BY refactoring: Replaced positional group by 1,2,3 with explicit column names in several models (metrics_anomaly_score.sql, monitors_runs.sql) and removed order by from dbt_artifacts_hashes.sql CTE for consistency.
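
The GROUP BY refactor follows this pattern (table and column names below are illustrative, not taken from the PR):

```sql
-- Before: positional grouping, which T-SQL rejects
select full_table_name, column_name, count(*) as cnt
from monitored_columns
group by 1, 2;

-- After: explicit column list, portable across dialects including T-SQL
select full_table_name, column_name, count(*) as cnt
from monitored_columns
group by full_table_name, column_name;
```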

Updates since last revision (latest commits)

Addressed PR review round 3 (comments 107-120):

  1. Created edr_condition_as_boolean macro — Consolidates the repeated case when <condition> then TRUE else FALSE end pattern into a single dispatched macro. Applied across 6 files: get_anomaly_scores_query.sql, get_columns_snapshot_query.sql, anomaly_threshold_sensitivity.sql, model_run_results.sql, metrics_anomaly_score.sql, get_anomaly_query.sql. The macro uses edr_boolean_literal() internally for correct T-SQL rendering.

  2. Created edr_is_true/edr_is_false macros — Dispatched boolean comparison helpers. On T-SQL they compare against 1/0 (bit type), on standard SQL they use true/false. Applied in column_boolean_monitors.sql for casting boolean columns.

  3. Renamed edr_dbt_concat → edr_concat — Generalized the existing unused 2-arg edr_concat to accept a list of fields (like edr_dbt_concat did). Updated all 3 call sites: list_concat_with_separator.sql, generate_surrogate_key.sql, get_columns_changes_query.sql.

  4. Consolidated full_names.sql — Now uses edr_concat for all adapters. Removed redundant fabric__database_full_name, fabric__schema_full_name, fabric__table_full_name variants (the default implementations work since edr_concat handles T-SQL casting).

  5. Made row_count_metric_query CTE-free as default — The Fabric-specific CTE-free implementation is now default__row_count_metric_query. Removed fabric__row_count_metric_query entirely since it's identical. This change affects all adapters but is safer since it avoids nested CTEs.

  6. Extracted helper macros in table_monitoring_query.sql to reduce duplication between default__freshness_metric_query and fabric__freshness_metric_query:

    • _freshness_lag_expr() — shared SQL for computing freshness as time since last update
    • _bucket_end_freshness_expr() — dispatched (default__ uses least(), fabric__ uses CASE WHEN since T-SQL doesn't support least())
    • _freshness_final_select(metric) — shared final SELECT columns
  7. Removed fabric__get_unified_metrics_query — Merged the UNION ALL logic into default__get_unified_metrics_query. The fabric variant was identical to what the default implementation should be (CTE-free, direct UNION ALL), so all adapters now use the CTE-free approach.

  8. Optimized fabric__query_test_result_rows — Changed from fetching all rows into Python memory (result[:sample_limit]) to using temp view + SELECT TOP in SQL. This avoids memory issues for large test result sets. The view is created with a timestamp + random suffix to avoid collisions, then dropped after sampling.
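
The least() workaround from item 6 amounts to the following rewrite (bucket_end and buckets are placeholder names; the actual expressions live inside the dispatched _bucket_end_freshness_expr macros):

```sql
-- Standard dialects can clamp a bucket end directly:
select least(bucket_end, current_timestamp) as effective_bucket_end
from buckets;

-- T-SQL (Fabric / older SQL Server) lacks least(); use CASE WHEN instead:
select case
         when bucket_end < sysutcdatetime() then bucket_end
         else sysutcdatetime()
       end as effective_bucket_end
from buckets;
```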

Addressed GitHub comment (haritamar):

  1. Fixed sqlserver CI failures (temp view schema issue) — The fabric__query_test_result_rows temp view was initially using model.schema (per-worker test audit schema), which doesn't exist in parallel xdist execution. Changed to target.schema as a safer fallback (commit acb01a0).

  2. Fixed fabric cloud CI failures (schema doesn't exist) — On fabric cloud with parallel pytest-xdist execution, even target.schema may not exist or be accessible in per-worker contexts. Final fix uses elementary.get_package_database_and_schema() to get the elementary package schema (guaranteed to exist since it contains the package models), falling back to target.schema if the package schema lookup fails (commit a2978b5).

CI Status: ✅ All 22 checks pass (0 failures)

Review & Testing Checklist for Human

Critical (MUST verify before merge):

  • Temp view schema using elementary.get_package_database_and_schema() — This is the core fix for fabric cloud CI failures. The approach relies on the elementary package schema always existing. If the package schema lookup fails or returns an inaccessible schema, the fallback to target.schema could fail silently on fabric cloud. Verify test execution works on real Fabric workspace with parallel pytest-xdist workers.

  • edr_condition_as_boolean in anomaly detection logic — This new macro is used in metrics_anomaly_score.sql and anomaly_threshold_sensitivity.sql (core anomaly detection). The implementation uses string concatenation for complex window function conditions in model_run_results.sql. Verify anomaly detection produces identical results before/after across all adapters (especially edge cases like stationary data, nulls).

  • row_count_metric_query made CTE-free for ALL adapters — This changes the default implementation across postgres, snowflake, bigquery, etc. (not just T-SQL). The CTE-free version uses direct UNION ALL of metric queries instead of wrapping them in a CTE. Verify row count metrics produce identical results on all adapters.

  • fabric__query_test_result_rows temp view cleanup — The optimized version creates real views in the database schema (edr_test_sample_<timestamp>_<random>). If cleanup fails (e.g., query timeout, process killed), orphan views accumulate. Monitor for edr_test_sample_* views in fabric/sqlserver schema after test runs.

  • Removed fabric__get_unified_metrics_query — Per haritamar's comment, this macro was supposed to be removed but was still present. Now removed entirely, with the UNION ALL logic merged into default__get_unified_metrics_query. Verify metric collection works correctly on fabric/sqlserver (especially with multiple metric types).
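
A hypothetical spot check for the orphan-view concern above; the edr_test_sample_ prefix comes from this PR's naming scheme, and the catalog views are standard SQL Server/Fabric system views:

```sql
-- List leftover sample views after a test run
select s.name as schema_name, v.name as view_name, v.create_date
from sys.views v
join sys.schemas s on v.schema_id = s.schema_id
where v.name like 'edr_test_sample%';
```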

Recommended:

  • Test against real Fabric workspace — All local testing is against SQL Server 2022 Docker. Fabric CI runs pass, but there may be subtle T-SQL dialect differences in real Fabric workspaces (Azure Synapse Analytics features/limitations).

  • Verify anomaly_threshold_sensitivity.sql rewrite — Git shows 78% rewrite. The file was heavily refactored using edr_condition_as_boolean for all 7 threshold checks (1.5, 2, 2.5, 3, 3.5, 4, 4.5). Spot-check anomaly threshold logic produces correct results.

  • Verify metrics_anomaly_score.sql CTE chain — Added intermediate time_window_scored CTE to fix "column reference in same SELECT" error. Verify anomaly score calculation is correct and the CTE refactoring doesn't change semantics.

Test Plan

  1. CI validation: ✅ Complete - All 22 checks pass

    • All core targets pass: postgres, snowflake, bigquery, redshift, athena, databricks, duckdb, clickhouse, trino, spark, dremio, sqlserver, fabric
  2. Local Docker test (optional):

    cd integration_tests
    docker compose -f docker-compose-sqlserver.yml up -d
    pytest tests -vvv --target sqlserver -n4
  3. Fabric cloud verification (recommended):

    • Run full test suite on real Fabric workspace to verify temp view schema fix works in production
    • Monitor for orphan edr_test_sample_* views after test runs

Notes

  • Linear ticket: https://linear.app/elementary/issue/ELE-5282/add-microsoft-fabric-dwh-integration-to-dbt-data-reliability
  • Session: https://app.devin.ai/sessions/a031fe8e0a0440f88e406da576a7bccb
  • Requested by: @haritamar
  • The password Elementary123! is hardcoded in docker-compose-sqlserver.yml and profiles.yml.j2 for local dev/CI only (no secrets needed)
  • Target renamed fabric → sqlserver: Reflects that CI tests against SQL Server Docker, not Fabric. Macros stay as fabric__ (shared T-SQL).
  • Only 2 sqlserver__ macros exist: target_database (CLI), generate_elementary_profile_args (connection params). create_temp_table was removed since dispatch chain handles it automatically.
  • Temp view schema evolution: Initially used model.schema → failed on per-worker test audit schemas → switched to target.schema → failed on fabric cloud where target.schema doesn't exist → final fix uses elementary.get_package_database_and_schema() (guaranteed to exist).

Summary by CodeRabbit

  • New Features

    • Microsoft Fabric/SQL Server added as a supported warehouse option (testing via sqlserver target).
    • New cross-adapter helper macros: edr_concat, edr_condition_as_boolean, edr_is_true, edr_is_false for cleaner boolean/string handling.
  • Compatibility

    • Expanded Fabric/SQL Server support for data types, date/time functions, string/char formatting, time math, and incremental behavior.
    • Improved test/result query paths and boolean rendering for cross-dialect correctness.
    • CTE-in-subquery workaround for T-SQL via view-based temp table creation.
    • Dispatched macros for cleaner T-SQL handling (get_column_changes_from_baseline_cur, column_monitoring_group_by, insert_as_select, get_anomaly_query, schema_changes_query_group_by).
    • Dispatched render_value macro with unified boolean literal handling via edr_boolean_literal().
    • Global refactoring: positional GROUP BY replaced with explicit column names in anomaly/run models.
    • Cross-DB string concatenation via dbt.concat() instead of raw concat() (fixes Redshift compatibility).
    • Renamed edr_dbt_concat → edr_concat (generalized to accept list of fields).
    • Cross-adapter LIMIT/TOP helper (SelectLimit class) for test infrastructure.
    • CI timeout protection: 60-minute job timeout to catch hung runs early.
    • Hardcoded local credentials directly (no env var indirection), matching postgres/clickhouse/dremio pattern.
  • Bug Fixes

    • Fixed metrics_anomaly_score column reference error by introducing intermediate time_window_scored CTE.
    • Removed undefined _tsql_create_temp_table helper by inlining logic into fabric__create_temp_table.
    • Fixed fabric cloud temp view schema issue by using elementary.get_package_database_and_schema() (guaranteed to exist in all contexts).
    • Removed fabric__get_unified_metrics_query per reviewer feedback - merged UNION ALL logic into default implementation.
  • Refactoring

    • Consolidated boolean CASE patterns using edr_condition_as_boolean across 6+ files.
    • Reduced duplication in table_monitoring_query.sql freshness queries via helper macros (_freshness_lag_expr, _bucket_end_freshness_expr, _freshness_final_select).
    • Made row_count_metric_query CTE-free as default (affects all adapters).
    • Removed redundant fabric__ variants in full_names.sql and get_unified_metrics_query.
    • Optimized fabric__query_test_result_rows to use temp view + TOP instead of fetching all rows.
  • Tests

    • CI and integration tests updated to exercise SQL Server scenarios and local service startup.
    • 187 integration tests passing in CI against SQL Server 2022.
    • All 22 CI checks pass on fabric cloud.
  • Chores

    • Target renamed from fabric to sqlserver for accuracy.
    • Removed redundant sqlserver__create_temp_table (auto-dispatched via dependencies=["fabric"]).

Add support for Microsoft Fabric Data Warehouse as a new adapter in the
elementary dbt package. Fabric uses T-SQL syntax and is supported via the
dbt-fabric adapter maintained by Microsoft.

Changes:
- Add fabric__ macro overrides for date/time functions (datediff, date_trunc,
  time_trunc, to_char, timeadd, dateadd, current_timestamp)
- Add fabric__ data type overrides (data_type_list, type_timestamp, type_string,
  type_bool, get_normalized_data_type, try_cast_column_to_timestamp)
- Add fabric__ table operations (complete_buckets_cte with recursive CTE,
  escape_special_chars)
- Add fabric__ cross-DB utilities (target_database, incremental_strategy,
  generate_elementary_profile_args, day_of_week, hour_of_day, hour_of_week)
- Add fabric__ nested CTE workarounds (query_test_result_rows,
  get_failed_row_count_calc_query) using temp tables to avoid nested CTEs
  and TOP instead of LIMIT
- Add docker-compose-sqlserver.yml for local testing with env var credentials
- Add fabric target to profiles.yml.j2
- Update CI workflows (test-warehouse.yml, test-all-warehouses.yml) with
  SQL Server Docker setup, ODBC driver installation, and fabric target

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@linear

linear bot commented Mar 4, 2026

@github-actions

github-actions bot commented Mar 4, 2026

👋 @devin-ai-integration[bot]
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@coderabbitai

coderabbitai bot commented Mar 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds Fabric (SQL Server) support: CI/workflow inputs and secrets, Dockerized SQL Server test service and healthchecks, profile/template env helper, and extensive Fabric-specific macros and T-SQL-compatible materialization paths across EDR, utils, and data-type macros.

Changes

Cohort / File(s) Summary
CI / Workflows
.github/workflows/test-all-warehouses.yml, .github/workflows/test-warehouse.yml
Adds fabric option to matrices/inputs and workflow_dispatch, introduces MSSQL_SA_PASSWORD env, starts SQL Server in fabric runs, installs ODBC driver, and maps dbt/Elementary install extras to include fabric (sqlserver).
Test Infra
integration_tests/docker-compose-sqlserver.yml
New Docker Compose service for Microsoft SQL Server 2022 with port binding, env vars, and a sqlcmd-based healthcheck.
Profiles & Rendering
integration_tests/profiles/profiles.yml.j2, integration_tests/profiles/generate_profiles.py
Adds a fabric local target in the profiles template and a Jinja2 global env_var(key, default) for injecting env vars (e.g., MSSQL_SA_PASSWORD) when generating profiles.
Materializations / Test helpers
macros/edr/materializations/test/failed_row_count.sql, macros/edr/materializations/test/test.sql
Delegates failed-row-count and test-row queries via adapter.dispatch and adds fabric__ implementations that use temp relations/tables to work around Fabric/T-SQL limitations.
EDR System Utilities
macros/edr/system/system_utils/...
buckets_cte.sql, empty_table.sql, get_config_var.sql
Adds fabric__complete_buckets_cte, Fabric-specific dummy_values, boolean-rendering helpers (render_bool_config_var, edr_boolean_literal, fabric__edr_boolean_literal).
Cross-DB date/time & formatting
macros/utils/cross_db_utils/...
current_timestamp.sql, date_trunc.sql, dateadd.sql, datediff.sql, day_of_week.sql, hour_of_day.sql, hour_of_week.sql, time_trunc.sql, timeadd.sql, to_char.sql
Introduces many fabric__ variants implementing T‑SQL/Fabric expressions (getdate/sysutcdatetime, datetrunc/dateadd/datediff, datename/datepart, time trunc/add, to_char formatting, hour/day/week helpers).
Profile & strategy helpers
macros/utils/cross_db_utils/generate_elementary_profile_args.sql, macros/utils/cross_db_utils/incremental_strategy.sql, macros/utils/cross_db_utils/target_database.sql
Adds fabric__generate_elementary_profile_args (supports cli/service-principal/user flows), fabric__get_default_incremental_strategy (returns "merge"), and fabric__target_database.
Data types & normalization
macros/utils/data_types/...
data_type.sql, data_type_list.sql, get_normalized_data_type.sql, try_cast_column_to_timestamp.sql
Adds fabric__ type mappings (bit, varchar(4096), datetime2), data type lists and normalization mappings, and a Fabric try-cast-to-timestamp helper. Note: duplicate macro insertions observed in some files.
Table operations
macros/utils/table_operations/insert_rows.sql
Adds fabric__escape_special_chars (doubles single quotes). Duplicate insertion locations observed.
Data monitoring & numeric ops
macros/edr/data_monitoring/monitors/column_numeric_monitors.sql
Adds fabric__standard_deviation and fabric__variance using T‑SQL STDEV/VAR patterns.
Model / SQL template changes
models/edr/...
alerts/*, anomaly_detection/*, run_results/*, dbt_artifacts_hashes.sql, system/monitors_runs.sql
Replaces raw boolean literals with edr_boolean_literal/render_bool_config_var in many CASE/WHERE expressions; converts some GROUP BY/ORDER BY usages (explicit grouping, removed ordering clause).
Potential Attention Areas
macros/..., integration_tests/..., .github/...
Duplicate macro insertions (e.g., datediff, data_type_list, generate_elementary_profile_args, escape_special_chars) and Fabric/T‑SQL patterns (temp tables, maxrecursion) need SQL validation; CI secret/env interplay (MSSQL_SA_PASSWORD) and healthcheck timing should be reviewed.

Sequence Diagram(s)

sequenceDiagram
  participant User as User
  participant CI as CI Workflow
  participant Runner as Actions Runner
  participant Docker as Docker (sqlserver)
  participant Profiles as Profiles Renderer
  participant DBT as dbt/Elementary

  User->>CI: dispatch workflow (warehouse-type=fabric)
  CI->>Runner: start job
  Runner->>Docker: docker-compose up (sqlserver)
  Docker-->>Runner: healthcheck OK
  Runner->>Runner: install msodbcsql18, dbt + extras (fabric/sqlserver)
  Runner->>Profiles: render profiles (env_var, pass MSSQL_SA_PASSWORD)
  Profiles-->>Runner: profiles.yml
  Runner->>DBT: run dbt with fabric profile
  DBT->>Docker: connect via ODBC/T-SQL, execute `fabric__*` macros & temp-table flows
  DBT-->>Runner: tests/results
  Runner-->>CI: report status

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped in, nibbling schema seeds,

Temp tables sprouted to handle the deeds,
getdate and sysutc sang the hour,
Buckets and casts bent to Fabric's power,
Now tests can hop through SQL Server meads.

🚥 Pre-merge checks: ✅ 3 passed
  • Title check: Passed. The title accurately summarizes the main change: adding Microsoft Fabric DWH integration to the elementary dbt package.
  • Docstring Coverage: Passed. Docstring coverage is 100.00%, above the required threshold of 80.00%.
  • Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.


devin-ai-integration bot and others added 2 commits March 4, 2026 16:23
The generate_profiles.py uses Python Jinja2 (not dbt Jinja2), so the
env_var() function used in profiles.yml.j2 for the fabric password was
undefined. Add env_var as a Jinja2 global that reads from os.environ.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…os to sqlserver__

- Changed profile type from 'fabric' to 'sqlserver' for SQL auth support
- Renamed all fabric__ macro overrides to sqlserver__ (dbt-fabric dispatches
  to sqlserver as fallback, so these work with both adapters)
- Updated CI to install dbt-sqlserver instead of dbt-fabric
- Added encrypt: false to profile for local SQL Server connection

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (2)
macros/edr/materializations/test/test.sql (1)

205-212: Style consistency: Consider using elementary.run_query throughout.

Lines 206 and 212 use run_query, while line 211 uses elementary.run_query. Since the codebase consistently uses elementary.run_query in other table operations, standardize to that form for maintainability. Note: Both ultimately call the same dbt.run_query and maintain the same session context, so temp table visibility is not a concern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/materializations/test/test.sql` around lines 205 - 212, Replace
the two raw run_query calls with the consistent wrapper elementary.run_query:
change the calls that reference creating and dropping the temp table (the
run_query invocations surrounding tmp_relation) to use elementary.run_query
instead, keeping the same SQL strings; ensure the rest of the block that builds
query (the {% set query %} ... select ... from {{ tmp_relation }} and the usage
of elementary.agate_to_dicts(elementary.run_query(query))) remains unchanged so
behavior and session context are preserved.
macros/utils/cross_db_utils/generate_elementary_profile_args.sql (1)

270-270: Consider multi-line formatting for consistency.

The cli method handling is condensed to a single line while other branches use multi-line formatting. This is a minor readability nit.

♻️ Optional: Multi-line formatting
-    {% if method == "cli" %} {% do parameters.append(_parameter("method", "cli")) %}
+    {% if method == "cli" %}
+        {% do parameters.append(_parameter("method", "cli")) %}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/cross_db_utils/generate_elementary_profile_args.sql` at line
270, The single-line Jinja branch for method == "cli" should be rewritten to
match the multi-line style used elsewhere: expand the conditional into separate
lines so the {% if method == "cli" %} is on its own line, the {% do
parameters.append(_parameter("method", "cli")) %} is on the next line, and close
with {% endif %}; this keeps the handling of method, parameters.append and
_parameter("cli") consistent with other branches for readability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration_tests/profiles/profiles.yml.j2`:
- Line 73: The Jinja expression requiring MSSQL_SA_PASSWORD causes rendering to
fail when the env var is missing; update the password template (the line
containing password: "{{ env_var('MSSQL_SA_PASSWORD') }}" ) to provide a safe
fallback by using env_var with a default value or Jinja's default filter (e.g.,
env_var('MSSQL_SA_PASSWORD', 'fallback') or env_var('MSSQL_SA_PASSWORD') |
default('fallback')) so local/profile rendering succeeds when the environment
variable is not set.

In `@macros/edr/materializations/test/failed_row_count.sql`:
- Around line 62-68: The macro fabric__get_failed_row_count_calc_query currently
calls a non-existent helper elementary.make_temp_relation(model) — replace that
call with the correct temp-relation helper (choose one of
edr_make_temp_relation, make_temp_table_relation, or make_temp_view_relation
depending on whether you need a temp table vs view) and update the tmp_relation
assignment accordingly; also ensure the temp table is cleaned up by adding `{%
do run_query("drop table if exists " ~ tmp_relation) %}` after the final SELECT
to prevent orphaned tables (refer to the tmp_relation variable and the existing
run_query("select * into " ~ tmp_relation ~ ...) usage to place the cleanup).

In `@macros/edr/system/system_utils/buckets_cte.sql`:
- Around line 241-247: The recursive CTE "timestamps" (using
min_bucket_start_expr, bucket_end_expr and max_bucket_end_expr) and the
SELECT-list alias next_bucket in the WHERE clause are incompatible with
Microsoft Fabric Warehouse; replace the recursive generation with a
Fabric-compatible approach (e.g., build bucket rows by CROSS JOINing against a
numbers/sequence source or a VALUES list and compute edr_bucket_start from
min_bucket_start_expr + n * bucket_interval) and remove use of the SELECT alias
in predicates by repeating the expression or computing it in a derived table, or
alternatively mark this macro as SQL Server–only and remove the fabric__ prefix
so it is not run on Fabric. Ensure the code references the same symbols
(min_bucket_start_expr, bucket_end_expr, max_bucket_end_expr,
edr_bucket_start/next_bucket) so callers still work.
- Line 253: The recursive CTE currently ends with "option (maxrecursion 10000)"
which can overflow for long ranges; update the recursion option to "MAXRECURSION
0" to allow unlimited recursion and rely on the existing termination condition
"where next_bucket < {{ max_bucket_end_expr }}" to stop recursion; edit the
buckets CTE in buckets_cte.sql (the clause containing option (maxrecursion
10000) and the recursive column next_bucket) to replace 10000 with 0 so large
1-minute bucket ranges do not error.

In `@macros/utils/cross_db_utils/incremental_strategy.sql`:
- Around line 23-25: The macro fabric__get_default_incremental_strategy
currently returns the hardcoded string "merge"; change it to follow the shared
strategy contract by returning "none" for non-Athena adapters and delegating to
the Athena-specific helper for Athena: update
fabric__get_default_incremental_strategy to detect the adapter type (or call the
existing get_append_only_incremental_strategy() for Athena) and return "none"
otherwise so adapter defaults are preserved.

In `@macros/utils/cross_db_utils/to_char.sql`:
- Around line 27-32: The fabric__edr_to_char macro currently injects the format
unquoted into CONVERT, which breaks when callers pass pattern-style strings;
update the macro (fabric__edr_to_char) to emit the format argument quoted (e.g.,
wrap {{ format }} in single quotes) when present while keeping the existing
numeric fallback 120 unchanged, so CONVERT(varchar, {{ column }}, '{{ format
}}') is produced for custom formats and CONVERT(varchar, {{ column }}, 120)
remains the fallback.

---

Nitpick comments:
In `@macros/edr/materializations/test/test.sql`:
- Around line 205-212: Replace the two raw run_query calls with the consistent
wrapper elementary.run_query: change the calls that reference creating and
dropping the temp table (the run_query invocations surrounding tmp_relation) to
use elementary.run_query instead, keeping the same SQL strings; ensure the rest
of the block that builds query (the {% set query %} ... select ... from {{
tmp_relation }} and the usage of
elementary.agate_to_dicts(elementary.run_query(query))) remains unchanged so
behavior and session context are preserved.

In `@macros/utils/cross_db_utils/generate_elementary_profile_args.sql`:
- Line 270: The single-line Jinja branch for method == "cli" should be rewritten
to match the multi-line style used elsewhere: expand the conditional into
separate lines so the {% if method == "cli" %} is on its own line, the {% do
parameters.append(_parameter("method", "cli")) %} is on the next line, and close
with {% endif %}; this keeps the handling of method, parameters.append and
_parameter("cli") consistent with other branches for readability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 98b8c87c-9b9b-45a1-8964-0c755b95550d

📥 Commits

Reviewing files that changed from the base of the PR and between d88dc70 and 2c08c87.

📒 Files selected for processing (25)
  • .github/workflows/test-all-warehouses.yml
  • .github/workflows/test-warehouse.yml
  • integration_tests/docker-compose-sqlserver.yml
  • integration_tests/profiles/profiles.yml.j2
  • macros/edr/materializations/test/failed_row_count.sql
  • macros/edr/materializations/test/test.sql
  • macros/edr/system/system_utils/buckets_cte.sql
  • macros/utils/cross_db_utils/current_timestamp.sql
  • macros/utils/cross_db_utils/date_trunc.sql
  • macros/utils/cross_db_utils/dateadd.sql
  • macros/utils/cross_db_utils/datediff.sql
  • macros/utils/cross_db_utils/day_of_week.sql
  • macros/utils/cross_db_utils/generate_elementary_profile_args.sql
  • macros/utils/cross_db_utils/hour_of_day.sql
  • macros/utils/cross_db_utils/hour_of_week.sql
  • macros/utils/cross_db_utils/incremental_strategy.sql
  • macros/utils/cross_db_utils/target_database.sql
  • macros/utils/cross_db_utils/time_trunc.sql
  • macros/utils/cross_db_utils/timeadd.sql
  • macros/utils/cross_db_utils/to_char.sql
  • macros/utils/data_types/data_type.sql
  • macros/utils/data_types/data_type_list.sql
  • macros/utils/data_types/get_normalized_data_type.sql
  • macros/utils/data_types/try_cast_column_to_timestamp.sql
  • macros/utils/table_operations/insert_rows.sql

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

♻️ Duplicate comments (3)
macros/utils/cross_db_utils/incremental_strategy.sql (1)

23-25: ⚠️ Potential issue | 🟠 Major

Avoid hardcoding merge for SQL Server default strategy.

At Line 24, returning "merge" reintroduces the same behavior drift previously flagged; non-Athena adapters should return none so adapter defaults apply.

Suggested fix
 {%- macro sqlserver__get_default_incremental_strategy() %}
-    {% do return("merge") %}
+    {% do return(none) %}
 {% endmacro %}

Based on learnings: “incremental_strategy should return none for all adapters except Athena, where get_append_only_incremental_strategy() is used.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/cross_db_utils/incremental_strategy.sql` around lines 23 - 25,
The macro sqlserver__get_default_incremental_strategy currently returns the
hardcoded string "merge"; change it to return "none" so SQL Server defers to the
adapter's default incremental strategy (consistent with the rule that only
Athena uses get_append_only_incremental_strategy); update the return in
sqlserver__get_default_incremental_strategy to "none" and ensure no other
hardcoded "merge" defaults are introduced for non-Athena adapters.
macros/edr/system/system_utils/buckets_cte.sql (1)

244-246: ⚠️ Potential issue | 🔴 Critical

Fix invalid alias usage in the recursive predicate.

Line 246 references next_bucket in WHERE, but in T-SQL a SELECT-list alias is not visible in the same query block’s WHERE. This can fail at runtime/compile time.

Suggested fix
-          select {{ bucket_end_expr }} as next_bucket
+          select {{ bucket_end_expr }} as edr_bucket_start
           from timestamps
-          where next_bucket < {{ max_bucket_end_expr }}
+          where {{ bucket_end_expr }} < {{ max_bucket_end_expr }}
In SQL Server T-SQL, can a SELECT-list alias be referenced in the WHERE clause of the same SELECT query block?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/system/system_utils/buckets_cte.sql` around lines 244 - 246, The
WHERE is incorrectly referencing the SELECT-list alias next_bucket (produced by
{{ bucket_end_expr }}) which is not visible in the same query block; update the
predicate to use the actual expression instead of the alias (i.e., replace
next_bucket in the WHERE with the same {{ bucket_end_expr }} expression or wrap
the SELECT as a subquery/CTE so the alias becomes available), targeting the
query that selects from timestamps and uses {{ max_bucket_end_expr }} to enforce
the upper bound.
macros/edr/materializations/test/failed_row_count.sql (1)

62-73: ⚠️ Potential issue | 🔴 Critical

Missing temp table cleanup will leak orphaned tables.

Unlike test.sql (line 214), this macro creates a temp table but never drops it. Add cleanup after the SELECT statement.

Additionally, the elementary.make_temp_relation(model) call was flagged in a previous review as potentially undefined.

🐛 Proposed fix to add cleanup
 {% macro sqlserver__get_failed_row_count_calc_query(failed_row_count_calc) %}
     {# Fabric / T-SQL does not support nested CTEs.
        We create a temp table from the test SQL, then select from it. #}
     {% set tmp_relation = elementary.make_temp_relation(model) %}
     {% do run_query(
         "select * into " ~ tmp_relation ~ " from (" ~ sql ~ ") as __edr_inner"
     ) %}
+    {% set query %}
     select
         {{ failed_row_count_calc }}
         as {{ elementary.escape_reserved_keywords("count") }}
     from {{ tmp_relation }}
+    {% endset %}
+    {% set result = elementary.run_query(query) %}
+    {% do run_query("drop table if exists " ~ tmp_relation) %}
+    {{ query }}
 {% endmacro %}

Note: The fix above is illustrative. The macro returns raw SQL (not a result set), so the actual fix depends on how the caller executes it. If the SQL is executed elsewhere, consider documenting that cleanup is the caller's responsibility, or restructure to handle cleanup internally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/materializations/test/failed_row_count.sql` around lines 62 - 73,
The macro sqlserver__get_failed_row_count_calc_query creates a temp table
(tmp_relation via elementary.make_temp_relation(model)) but never drops it,
which will leak tables; after the SELECT from {{ tmp_relation }} add a DROP
TABLE (or DROP TABLE IF EXISTS / appropriate T-SQL-safe drop) for {{
tmp_relation }} to ensure cleanup, and either ensure model is defined in scope
or change the macro signature to accept model (e.g., add model as a parameter)
so elementary.make_temp_relation(model) is valid; if you prefer caller-managed
cleanup, document that behavior instead.
🧹 Nitpick comments (2)
macros/utils/cross_db_utils/dateadd.sql (1)

19-29: Comment block is misplaced relative to the new macro.

The docstring at lines 19-26 documents the Dremio override's bugs, but it now appears directly above sqlserver__edr_dateadd, which could mislead readers into thinking the comment applies to the SQL Server macro.

Consider adding a blank line and brief comment for the sqlserver macro, or relocating the Dremio docstring to sit immediately above dremio__edr_dateadd.

📝 Suggested reordering
-{#
-  Override dbt-dremio's dateadd macro which has two bugs:
-  1. Calls interval.replace() on the interval parameter, failing when interval is an integer
-  2. Wraps result in "select TIMESTAMPADD(...)" which creates a scalar subquery when
-     embedded in larger SQL expressions, causing $SCALAR_QUERY errors in Dremio
-
-  This override outputs just TIMESTAMPADD(...) as an expression (no "select" prefix).
-#}
+{# SQL Server / Fabric: use native DATEADD function #}
 {% macro sqlserver__edr_dateadd(datepart, interval, from_date_or_timestamp) %}
     dateadd({{ datepart }}, {{ interval }}, {{ from_date_or_timestamp }})
 {% endmacro %}

+{#
+  Override dbt-dremio's dateadd macro which has two bugs:
+  1. Calls interval.replace() on the interval parameter, failing when interval is an integer
+  2. Wraps result in "select TIMESTAMPADD(...)" which creates a scalar subquery when
+     embedded in larger SQL expressions, causing $SCALAR_QUERY errors in Dremio
+
+  This override outputs just TIMESTAMPADD(...) as an expression (no "select" prefix).
+#}
 {% macro dremio__edr_dateadd(datepart, interval, from_date_or_timestamp) %}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/cross_db_utils/dateadd.sql` around lines 19 - 29, The
documentation block describing the Dremio override is placed above the
sqlserver__edr_dateadd macro and may mislead readers; move the existing
docstring so it immediately precedes the dremio__edr_dateadd macro, and add a
short one-line comment or a blank line above sqlserver__edr_dateadd to indicate
it is a separate SQL Server-specific implementation. Ensure the docstring text
remains unchanged and that references to the two macros (dremio__edr_dateadd and
sqlserver__edr_dateadd) are clearly separated in the file.
.github/workflows/test-warehouse.yml (1)

116-122: Consider pinning the ODBC driver version for CI reproducibility.

The installation pulls the latest msodbcsql18 package, which could introduce unexpected behavior if Microsoft releases a breaking change. For more deterministic CI builds, consider pinning to a specific version.

♻️ Optional: Pin ODBC driver version
-          sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18 unixodbc-dev
+          sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18=18.4.1.1-1 unixodbc-dev

Note: Check the latest stable version and update accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-warehouse.yml around lines 116 - 122, The workflow
step installs msodbcsql18 without pinning, risking CI breakage; update the
Install ODBC Driver run block to install a specific msodbcsql18 version by
replacing the unpinned package name (msodbcsql18) with a pinned package spec
(e.g., msodbcsql18=<VERSION>) or by introducing a variable/parameter for the
ODBC driver version and using that in the apt install command; ensure
unixodbc-dev remains as-is and document/update the chosen <VERSION> to a known
stable release.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/test-warehouse.yml:
- Line 55: Replace the hardcoded fallback value 'Elementary123!' used for
MSSQL_SA_PASSWORD in the workflow with a more generic/weak local-only default
(e.g., 'Password1!') or remove the inline fallback and fail the job when the
secret is missing; update the MSSQL_SA_PASSWORD assignment so it either uses a
documented local-only default or requires the secret to be set, and add a short
comment clarifying that any default is for ephemeral/local CI only.

In `@integration_tests/profiles/generate_profiles.py`:
- Line 126: The current env.globals["env_var"] lambda silently returns "" when
no default is provided; replace it with a real function (e.g., env_var) that
distinguishes between an omitted default and an explicit default: if a default
is supplied, return os.environ.get(key, default); if no default was supplied and
the environment variable is missing, raise an error (e.g., KeyError or a clear
RuntimeError) so templates like env_var('MSSQL_SA_PASSWORD') fail fast in strict
mode; update the reference to env.globals["env_var"] to point to this new
function.

In `@macros/edr/materializations/test/test.sql`:
- Around line 191-216: The call to elementary.make_temp_relation in the
sqlserver__query_test_result_rows macro is incorrect and should use the renamed
helper elementary.edr_make_temp_relation; update the tmp_relation assignment to
call elementary.edr_make_temp_relation(model) instead of
elementary.make_temp_relation(model) so tmp_relation, the temp-table creation
(run_query "select * into " ~ tmp_relation ~ ...), and the subsequent DROP use
the correct helper name.

In `@macros/utils/cross_db_utils/generate_elementary_profile_args.sql`:
- Around line 270-279: The template currently treats any non-matching method
(including typos) as username/password; update the logic in
generate_elementary_profile_args.sql so that you handle three cases explicitly:
when method is none (append no auth params or a clear placeholder like
_parameter("method", "<NONE>")), when method == "cli" or "service-principal"
(keep the existing branches that append parameters using _parameter and
parameters.append), and a new else branch for unsupported methods that appends a
clear placeholder/instruction parameter (e.g., _parameter("method",
"<UNSUPPORTED: specify 'cli' or 'service-principal' or 'none'>")) rather than
silently inserting user/password; reference the symbols method,
parameters.append, and _parameter to locate and change the conditional branches.

In `@macros/utils/cross_db_utils/hour_of_week.sql`:
- Around line 83-84: Replace the locale-dependent datename(weekday, {{ date_expr
}}) with a locale-stable day calculation: compute weekday as ((DATEDIFF(day,
'1900-01-01', CAST({{ date_expr }} AS date)) % 7) + 1) and cast that to {{
elementary.edr_type_string() }} (keep the existing cast for the hour using
datepart(hour, {{ date_expr }})); ensure you CAST({{ date_expr }} AS date)
before DATEDIFF to avoid time-boundary shifts and update the expression that
currently uses datename(weekday, {{ date_expr }}) to this deterministic formula.

In `@macros/utils/data_types/get_normalized_data_type.sql`:
- Around line 161-167: The mapping in the get_normalized_data_type macro
currently normalizes distinct SQL Server types (e.g., "DATETIME",
"SMALLDATETIME", "MONEY", "SMALLMONEY") to other canonical types, which hides
real schema differences; remove the mapping entries for "DATETIME",
"SMALLDATETIME", "MONEY", and "SMALLMONEY" from the normalization dictionary so
they remain their native types during comparisons, but keep true SQL Server
synonym mappings such as "NVARCHAR", "NCHAR", "NTEXT", "ROWVERSION", "DOUBLE
PRECISION", and "REAL" in the same mapping structure used by the
get_normalized_data_type macro.

In `@macros/utils/data_types/try_cast_column_to_timestamp.sql`:
- Around line 32-39: The WHERE clause in try_cast_column_to_timestamp.sql can
return a TOP 1 row that fails try_cast() because try_cast returns NULL on
failure; update the query that sets "query" (which uses {{ timestamp_column }},
{{ table_relation }}, and {{ elementary.edr_type_timestamp() }}) to add "and
try_cast({{ timestamp_column }} as {{ elementary.edr_type_timestamp() }}) is not
null" so only successfully castable rows are considered before calling
elementary.result_value(); keep the rest of the logic (result handling with
result_value() and return(true)/return(false)) unchanged.
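The locale-stable weekday formula suggested for hour_of_week.sql above relies on 1900-01-01 being a Monday; it can be sanity-checked in Python (dates chosen here are arbitrary examples):

```python
from datetime import date, timedelta

EPOCH = date(1900, 1, 1)  # a Monday, which anchors the modulo arithmetic

def weekday_1_to_7(d: date) -> int:
    """Locale-independent weekday, Monday=1 .. Sunday=7, mirroring the
    suggested T-SQL ((DATEDIFF(day, '1900-01-01', d) % 7) + 1)."""
    return (d - EPOCH).days % 7 + 1

# Agrees with Python's own locale-free weekday numbering across a full week.
for offset in range(7):
    d = date(2026, 3, 2) + timedelta(days=offset)  # 2026-03-02 is a Monday
    assert weekday_1_to_7(d) == d.weekday() + 1

print(weekday_1_to_7(date(2026, 3, 8)))  # 7 (Sunday)
```

Unlike `datename(weekday, ...)`, this arithmetic cannot change under a different server language setting, which is why the review asks for it.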

---

Duplicate comments:
In `@macros/edr/materializations/test/failed_row_count.sql`:
- Around line 62-73: The macro sqlserver__get_failed_row_count_calc_query
creates a temp table (tmp_relation via elementary.make_temp_relation(model)) but
never drops it, which will leak tables; after the SELECT from {{ tmp_relation }}
add a DROP TABLE (or DROP TABLE IF EXISTS / appropriate T-SQL-safe drop) for {{
tmp_relation }} to ensure cleanup, and either ensure model is defined in scope
or change the macro signature to accept model (e.g., add model as a parameter)
so elementary.make_temp_relation(model) is valid; if you prefer caller-managed
cleanup, document that behavior instead.

In `@macros/edr/system/system_utils/buckets_cte.sql`:
- Around line 244-246: The WHERE is incorrectly referencing the SELECT-list
alias next_bucket (produced by {{ bucket_end_expr }}) which is not visible in
the same query block; update the predicate to use the actual expression instead
of the alias (i.e., replace next_bucket in the WHERE with the same {{
bucket_end_expr }} expression or wrap the SELECT as a subquery/CTE so the alias
becomes available), targeting the query that selects from timestamps and uses {{
max_bucket_end_expr }} to enforce the upper bound.

In `@macros/utils/cross_db_utils/incremental_strategy.sql`:
- Around line 23-25: The macro sqlserver__get_default_incremental_strategy
currently returns the hardcoded string "merge"; change it to return "none" so
SQL Server defers to the adapter's default incremental strategy (consistent with
the rule that only Athena uses get_append_only_incremental_strategy); update the
return in sqlserver__get_default_incremental_strategy to "none" and ensure no
other hardcoded "merge" defaults are introduced for non-Athena adapters.

---

Nitpick comments:
In @.github/workflows/test-warehouse.yml:
- Around line 116-122: The workflow step installs msodbcsql18 without pinning,
risking CI breakage; update the Install ODBC Driver run block to install a
specific msodbcsql18 version by replacing the unpinned package name
(msodbcsql18) with a pinned package spec (e.g., msodbcsql18=<VERSION>) or by
introducing a variable/parameter for the ODBC driver version and using that in
the apt install command; ensure unixodbc-dev remains as-is and document/update
the chosen <VERSION> to a known stable release.

In `@macros/utils/cross_db_utils/dateadd.sql`:
- Around line 19-29: The documentation block describing the Dremio override is
placed above the sqlserver__edr_dateadd macro and may mislead readers; move the
existing docstring so it immediately precedes the dremio__edr_dateadd macro, and
add a short one-line comment or a blank line above sqlserver__edr_dateadd to
indicate it is a separate SQL Server-specific implementation. Ensure the
docstring text remains unchanged and that references to the two macros
(dremio__edr_dateadd and sqlserver__edr_dateadd) are clearly separated in the
file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7d8ad01e-d735-4aa7-8a79-b1c603c9bb7c

📥 Commits

Reviewing files that changed from the base of the PR and between 2c08c87 and a0a7285.

📒 Files selected for processing (24)
  • .github/workflows/test-warehouse.yml
  • integration_tests/profiles/generate_profiles.py
  • integration_tests/profiles/profiles.yml.j2
  • macros/edr/materializations/test/failed_row_count.sql
  • macros/edr/materializations/test/test.sql
  • macros/edr/system/system_utils/buckets_cte.sql
  • macros/utils/cross_db_utils/current_timestamp.sql
  • macros/utils/cross_db_utils/date_trunc.sql
  • macros/utils/cross_db_utils/dateadd.sql
  • macros/utils/cross_db_utils/datediff.sql
  • macros/utils/cross_db_utils/day_of_week.sql
  • macros/utils/cross_db_utils/generate_elementary_profile_args.sql
  • macros/utils/cross_db_utils/hour_of_day.sql
  • macros/utils/cross_db_utils/hour_of_week.sql
  • macros/utils/cross_db_utils/incremental_strategy.sql
  • macros/utils/cross_db_utils/target_database.sql
  • macros/utils/cross_db_utils/time_trunc.sql
  • macros/utils/cross_db_utils/timeadd.sql
  • macros/utils/cross_db_utils/to_char.sql
  • macros/utils/data_types/data_type.sql
  • macros/utils/data_types/data_type_list.sql
  • macros/utils/data_types/get_normalized_data_type.sql
  • macros/utils/data_types/try_cast_column_to_timestamp.sql
  • macros/utils/table_operations/insert_rows.sql
🚧 Files skipped from review as they are similar to previous changes (7)
  • macros/utils/cross_db_utils/day_of_week.sql
  • integration_tests/profiles/profiles.yml.j2
  • macros/utils/cross_db_utils/target_database.sql
  • macros/utils/cross_db_utils/date_trunc.sql
  • macros/utils/data_types/data_type.sql
  • macros/utils/cross_db_utils/to_char.sql
  • macros/utils/data_types/data_type_list.sql

keep_trailing_newline=True,
)
env.filters["toyaml"] = _yaml_inline
env.globals["env_var"] = lambda key, default="": os.environ.get(key, default)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the generate_profiles.py file around line 126
wc -l integration_tests/profiles/generate_profiles.py

Repository: elementary-data/dbt-data-reliability

Length of output: 131


🏁 Script executed:

# Read the file with context around line 126
sed -n '110,140p' integration_tests/profiles/generate_profiles.py | cat -n

Repository: elementary-data/dbt-data-reliability

Length of output: 1193


🏁 Script executed:

# Look for usage of env_var in the Jinja templates
find integration_tests -name "*.j2" -o -name "*.yml.j2" | head -20

Repository: elementary-data/dbt-data-reliability

Length of output: 122


🏁 Script executed:

# Check for env_var usage patterns in templates
rg "env_var\(" integration_tests/profiles/ -C 2

Repository: elementary-data/dbt-data-reliability

Length of output: 438


🏁 Script executed:

# Get more context around line 117 in the file
sed -n '100,134p' integration_tests/profiles/generate_profiles.py | cat -n

Repository: elementary-data/dbt-data-reliability

Length of output: 1611


🌐 Web query:

dbt env_var Jinja function behavior when variable not set no default

💡 Result:

In dbt, env_var() requires the environment variable to be set unless you pass a default:

  • {{ env_var('MY_VAR') }} (no default) → if MY_VAR is not set, dbt errors with: Env var required but not provided: 'MY_VAR' (this is a hard failure during parsing/compilation). [1][2]
  • {{ env_var('MY_VAR', 'some_default') }} → if MY_VAR is not set, dbt returns the provided default instead of erroring. [3]

So “when variable not set, no default” behavior = fail fast with an exception/error. [1][2]

Sources:
[1] https://airflow.apache.org/docs/apache-airflow-providers-openlineage/2.6.1/_modules/tests/system/openlineage/operator.html
[2] https://stackoverflow.com/questions/76653253/env-var-function-in-dbt-project-is-not-behaving-as-expected
[3] https://docs.open-metadata.org/latest/connectors/ingestion/workflows/dbt/auto-ingest-dbt-core


🏁 Script executed:

# Check if there are more usages of env_var in templates or documentation
rg "env_var" integration_tests/ -t yml -t yaml -C 3

Repository: elementary-data/dbt-data-reliability

Length of output: 107


🏁 Script executed:

# Check for all env_var usages in .j2 files more carefully
rg "env_var" integration_tests/profiles/

Repository: elementary-data/dbt-data-reliability

Length of output: 305


🏁 Script executed:

# Check if MSSQL_SA_PASSWORD or similar env vars are expected to be set
grep -r "MSSQL_SA_PASSWORD\|env_var" integration_tests/ --include="*.py" --include="*.md"

Repository: elementary-data/dbt-data-reliability

Length of output: 209


🏁 Script executed:

# See the full context of the function to understand when secrets are vs aren't loaded
sed -n '80,134p' integration_tests/profiles/generate_profiles.py | cat -n

Repository: elementary-data/dbt-data-reliability

Length of output: 2568


🏁 Script executed:

# Check all env_var calls in the entire codebase to see if any have explicit defaults
rg "env_var\(" --type py --type yaml -C 2

Repository: elementary-data/dbt-data-reliability

Length of output: 379


🏁 Script executed:

# Also check .j2 files specifically
fd "\.j2$" integration_tests/ | xargs cat | grep -n "env_var"

Repository: elementary-data/dbt-data-reliability

Length of output: 135


Preserve required env_var semantics instead of silently returning empty string.

At line 126, missing env vars silently resolve to "" even when no default is provided. The template at line 73 calls env_var('MSSQL_SA_PASSWORD') without an explicit default, which should fail in strict mode (when secrets are loaded) per dbt's documented behavior. The current implementation bypasses StrictUndefined's fail-fast guarantee.

🔧 Suggested fix
+_MISSING = object()
+
+def _env_var(key: str, default: Any = _MISSING) -> str:
+    value = os.environ.get(key)
+    if value is None:
+        if default is _MISSING:
+            raise click.ClickException(f"Missing required environment variable: {key}")
+        return default
+    return value
+
@@
-    env.globals["env_var"] = lambda key, default="": os.environ.get(key, default)
+    env.globals["env_var"] = _env_var
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@integration_tests/profiles/generate_profiles.py` at line 126, The current
env.globals["env_var"] lambda silently returns "" when no default is provided;
replace it with a real function (e.g., env_var) that distinguishes between an
omitted default and an explicit default: if a default is supplied, return
os.environ.get(key, default); if no default was supplied and the environment
variable is missing, raise an error (e.g., KeyError or a clear RuntimeError) so
templates like env_var('MSSQL_SA_PASSWORD') fail fast in strict mode; update the
reference to env.globals["env_var"] to point to this new function.
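The sentinel pattern from the suggested fix can be shown as a self-contained sketch; `RuntimeError` stands in for the `click.ClickException` used in the proposal, and the variable names are illustrative:

```python
import os

_MISSING = object()  # sentinel distinguishing "no default given" from an explicit default


def env_var(key, default=_MISSING):
    # Mimic dbt's env_var(): fail fast when the variable is unset and no default is supplied.
    value = os.environ.get(key)
    if value is None:
        if default is _MISSING:
            raise RuntimeError(f"Env var required but not provided: '{key}'")
        return default
    return value


os.environ["DEMO_VAR"] = "hello"
print(env_var("DEMO_VAR"))                   # hello
print(env_var("ELE_DEMO_UNSET", "fallback"))  # fallback
try:
    env_var("ELE_DEMO_UNSET")
except RuntimeError as exc:
    print(f"failed fast: {exc}")
```

A plain `default=""` parameter cannot make this distinction, which is exactly why the lambda at line 126 loses dbt's fail-fast semantics.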

Comment on lines +270 to +279
{% if method == "cli" %} {% do parameters.append(_parameter("method", "cli")) %}
{% elif method == "service-principal" %}
{% do parameters.append(_parameter("method", "service-principal")) %}
{% do parameters.append(_parameter("tenant_id", "<TENANT_ID>")) %}
{% do parameters.append(_parameter("client_id", "<CLIENT_ID>")) %}
{% do parameters.append(_parameter("client_secret", "<CLIENT_SECRET>")) %}
{% else %}
{% do parameters.append(_parameter("user", target.user)) %}
{% do parameters.append(_parameter("password", "<PASSWORD>")) %}
{% endif %}
@coderabbitai coderabbitai bot Mar 4, 2026

⚠️ Potential issue | 🟡 Minor

Avoid silently treating unknown auth methods as username/password.

Right now, any typo or unsupported method value falls into the else branch (Line 276) and generates user/password, which can hide config mistakes and produce confusing failures. Please handle explicit default (method is none) separately and surface unsupported methods as placeholders/instructions.

Suggested fix
-    {% if method == "cli" %} {% do parameters.append(_parameter("method", "cli")) %}
+    {% if method == "cli" %} {% do parameters.append(_parameter("method", "cli")) %}
     {% elif method == "service-principal" %}
         {% do parameters.append(_parameter("method", "service-principal")) %}
         {% do parameters.append(_parameter("tenant_id", "<TENANT_ID>")) %}
         {% do parameters.append(_parameter("client_id", "<CLIENT_ID>")) %}
         {% do parameters.append(_parameter("client_secret", "<CLIENT_SECRET>")) %}
-    {% else %}
+    {% elif method is none %}
         {% do parameters.append(_parameter("user", target.user)) %}
         {% do parameters.append(_parameter("password", "<PASSWORD>")) %}
+    {% else %}
+        {% do parameters.append(
+            _parameter(
+                "method",
+                "<AUTH_METHOD>",
+                "Supported methods: cli, service-principal, or omit for username/password authentication.",
+            )
+        ) %}
     {% endif %}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% if method == "cli" %} {% do parameters.append(_parameter("method", "cli")) %}
{% elif method == "service-principal" %}
{% do parameters.append(_parameter("method", "service-principal")) %}
{% do parameters.append(_parameter("tenant_id", "<TENANT_ID>")) %}
{% do parameters.append(_parameter("client_id", "<CLIENT_ID>")) %}
{% do parameters.append(_parameter("client_secret", "<CLIENT_SECRET>")) %}
{% else %}
{% do parameters.append(_parameter("user", target.user)) %}
{% do parameters.append(_parameter("password", "<PASSWORD>")) %}
{% endif %}
{% if method == "cli" %} {% do parameters.append(_parameter("method", "cli")) %}
{% elif method == "service-principal" %}
{% do parameters.append(_parameter("method", "service-principal")) %}
{% do parameters.append(_parameter("tenant_id", "<TENANT_ID>")) %}
{% do parameters.append(_parameter("client_id", "<CLIENT_ID>")) %}
{% do parameters.append(_parameter("client_secret", "<CLIENT_SECRET>")) %}
{% elif method is none %}
{% do parameters.append(_parameter("user", target.user)) %}
{% do parameters.append(_parameter("password", "<PASSWORD>")) %}
{% else %}
{% do parameters.append(
_parameter(
"method",
"<AUTH_METHOD>",
"Supported methods: cli, service-principal, or omit for username/password authentication.",
)
) %}
{% endif %}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/cross_db_utils/generate_elementary_profile_args.sql` around
lines 270 - 279, The template currently treats any non-matching method
(including typos) as username/password; update the logic in
generate_elementary_profile_args.sql so that you handle three cases explicitly:
when method is none (append no auth params or a clear placeholder like
_parameter("method", "<NONE>")), when method == "cli" or "service-principal"
(keep the existing branches that append parameters using _parameter and
parameters.append), and a new else branch for unsupported methods that appends a
clear placeholder/instruction parameter (e.g., _parameter("method",
"<UNSUPPORTED: specify 'cli' or 'service-principal' or 'none'>")) rather than
silently inserting user/password; reference the symbols method,
parameters.append, and _parameter to locate and change the conditional branches.

Collaborator

Devin wdyt?


@haritamar — the suggestion is valid from a correctness standpoint: any typo or unsupported method value silently generates a user/password block, which could confuse users who pass an unexpected string and then wonder why their intended auth method isn't reflected in the generated profile.

That said, a couple of points worth considering before deciding:

  1. Consistency with fabric__: The existing fabric__generate_elementary_profile_args in the same file uses the exact same else → user/password pattern. If you fix sqlserver__, you'd want to fix fabric__ too to stay consistent — or leave both as-is.

  2. Precedent in the file: trino__ and bigquery__ do use an else branch that emits a placeholder/instruction (e.g., "<AUTH_METHOD>" with a comment pointing to docs). So the pattern I suggested is already established in the codebase.

  3. Risk level: The impact is low since this only affects the generated profile file that users then edit manually — it won't cause a runtime failure. But it could be a source of subtle confusion.

My recommendation: If the team is comfortable with it, apply the fix to both sqlserver__ and fabric__ together for consistency. If you'd prefer to keep things simple for now, it's reasonable to defer — the risk is minor.


🧠 Learnings used
Learnt from: haritamar
Repo: elementary-data/dbt-data-reliability PR: 827
File: macros/utils/cross_db_utils/generate_elementary_profile_args.sql:30-31
Timestamp: 2025-07-27T11:33:35.918Z
Learning: In the Elementary dbt package, the Snowflake profile generation macro intentionally always includes private_key and private_key_passphrase parameters regardless of the authentication method. This is a strategic decision to deprecate password authentication and guide all new users toward keypair authentication, even if it means showing keypair parameters when they might not currently be needed.

Contributor Author

I'd recommend deferring this — the risk is low (it only affects generated profile files that users edit manually). If you do want to fix it, the else → placeholder pattern is already established by trino__/bigquery__, and we should apply it to both fabric__ and sqlserver__ together for consistency. Happy to do it if you want, but it's not a bug.


…dev, ORDER BY, GROUP BY, buckets_cte)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
macros/edr/data_monitoring/monitors/column_numeric_monitors.sql (1)

69-79: ⚠️ Potential issue | 🟠 Major

Add sqlserver__variance macro — SQL Server uses VAR, not variance.

The variance macro lacks a SQL Server-specific implementation. T-SQL does not have a variance() function; it uses VAR() for sample variance or VARP() for population variance. This will cause a runtime error when the variance monitor runs against a SQL Server or Microsoft Fabric target.

Proposed fix
 {% macro clickhouse__variance(column_name) -%}
     varSamp(cast({{ column_name }} as Nullable({{ elementary.edr_type_float() }})))
 {%- endmacro %}
+
+{# T-SQL uses VAR instead of variance #}
+{% macro sqlserver__variance(column_name) -%}
+    var(cast({{ column_name }} as {{ elementary.edr_type_float() }}))
+{%- endmacro %}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/data_monitoring/monitors/column_numeric_monitors.sql` around lines
69 - 79, The variance macro needs a SQL Server-specific implementation: add a
macro named sqlserver__variance that returns VAR(cast({{ column_name }} as {{
elementary.edr_type_float() }})) (use VAR for sample variance in T-SQL),
mirroring how default__variance and clickhouse__variance cast the column to the
float edr type; ensure the new macro follows the same macro signature {% macro
sqlserver__variance(column_name) -%} so adapter.dispatch("variance",
"sqlserver") will resolve correctly.
🧹 Nitpick comments (1)
macros/edr/tests/test_execution_sla.sql (1)

292-300: The when 1 = 1 pattern works but could be clearer.

The conditional logic is correct—when the deadline hasn't passed, the when 1 = 1 clause acts as a catch-all to return false before reaching else. However, this pattern may confuse future readers.

Consider restructuring for clarity:

💡 Alternative approach using Jinja conditional
                 case
                     when sla_status = 'MET_SLA'
                     then {{ elementary.edr_boolean_literal(false) }}
-                    {# If deadline hasn't passed, don't fail yet #}
-                    {% if not deadline_passed %}
-                        when 1 = 1 then {{ elementary.edr_boolean_literal(false) }}
-                    {% endif %}
-                    else {{ elementary.edr_boolean_literal(true) }}
+                    else {{ elementary.edr_boolean_literal(deadline_passed) }}
                 end as is_failure,

This directly expresses the intent: fail only if the deadline has passed and the SLA was not met.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/tests/test_execution_sla.sql` around lines 292 - 300, The current
"when 1 = 1" catch-all is confusing; remove that pattern and instead use a Jinja
conditional to inject a clear when-clause only when deadline_passed is false so
intent is explicit. Update the CASE that produces is_failure (referencing
sla_status, is_failure, and elementary.edr_boolean_literal) so that you keep
"when sla_status = 'MET_SLA' then false", and then conditionally add a
descriptive when-clause like "when deadline has not passed then false" via {% if
not deadline_passed %} ... {% endif %} rather than using "when 1 = 1".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@macros/edr/system/system_utils/empty_table.sql`:
- Around line 242-255: Add a fabric-specific override macro similar to
sqlserver__dummy_values to avoid falling back to default__dummy_values (which
uses the invalid T-SQL literal "True"); create a macro named
fabric__dummy_values() that returns the same dummy_values mapping used in
sqlserver__dummy_values (notably setting "boolean": 1 and the same
string/int/float/timestamp values) so dbt's dummy_values() dispatch will pick
the Fabric-safe implementation instead of default__dummy_values.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3452a2df-29ac-4ce9-87e1-dafa143b3dfa

📥 Commits

Reviewing files that changed from the base of the PR and between a0a7285 and 319dee6.

📒 Files selected for processing (15)
  • macros/edr/data_monitoring/monitors/column_numeric_monitors.sql
  • macros/edr/system/system_utils/buckets_cte.sql
  • macros/edr/system/system_utils/empty_table.sql
  • macros/edr/system/system_utils/get_config_var.sql
  • macros/edr/tests/test_execution_sla.sql
  • models/edr/alerts/alerts_anomaly_detection.sql
  • models/edr/alerts/alerts_dbt_models.sql
  • models/edr/alerts/alerts_dbt_source_freshness.sql
  • models/edr/alerts/alerts_dbt_tests.sql
  • models/edr/alerts/alerts_schema_changes.sql
  • models/edr/data_monitoring/anomaly_detection/anomaly_threshold_sensitivity.sql
  • models/edr/data_monitoring/anomaly_detection/metrics_anomaly_score.sql
  • models/edr/dbt_artifacts/dbt_artifacts_hashes.sql
  • models/edr/run_results/model_run_results.sql
  • models/edr/system/monitors_runs.sql
💤 Files with no reviewable changes (1)
  • models/edr/dbt_artifacts/dbt_artifacts_hashes.sql

…cursion

- Replace dbt_utils.group_by() with explicit column names in metrics_anomaly_score (T-SQL treats positional refs as literal constants)
- Fix make_temp_relation -> edr_make_temp_relation in test.sql and failed_row_count.sql
- Add temp table cleanup in failed_row_count.sql
- Add sqlserver__variance macro (T-SQL uses VAR not variance)
- Fix maxrecursion 10000 -> 0 for unlimited recursion in buckets_cte

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
macros/edr/system/system_utils/buckets_cte.sql (1)

234-256: ⚠️ Potential issue | 🔴 Critical

Recursive CTE remains incompatible with Fabric Warehouse runtime.

Line 241 introduces recursion; Fabric Warehouse currently documents recursive queries as unsupported. If this macro is part of Fabric support, this path is a runtime blocker.

What is the current Microsoft Fabric Warehouse support status for recursive CTEs/recursive queries in T-SQL? Please cite the latest T-SQL surface area docs.

Source referenced:

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/system/system_utils/buckets_cte.sql` around lines 234 - 256, The
macro sqlserver__complete_buckets_cte uses a recursive CTE (timestamps) and
option (maxrecursion 0), which is incompatible with Fabric Warehouse; replace
the recursion with a non-recursive generator (e.g., use a numbers/tally table, a
system view that provides sequential integers, or a calendar table) to produce
edr_bucket_start/edr_bucket_end rows between min_bucket_start_expr and
max_bucket_end_expr stepping by time_bucket, and remove the option (maxrecursion
0); update the macro sqlserver__complete_buckets_cte to build the same sequence
using that iterative approach so Fabric-compatible T-SQL is emitted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@macros/edr/data_monitoring/monitors/column_numeric_monitors.sql`:
- Around line 64-67: The fabric adapter needs explicit implementations for
standard deviation and variance so adapter.dispatch finds them; add macro
definitions named fabric__standard_deviation and fabric__variance that either
call the existing sqlserver__standard_deviation and sqlserver__variance wrappers
or directly return T-SQL expressions using stdev(...) and var(cast(... as {{
elementary.edr_type_float() }})); ensure the new macros accept the same
parameter (column_name) and use the same casting helper
(elementary.edr_type_float) so Fabric resolves to the T-SQL behavior instead of
falling back to the default PostgreSQL-style stddev_pop/variance variants.

In `@macros/edr/materializations/test/failed_row_count.sql`:
- Around line 62-73: The temp table is dropped during macro rendering inside
sqlserver__get_failed_row_count_calc_query (the inline do run_query("drop table
if exists " ~ tmp_relation) runs before the returned SELECT is executed), so
move the cleanup to after the query result is fetched: have
sqlserver__get_failed_row_count_calc_query only create the temp table and return
the SELECT (without dropping), then perform run_query("drop table if exists " ~
tmp_relation) from the caller (get_failed_row_count) after calling
elementary.result_value(failed_row_count_query); alternatively change the macro
to return both the query string and a cleanup callable so the caller can execute
cleanup post-result.
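
The caller-side cleanup the bot describes could be sketched roughly as follows (a sketch only; macro and variable names are taken from the review comment, and the surrounding macro body is elided):

```sql
{# Sketch: fetch the result first, then drop the temp table, so cleanup
   runs after the query executes rather than during macro rendering. #}
{% set failed_row_count = elementary.result_value(failed_row_count_query) %}
{% do run_query("drop table if exists " ~ tmp_relation) %}
{% do return(failed_row_count) %}
```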


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 144d403f-e137-44d4-b5ba-429e01db83fb

📥 Commits

Reviewing files that changed from the base of the PR and between 319dee6 and ba6bece.

📒 Files selected for processing (5)
  • macros/edr/data_monitoring/monitors/column_numeric_monitors.sql
  • macros/edr/materializations/test/failed_row_count.sql
  • macros/edr/materializations/test/test.sql
  • macros/edr/system/system_utils/buckets_cte.sql
  • models/edr/data_monitoring/anomaly_detection/metrics_anomaly_score.sql

devin-ai-integration bot and others added 2 commits March 4, 2026 17:05
- Renamed all sqlserver__ macros to fabric__ (primary for dbt-fabric production use)
- Added sqlserver__ wrapper macros that delegate to fabric__ (for dbt-sqlserver CI testing)
- Fixed temp table drop timing bug in failed_row_count.sql
- Added fabric__ variants for all T-SQL utility macros across 24 files

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…patch target

dbt-sqlserver declares dependencies=["fabric"] in its plugin config,
creating dispatch chain: sqlserver__ -> fabric__ -> default__.
This means fabric__ macros are automatically found by dbt-sqlserver
without explicit sqlserver__ wrappers (same pattern as redshift/postgres).

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (5)
macros/utils/data_types/try_cast_column_to_timestamp.sql (1)

30-41: ⚠️ Potential issue | 🟠 Major

Filter WHERE clause to ensure only successfully castable rows are evaluated.

The try_cast() function returns NULL on cast failure. Without filtering for successful casts, TOP 1 may non-deterministically select an un-castable row, causing result_value() to return NULL and incorrectly report false even when valid timestamp rows exist.

Add and try_cast({{ timestamp_column }} as {{ elementary.edr_type_timestamp() }}) is not null to the WHERE clause.

Proposed fix
 {% macro fabric__try_cast_column_to_timestamp(table_relation, timestamp_column) %}
     {%- set query %}
-        select top 1 try_cast({{ timestamp_column }} as {{ elementary.edr_type_timestamp() }}) as timestamp_column
+        select top 1 1 as has_valid_timestamp
         from {{ table_relation }}
         where {{ timestamp_column }} is not null
+          and try_cast({{ timestamp_column }} as {{ elementary.edr_type_timestamp() }}) is not null
     {%- endset %}

     {%- set result = elementary.result_value(query) %}
     {%- if result is not none %} {{ return(true) }} {%- endif %}
     {{ return(false) }}

 {% endmacro %}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/data_types/try_cast_column_to_timestamp.sql` around lines 30 -
41, The WHERE clause in macro fabric__try_cast_column_to_timestamp currently
only checks that {{ timestamp_column }} is not null, so TOP 1 can pick a row
where try_cast(...) is null and cause elementary.result_value(query) to return
null; update the WHERE to also require try_cast({{ timestamp_column }} as {{
elementary.edr_type_timestamp() }}) is not null so only successfully castable
rows are considered, keeping the rest of the query (select top 1 ... as
timestamp_column from {{ table_relation }}) and the use of
elementary.result_value unchanged.
macros/edr/system/system_utils/buckets_cte.sql (1)

241-253: ⚠️ Potential issue | 🔴 Critical

Recursive CTE is still incompatible with Fabric runtime.

Line 241 introduces a recursive CTE, but Fabric Warehouse currently lists recursive queries as unsupported, so this macro can still fail on its target adapter despite fixing the alias and recursion hint. Please switch this macro to a non-recursive bucket generator (e.g., tally/numbers via VALUES cross joins) for fabric__.
Source: https://learn.microsoft.com/en-us/fabric/data-warehouse/tsql-surface-area

In the latest Microsoft Fabric Data Warehouse T-SQL surface area documentation, are recursive queries/recursive CTEs supported in Warehouse?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/system/system_utils/buckets_cte.sql` around lines 241 - 253, The
recursive CTE named "timestamps" (using {{ min_bucket_start_expr }}, {{
bucket_end_expr }}, {{ max_bucket_end_expr }}) is incompatible with the Fabric
runtime; replace the recursive generator with a non-recursive numbers/tally
approach for the fabric__ adapter: detect fabric__ and produce a bucket sequence
by cross joining a VALUES-derived numbers table (or repeated VALUES blocks)
multiplied to cover the max range, then compute edr_bucket_start and
edr_bucket_end from those generated offsets instead of the recursive CTE so the
macro avoids recursion and preserves the existing {{ ... }} expressions.
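
A non-recursive generator along the lines the bot suggests might look like this (a sketch only; the anchor date, hourly step, and range literals are illustrative placeholders, not the macro's real parameters):

```sql
-- Sketch: build a numbers table from cross-joined VALUES lists instead of
-- a recursive CTE, then offset each integer from the range start.
with digits as (
    select d from (values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) as t(d)
),
numbers as (
    -- 10^4 = 10,000 sequential integers, enough to cover typical bucket ranges
    select a.d + b.d * 10 + c.d * 100 + e.d * 1000 as n
    from digits a cross join digits b cross join digits c cross join digits e
),
buckets as (
    select
        dateadd(hour, n, cast('2026-01-01' as datetime2(6))) as edr_bucket_start,
        dateadd(hour, n + 1, cast('2026-01-01' as datetime2(6))) as edr_bucket_end
    from numbers
    where dateadd(hour, n, cast('2026-01-01' as datetime2(6))) < cast('2026-02-01' as datetime2(6))
)
select edr_bucket_start, edr_bucket_end from buckets
```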
macros/utils/cross_db_utils/incremental_strategy.sql (1)

23-25: ⚠️ Potential issue | 🟠 Major

Line 24 hardcodes Fabric to merge, but the shared strategy contract should remain adapter-default (none) for non-Athena paths.

Please return none here unless Fabric has a documented exception with tests proving why it must diverge.

Suggested change
 {%- macro fabric__get_default_incremental_strategy() %}
-    {% do return("merge") %}
+    {% do return(none) %}
 {% endmacro %}

Based on learnings: “implement that incremental_strategy should return none (to use the adapter default) for all adapters except Athena, where get_append_only_incremental_strategy() is used.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/cross_db_utils/incremental_strategy.sql` around lines 23 - 25,
The macro fabric__get_default_incremental_strategy currently hardcodes "merge";
change it to return the adapter-default "none" for all adapters except Athena:
inside fabric__get_default_incremental_strategy() check the current adapter
(e.g., inspect target.adapter or target.type) and if it's Athena call/return
get_append_only_incremental_strategy() (or the Athena-specific strategy helper),
otherwise return "none" so non-Athena paths use the adapter default.
macros/utils/cross_db_utils/to_char.sql (1)

27-32: ⚠️ Potential issue | 🟠 Major

Line 29 injects format directly into CONVERT, which can break Fabric SQL generation.

fabric__edr_to_char currently renders CONVERT(varchar, <expr>, {{ format }}) with raw SQL at Line 29. If callers pass cross-db format patterns, this can fail at runtime or diverge from the macro contract used by other adapters. Please normalize this path (e.g., explicitly support one format contract and enforce it in the macro).

In SQL Server / Microsoft Fabric Warehouse T-SQL, what are valid values for the third argument of CONVERT(varchar, datetime_expr, style)? Does passing a pattern string like YYYY-MM-DD work, or must it be a style code?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/cross_db_utils/to_char.sql` around lines 27 - 32, The macro
fabric__edr_to_char currently injects the caller-supplied format directly into
CONVERT which can break Fabric SQL; update fabric__edr_to_char to enforce a
single format contract by accepting only integer T-SQL style codes (or mapping a
small set of common pattern strings to their style codes) and render the third
CONVERT argument as an integer literal; specifically, modify fabric__edr_to_char
to: validate if the provided format is an integer literal and output it
unquoted, else if it's a known pattern (e.g., 'YYYY-MM-DD', 'YYYY-MM-DD
HH24:MI:SS') map it to the corresponding style code and use that code, otherwise
raise a clear jinja/dbt error so callers cannot inject arbitrary pattern strings
into CONVERT.
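
One way to enforce that contract, as a sketch (the macro name comes from the review; the pattern-to-style mapping uses the standard T-SQL style codes 23 and 120, and the error branch is an assumption about the desired behavior):

```sql
{% macro fabric__edr_to_char(expr, format) %}
    {# Map the supported cross-db patterns to T-SQL CONVERT style codes #}
    {% set style_codes = {"YYYY-MM-DD": 23, "YYYY-MM-DD HH24:MI:SS": 120} %}
    {% if format is number %}
        convert(varchar, {{ expr }}, {{ format }})
    {% elif format in style_codes %}
        convert(varchar, {{ expr }}, {{ style_codes[format] }})
    {% else %}
        {{ exceptions.raise_compiler_error("Unsupported to_char format: " ~ format) }}
    {% endif %}
{% endmacro %}
```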
macros/utils/cross_db_utils/hour_of_week.sql (1)

81-85: ⚠️ Potential issue | 🟠 Major

Line 83 uses locale-dependent weekday names, making hour-of-week keys non-deterministic.

datename(weekday, ...) can return different strings across sessions/environments. For stable grouping keys, switch to a deterministic weekday mapping (independent of session language) before concatenating with the hour.

Suggested deterministic replacement
 {% macro fabric__edr_hour_of_week_expression(date_expr) %}
     concat(
-        cast(datename(weekday, {{ date_expr }}) as {{ elementary.edr_type_string() }}),
+        cast(
+            case (datediff(day, cast('19000101' as date), cast({{ date_expr }} as date)) % 7)
+                when 0 then 'Monday'
+                when 1 then 'Tuesday'
+                when 2 then 'Wednesday'
+                when 3 then 'Thursday'
+                when 4 then 'Friday'
+                when 5 then 'Saturday'
+                when 6 then 'Sunday'
+            end as {{ elementary.edr_type_string() }}
+        ),
         cast(datepart(hour, {{ date_expr }}) as {{ elementary.edr_type_string() }})
     )
 {% endmacro %}
In SQL Server / Microsoft Fabric Warehouse, is DATENAME(weekday, date_expr) dependent on session language (SET LANGUAGE / @@LANGUAGE)? If yes, what deterministic alternatives are recommended for stable weekday keys?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/cross_db_utils/hour_of_week.sql` around lines 81 - 85, The macro
fabric__edr_hour_of_week_expression uses DATENAME(weekday, {{ date_expr }})
which returns locale-dependent names; replace that with a deterministic numeric
weekday computed from a fixed epoch (e.g. DATEDIFF(day, 0, {{ date_expr }}) % 7)
so the key is stable across sessions, e.g. cast((DATEDIFF(day, 0, {{ date_expr
}}) % 7) as {{ elementary.edr_type_string() }}) concatenated with the existing
cast(datepart(hour, {{ date_expr }}) as {{ elementary.edr_type_string() }}).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3060fa55-2f7a-4ceb-823a-810bfc841aa9

📥 Commits

Reviewing files that changed from the base of the PR and between ba6bece and d489172.

📒 Files selected for processing (24)
  • macros/edr/data_monitoring/monitors/column_numeric_monitors.sql
  • macros/edr/materializations/test/failed_row_count.sql
  • macros/edr/materializations/test/test.sql
  • macros/edr/system/system_utils/buckets_cte.sql
  • macros/edr/system/system_utils/empty_table.sql
  • macros/edr/system/system_utils/get_config_var.sql
  • macros/utils/cross_db_utils/current_timestamp.sql
  • macros/utils/cross_db_utils/date_trunc.sql
  • macros/utils/cross_db_utils/dateadd.sql
  • macros/utils/cross_db_utils/datediff.sql
  • macros/utils/cross_db_utils/day_of_week.sql
  • macros/utils/cross_db_utils/generate_elementary_profile_args.sql
  • macros/utils/cross_db_utils/hour_of_day.sql
  • macros/utils/cross_db_utils/hour_of_week.sql
  • macros/utils/cross_db_utils/incremental_strategy.sql
  • macros/utils/cross_db_utils/target_database.sql
  • macros/utils/cross_db_utils/time_trunc.sql
  • macros/utils/cross_db_utils/timeadd.sql
  • macros/utils/cross_db_utils/to_char.sql
  • macros/utils/data_types/data_type.sql
  • macros/utils/data_types/data_type_list.sql
  • macros/utils/data_types/get_normalized_data_type.sql
  • macros/utils/data_types/try_cast_column_to_timestamp.sql
  • macros/utils/table_operations/insert_rows.sql
🚧 Files skipped from review as they are similar to previous changes (7)
  • macros/edr/data_monitoring/monitors/column_numeric_monitors.sql
  • macros/edr/system/system_utils/get_config_var.sql
  • macros/edr/system/system_utils/empty_table.sql
  • macros/utils/cross_db_utils/dateadd.sql
  • macros/utils/data_types/get_normalized_data_type.sql
  • macros/utils/cross_db_utils/generate_elementary_profile_args.sql
  • macros/utils/cross_db_utils/date_trunc.sql

devin-ai-integration bot and others added 6 commits March 4, 2026 17:24
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
- Add fabric__edr_type_long_string returning varchar(max) to avoid truncation
- Add fabric__event_freshness_metric_query with explicit GROUP BY columns
- Fix LIMIT→TOP in dimension_anomalies test query for T-SQL
- Fix boolean literals in column_boolean_monitors.sql
- Fix INSERT INTO with CTE pattern in insert_as_select.sql
- Fix CTE-in-subquery and CTE-in-IN-clause in get_anomaly_query.sql
- All 18 anomalyless integration tests pass locally against SQL Server

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…erflow in CI

SQL Server limits identifiers to 128 chars. When dbt-sqlserver creates
relations, the schema name (~60 chars in CI) can be prefixed onto the
table identifier. Reducing from 80 to 60 ensures the combined name
stays within the 128-char limit even with long CI schema names.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…MIT→TOP for Fabric/T-SQL

- Add fabric__create_temp_table using view-based approach to avoid CTE-in-subquery errors
- Force Fabric/sqlserver targets to use temp table path in test materialization
- Add fabric__edr_make_temp_relation to handle dict model nodes
- Add 'string' to fabric data_type_list for proper normalization
- Add STRING→VARCHAR mapping in fabric__get_normalized_data_type
- Convert hardcoded LIMIT 1 to TOP 1 in integration test SQL templates
- Apply pre-commit formatting (black, sqlfmt)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…n INSERT VALUES

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
devin-ai-integration bot and others added 6 commits March 4, 2026 22:53
…edr_boolean_literal

- get_columns_snapshot_query: replace inline target.type checks with edr_boolean_literal()
- get_columns_changes_query: extract to dispatched macro (fabric__/default__get_column_changes_from_baseline_cur), use edr_boolean_literal, restore dbt_utils.group_by(9) to fix test_schema_changes failures
- column_monitoring_query: extract GROUP BY to dispatched column_monitoring_group_by macro
- insert_as_select: convert to dispatched macro (fabric__/default__insert_as_select)
- get_anomaly_query: convert both macros to dispatched (fabric__/default__ variants)
- create_temp_table: simplify DROP VIEW IF EXISTS syntax
- Rename CI target from 'fabric' to 'sqlserver' (honest about what we test)
- Add sqlserver__generate_elementary_profile_args (separate from fabric__)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
… compatibility

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…nges (fixes Redshift)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…ting

- Cast datetime2 to explicit precision (6) to avoid unsupported datetime2(7)
- Cast sysutcdatetime()/getdate() to datetime2(6) in current_timestamp macros
- Rewrite fabric__insert_as_select to use temp view pattern (Fabric rejects INSERT...EXEC)
- Replace CONCAT() with + operator in full_names macros to avoid nvarchar return type
- Cast PARSENAME results to varchar in fabric__full_name_split
- Cast datename() result to varchar in fabric__edr_day_of_week_expression
- Add fabric__generate_surrogate_key using + operator instead of dbt.concat()
- Add fabric__schema_change_description_column with varchar cast
- Add fabric__edr_list_schemas and fabric__edr_schema_exists using sys.schemas

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add fabric target to profiles.yml.j2 with ServicePrincipal auth
- Add fabric to cloud targets matrix in test-all-warehouses.yml
- Install ODBC driver for fabric target in test-warehouse.yml
- Include fabric in post-test schema cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
devin-ai-integration bot and others added 7 commits March 7, 2026 14:03
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…ric__ counterparts

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…s instead of SQL text

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…turning non-string values

Fixes 'unsupported operand type(s) for -: str and int' in fabric__edr_make_temp_relation
when called via sqlserver__edr_make_temp_relation. The {{ }} pattern stringifies return
values (e.g. integers, Relations), breaking downstream arithmetic and method calls.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
dbt-sqlserver declares dependencies=['fabric'], so the dispatch chain
sqlserver__ -> fabric__ -> default__ handles everything automatically.

Only sqlserver__ macros with genuinely different implementations are kept:
- sqlserver__target_database (needed by CLI)
- sqlserver__create_temp_table (T-SQL CTE workaround)
- sqlserver__generate_elementary_profile_args (different connection params)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…WORD env var

Matches how postgres, clickhouse, and dremio handle local credentials.
- Hardcode password in docker-compose-sqlserver.yml
- Hardcode password in profiles.yml.j2 (sqlserver target)
- Remove MSSQL_SA_PASSWORD from workflow env, Start SQL Server step, and Write dbt profiles step
- Remove env_var() helper from generate_profiles.py (no longer needed)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
assert test_result["status"] == "pass"

ts_type = (
"datetime2" if dbt_project.target in ("fabric", "sqlserver") else "timestamp"
Collaborator:

I think we have an edr_timestamp or something like that, could we use it? Not sure

{% endmacro %}

{% macro freshness_description() %}
{% set c = " + " if target.type in ["fabric", "sqlserver"] else " || " %}
Collaborator:

Maybe we should use dbt.concat instead like we did elsewhere?

Contributor Author:

Yes — switching to dbt.concat() now. It'll handle the + vs || dispatch automatically.

{% endmacro %}

{% macro table_metric_description() %}
{% set c = " + " if target.type in ["fabric", "sqlserver"] else " || " %}
Collaborator:

Same

{% do new_list.append(elementary.edr_quote(separator)) %}
{% endif %}
{% endfor %}
{{ return(elementary.join_list(new_list, " + ")) }}
Collaborator:

Should we use dbt.concat to avoid duplicating the macro?

Contributor Author:

Good catch — with the new edr_concat macro, the fabric__ variant can use edr_concat which handles the varchar cast, and the join operator becomes dbt.concat-compatible. Will refactor to use dbt.concat / edr_concat.

- Use dbt.concat() in anomaly_detection_description.sql instead of inline || / +
- Use edr_boolean_literal() in get_anomaly_scores_query.sql instead of inline target.type checks
- Remove confusing T-SQL comment from get_anomaly_query.sql
- Fix try_cast_column_to_timestamp WHERE clause bug (check try_cast is not null)
- Create edr_dbt_concat dispatched macro wrapping dbt.concat with varchar cast on Fabric
- Use edr_dbt_concat in schema_change_description_column (remove fabric__ variant)
- Use edr_dbt_concat in list_concat_with_separator (remove fabric__ variant)
- Use edr_dbt_concat in generate_surrogate_key (remove fabric__ variant)
- Unify GROUP BY to use named columns for all adapters (column_monitoring, table_monitoring)
- Add SelectLimit helper class and DbtProject.select_limit() for cross-adapter LIMIT/TOP
- Update 9 test files to use SelectLimit instead of inline is_tsql checks
- Use DbtProject.is_tsql property in test_anomaly_exclude_metrics.py

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
count(case when not should_exclude_from_training then metric_value end) over (partition by {{ partition_by_keys }} order by bucket_end asc rows between unbounded preceding and current row) as training_set_size,
last_value(case when not should_exclude_from_training then bucket_end end) over (partition by {{ partition_by_keys }} order by bucket_end asc rows between unbounded preceding and current row) training_end,
first_value(case when not should_exclude_from_training then bucket_end end) over (partition by {{ partition_by_keys }} order by bucket_end asc rows between unbounded preceding and current row) as training_start
{%- set exclude_filter = 'should_exclude_from_training = ' ~ elementary.edr_boolean_literal(false) -%}
Collaborator:

Move this set above the query (before the outer set)

Contributor Author:

Done — moved {% set exclude_filter %} above the {% set anomaly_scores_query %} block so it's set before the query starts.

{#- T-SQL does not support positional GROUP BY references and rejects
GROUP BY on constant expressions (e.g. 'schema_change').
Group by the 6 source columns from all_column_changes instead;
all 9 output columns are deterministic functions of these. -#}
Collaborator:

In that case, is the separate macro with dispatch still required?

test_unique_id ~ ": starting test materialization hook"
) %}
{% if elementary.get_config_var("tests_use_temp_tables") %}
{% if target.type in ("fabric", "sqlserver") or elementary.get_config_var(
Collaborator:

Can we create an is_tsql macro?

Contributor Author:

Done — created elementary.is_tsql() as a dispatched macro (default__ returns false, fabric__ returns true). Updated this line to use it. Since dbt-sqlserver dispatches through fabric__, it also returns true automatically.
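
Based on that description, the dispatched macro would look roughly like this (an illustrative sketch; the actual implementation in the package may differ in detail):

```sql
{% macro is_tsql() %}
  {{ return(adapter.dispatch("is_tsql", "elementary")()) }}
{% endmacro %}

{% macro default__is_tsql() %}
  {{ return(false) }}
{% endmacro %}

{# dbt-sqlserver declares dependencies=["fabric"], so sqlserver targets
   dispatch through this variant and also return true automatically. #}
{% macro fabric__is_tsql() %}
  {{ return(true) }}
{% endmacro %}
```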


{# Fabric/T-SQL: Python booleans (True/False) pass Jinja's "is number" test
but T-SQL has no TRUE/FALSE keywords, so they must be rendered as
cast(1 as bit) / cast(0 as bit). #}
Collaborator:

Is this the only reason for the override?
Could we use the edr_type_boolean macro that you added and merge the implementation?

Contributor Author:

Good call — yes, the only reason for the override was boolean rendering. Merged the fabric__render_value into default__render_value by adding a value is boolean check that uses edr_boolean_literal(value). This handles T-SQL via dispatch (cast(1/0 as bit)) while keeping true/false for standard SQL. Removed the separate fabric__render_value entirely.
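
The boolean-literal dispatch described here would be shaped roughly as follows (a sketch inferred from the PR description, not the verbatim package source):

```sql
{% macro edr_boolean_literal(value) %}
  {{ return(adapter.dispatch("edr_boolean_literal", "elementary")(value)) }}
{% endmacro %}

{# Standard SQL: true/false keywords exist. #}
{% macro default__edr_boolean_literal(value) %}
  {{ return("true" if value else "false") }}
{% endmacro %}

{# T-SQL has no TRUE/FALSE keywords, so booleans render as bit casts. #}
{% macro fabric__edr_boolean_literal(value) %}
  {{ return("cast(1 as bit)" if value else "cast(0 as bit)") }}
{% endmacro %}
```

With this in place, `default__render_value` only needs a `value is boolean` branch that delegates to `edr_boolean_literal(value)`, which is what makes the separate `fabric__render_value` override unnecessary.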

devin-ai-integration bot and others added 3 commits March 7, 2026 23:02
…ue, remove dispatch

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
TEXT is deprecated in SQL Server and dbt-sqlserver creates string columns
as varchar. Without this normalization, test_schema_changes_from_baseline
detects a spurious type_changed for columns with TEXT baseline vs VARCHAR
actual type. This matches the fabric__ variant's behavior.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…nges query

Dremio requires positional GROUP BY (dbt_utils.group_by(9)) because it
is strict about detected_at column ambiguity (both a source column and
a SELECT alias). Use is_tsql() to conditionally apply named GROUP BY
for T-SQL (which doesn't support positional references) and positional
GROUP BY for all other adapters.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Collaborator @haritamar left a comment:


Review: Microsoft Fabric DWH Integration

Nice work overall — the architectural approach of using fabric__ dispatched macros with sqlserver auto-dispatching via dependencies=["fabric"] is clean. The cross-warehouse changes (positional GROUP BY → explicit columns, || → dbt.concat(), boolean literals) are the right call for universal compatibility. A few things worth addressing:


1. Fragile GROUP BY in metrics_anomaly_score.sql (High)

The GROUP BY now duplicates the full CASE WHEN expression for anomaly_score (~8 lines). If the anomaly score logic changes in the SELECT, the GROUP BY must change identically — easy to forget. Consider extracting the anomaly_score computation into a prior CTE so the GROUP BY just references a column name:

```sql
-- In a preceding CTE:
case when ... end as anomaly_score

-- Then the final CTE just does:
group by ..., anomaly_score
```

This would also make the explicit GROUP BY less verbose across the board.

2. fabric__insert_as_select — EXEC quote escaping (Medium)

The select_query | replace("'", "''") escaping inside EXEC('...') is minimal. If select_query ever contains string literals with already-escaped quotes (e.g. from user-defined test configs), this could double-escape and produce invalid SQL. Worth adding a comment noting this limitation, or testing with a query that contains single quotes.

3. fabric__query_test_result_rows fetches all rows (Medium)

The Fabric override runs the full query and slices in Jinja (result[:sample_limit]) instead of using TOP in SQL. For tests with large result sets this pulls everything into memory. I understand CTEs prevent adding TOP easily — would wrapping the CTE query into a SELECT TOP ... FROM (subquery) or materializing into a temp table first be feasible? If not, a comment explaining the tradeoff would help.
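
The wrapping the reviewer suggests would look roughly like this (a hypothetical sketch; `sample_limit` and the inner query are placeholders, and T-SQL allows TOP over a derived table even when the inner query itself cannot take a trailing LIMIT):

```sql
-- Apply TOP to the whole result without touching the inner CTEs:
select top {{ sample_limit }} *
from (
    {{ test_results_query }}
) limited_results
```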

4. _fmt_samples_query duplicated across 6+ test files (Low)

The identical helper function appears in test_column_pii_sampling.py, test_disable_samples_config.py, test_override_samples_config.py, test_sampling.py, test_sampling_pii.py, and a variant in test_exposure_schema_validity.py. Consider extracting to a shared test utility (e.g. in dbt_project.py or conftest.py).

5. sqlserver__create_temp_table calls fabric__create_temp_table directly (Low)

This creates a hard coupling — if someone overrides fabric__create_temp_table, the sqlserver version won't pick it up. Consider extracting the shared T-SQL logic into a private helper macro (e.g. _tsql_create_temp_table) that both dispatches call.

6. edr_dbt_concat varchar(4096) cast (Low)

The Fabric override casts all concat arguments to varchar(4096). Worth adding a brief comment noting this as a known limitation for very long strings.

7. Unresolved review threads

A few threads where Devin acknowledged feedback but may not have pushed fixes:

  • schema_exists.sql — haritamar asked to use {% set query %} Jinja templating (non-outdated thread)
  • test_anomaly_exclude_metrics.py — question about edr_timestamp macro (no reply)
  • anomaly_detection_description.sql — 2 threads about using dbt.concat (Devin agreed, threads still open)
  • get_anomaly_scores_query.sql — using edr_boolean_literal() (Devin agreed, thread still open)

Worth verifying these are addressed in the latest commits before merging.


The cross-warehouse changes (GROUP BY, boolean literals, dbt.concat) are the right approach — universal syntax is cleaner than per-adapter branching. CI passing across all targets gives good confidence. Main concern is the metrics_anomaly_score.sql GROUP BY maintainability.

@devin-ai-integration (Contributor Author):

Thanks for the thorough review! Addressing all 7 items now:

  1. metrics_anomaly_score.sql GROUP BY — Will extract anomaly_score into time_window_aggregation CTE so the final GROUP BY just references a column name
  2. fabric__insert_as_select quote escaping — Will add a comment noting the limitation
  3. fabric__query_test_result_rows — Will add a comment explaining the tradeoff
  4. _fmt_samples_query duplication — Will extract to shared utility in dbt_project.py
  5. sqlserver__create_temp_table coupling — Will extract into a private _tsql_create_temp_table helper
  6. edr_dbt_concat varchar cast — Will add a comment
  7. Unresolved threads — Will verify each one is addressed

devin-ai-integration bot and others added 3 commits March 8, 2026 11:56
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…ned _tsql_create_temp_table

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
bucket_end,
{{ bucket_seasonality_expr }} as bucket_seasonality,
{{ test_configuration.anomaly_exclude_metrics or 'FALSE' }} as is_excluded,
case when {{ test_configuration.anomaly_exclude_metrics or '1=0' }} then {{ elementary.edr_boolean_literal(true) }} else {{ elementary.edr_boolean_literal(false) }} end as is_excluded,
Collaborator:

we have many instances of

case when <condition> then TRUE else FALSE

Can we make a generic macro for that? e.g. render_condition_as_boolean? (or if you think another name is better use it)
and probably it can default to false so the "or false" bit can be avoided as well.

Contributor Author:

Done — created edr_condition_as_boolean(condition, default_false=true) macro in get_config_var.sql. It renders case when <condition> then <TRUE> else <FALSE> end using edr_boolean_literal. When default_false=true (the default), a falsy condition produces FALSE without needing or '1=0'.

Applied across: get_anomaly_scores_query.sql, get_columns_snapshot_query.sql, anomaly_threshold_sensitivity.sql, model_run_results.sql, metrics_anomaly_score.sql, get_anomaly_query.sql.
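
Going by that description, the macro would be shaped roughly like this (an assumed sketch, not the verbatim implementation):

```sql
{% macro edr_condition_as_boolean(condition, default_false=true) %}
  {#- A falsy condition short-circuits to FALSE when default_false is set,
      removing the need for "or '1=0'" at call sites. -#}
  {%- if not condition and default_false -%}
    {{ elementary.edr_boolean_literal(false) }}
  {%- else -%}
    case when {{ condition }}
      then {{ elementary.edr_boolean_literal(true) }}
      else {{ elementary.edr_boolean_literal(false) }}
    end
  {%- endif -%}
{% endmacro %}
```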

case
when cast({{ column_name }} as {{ elementary.edr_type_bool() }}) = true
when
cast({{ column_name }} as {{ elementary.edr_type_bool() }})
Collaborator:

Maybe let's have edr_is_true and edr_is_false generic macros, and use them here / elsewhere?

Contributor Author:

Done — created edr_is_true(expr) and edr_is_false(expr) macros in get_config_var.sql. They dispatch to handle T-SQL (= 1/= 0 for bit type) vs standard SQL (= true/= false). Applied here in column_boolean_monitors.sql.
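
A sketch of what such a dispatch pair would look like (illustrative; `edr_is_false` mirrors it with `false`/`0`):

```sql
{% macro edr_is_true(expr) %}
  {{ return(adapter.dispatch("edr_is_true", "elementary")(expr)) }}
{% endmacro %}

{# Standard SQL: boolean columns compare against true/false. #}
{% macro default__edr_is_true(expr) %}
  {{ expr }} = true
{% endmacro %}

{# T-SQL: bit columns compare against 1/0. #}
{% macro fabric__edr_is_true(expr) %}
  {{ expr }} = 1
{% endmacro %}
```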

{% endmacro %}

{% macro fabric__row_count_metric_query(metric, metric_properties) %}
{# Fabric / T-SQL: avoid CTE since this will be embedded inside a parent CTE. #}
Collaborator:

should we just use this implementation as the default one?

Contributor Author:

Done — made the CTE-free implementation the default default__row_count_metric_query. Removed fabric__row_count_metric_query since it's now identical to the default.

{% set tbl_ref = table_relation.include(database=false) %}

{% do elementary.run_query("DROP VIEW IF EXISTS " ~ vw_ref) %}
{% do elementary.run_query("CREATE VIEW " ~ vw_ref ~ " AS " ~ sql_query) %}
Collaborator:

I don't remember - would it be possible to have temporary views? (and save the need to drop them)

) -%}
{%- set insert_query %}
IF OBJECT_ID('{{ tmp_view_name }}', 'V') IS NOT NULL DROP VIEW {{ tmp_view_name }};
EXEC('CREATE VIEW {{ tmp_view_name }} AS {{ select_query | replace("'", "''") }}');
Collaborator:

Also here - would it be possible to make the view temporary at the DB level, to avoid the need of dropping it?

anomaly_score,
case
when abs(anomaly_score) >= 1.5 then true else false
when abs(anomaly_score) >= 1.5
Collaborator:

see my comment above about rendering a condition as boolean, maybe can simplify this and the below fields.

then true
else false
then {{ elementary.edr_boolean_literal(true) }}
else {{ elementary.edr_boolean_literal(false) }}
Collaborator:

see also my comment re. rendering condition as boolean

then true
else false
then {{ elementary.edr_boolean_literal(true) }}
else {{ elementary.edr_boolean_literal(false) }}
Collaborator:

see my comment re. rendering condition as boolean (here and below)

…ame edr_dbt_concat to edr_concat, consolidate full_names.sql, reduce table_monitoring_query duplication, optimize fabric test sampling with temp view + TOP

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
= {{ elementary.edr_boolean_literal(true) }}
{{
elementary.edr_is_true(
"cast("
Collaborator:

is the cast a must? or can edr_is_true handle it?

Contributor Author:

The cast is needed — the column might not be a boolean type natively. Cleaned up the code: extracted the cast expression into a {% set bool_expr %} variable first, then pass it to edr_is_true/edr_is_false. Also switched count_false to use edr_is_false instead of negating edr_is_true.
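
The cleanup described would read roughly like this in the monitor SQL (a hypothetical rendering of the reply, with `column_name` as the templated column):

```sql
{#- Extract the cast once so both branches share it. -#}
{%- set bool_expr = "cast(" ~ column_name ~ " as " ~ elementary.edr_type_bool() ~ ")" -%}
count(case when {{ elementary.edr_is_true(bool_expr) }} then 1 end) as count_true,
count(case when {{ elementary.edr_is_false(bool_expr) }} then 1 end) as count_false
```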

= {{ elementary.edr_boolean_literal(true) }}
{{
elementary.edr_is_true(
"cast("
Collaborator:

same

end as is_the_last_invocation_of_the_day
{{
elementary.edr_condition_as_boolean(
"first_value(invocation_id) over (partition by "
Collaborator:

If we can first above compute this condition as a variable, e.g.

{% set <condition> %}
... jinja goes here ..
{% endset %}

I think it can be more readable

Contributor Author:

Done — extracted the window spec and conditions into {% set %} variables:

```sql
{%- set day_partition = elementary.edr_time_trunc("day", "run_results.generated_at") -%}
{%- set day_window = "over (partition by " ~ day_partition ~ " order by ...)" -%}
{%- set first_inv_cond = "first_value(invocation_id) " ~ day_window ~ " = invocation_id" -%}
{%- set last_inv_cond = "last_value(invocation_id) " ~ day_window ~ " = invocation_id" -%}
{{ elementary.edr_condition_as_boolean(first_inv_cond) }} as is_the_first_invocation_of_the_day,
{{ elementary.edr_condition_as_boolean(last_inv_cond) }} as is_the_last_invocation_of_the_day
```

Much more readable now.

… conditions in model_run_results for readability

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
) -%}
{%- set last_inv_cond = (
"last_value(invocation_id) " ~ day_window ~ " = invocation_id"
) -%}
Collaborator:

move all these sets to the top

Contributor Author:

Done — moved all {% set %} variables to the top of the file (before the with clause).

devin-ai-integration bot and others added 4 commits March 8, 2026 17:48
…ability

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
… to avoid missing test audit schema

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…roach in default for all adapters

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…_result_rows

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@haritamar haritamar merged commit 534afc6 into master Mar 8, 2026
26 checks passed
@haritamar haritamar deleted the devin/ELE-5282-1772640713 branch March 8, 2026 20:03