See Architecture Diagram for the full medallion architecture flow.
See Star Schema Diagram for the Kimball star schema design.
See Data Flow Diagram for transformation details at each layer.
E-commerce businesses generate massive transaction data across orders, customers, products, and payments. Without a structured analytical warehouse, teams waste hours writing ad-hoc queries, get inconsistent numbers across reports, and can't trust the data for decision-making.
A production-ready dbt project that transforms raw e-commerce data into a Kimball star schema warehouse with:
- Medallion architecture: Raw → Staging → Intermediate → Marts
- Kimball dimensional modeling: Star schema with fact and dimension tables
- SCD Type 2: Track customer changes over time with snapshots
- 100% test coverage: Every model tested for quality
- Full documentation: Every column described and lineage tracked
- CI/CD: Automated testing on every pull request
| Tool | Purpose |
|---|---|
| dbt | SQL transformation, testing, documentation |
| Snowflake | Cloud data warehouse |
| SQL | Data transformation language |
| Jinja | Templating in dbt |
| GitHub Actions | CI/CD automation |
DBT_Ecommerce_Warehouse/
├── models/
│ ├── staging/ # Clean & standardize (4 views)
│ ├── intermediate/ # Join & enrich (2 ephemeral CTEs)
│ └── marts/ # Star schema (5 tables)
├── seeds/ # Raw CSV data (4 files)
├── snapshots/ # SCD Type 2 (1 snapshot)
├── tests/ # Custom singular tests (2)
├── macros/ # Reusable SQL functions (3)
├── analyses/ # Sample analytical queries (6)
├── docs/ # Architecture diagrams & guides
└── .github/workflows/ # CI/CD pipeline
- ✅ Kimball Star Schema (fact_orders, dim_customers, dim_products, dim_dates)
- ✅ SCD Type 2 via dbt snapshots for customer dimension
- ✅ 35+ data quality tests (unique, not_null, relationships, custom)
- ✅ Customer segmentation (RFM-inspired segments + revenue tiers)
- ✅ Product performance tiers and price tiers
- ✅ Pre-aggregated fact_daily_revenue for dashboard performance
- ✅ CI/CD pipeline with GitHub Actions
- ✅ Seed data included — fully reproducible
- Snowflake account (free 30-day trial: https://signup.snowflake.com/)
- dbt-core installed locally (
pip install dbt-snowflake) - Git and GitHub account
git clone https://github.com/YOUR_USERNAME/DBT_Ecommerce_Warehouse.git
cd DBT_Ecommerce_Warehouse
dbt deps # Install packages
dbt seed # Load sample data
dbt snapshot # Create SCD Type 2 snapshots
dbt run # Build all models
dbt test # Run all 35+ tests
dbt docs generate && dbt docs serve # View documentationSELECT date_trunc('month', order_date) AS month,
SUM(net_amount) AS revenue
FROM marts.fact_orders
WHERE is_completed = true
GROUP BY 1 ORDER BY 1;SELECT customer_segment, revenue_tier,
COUNT(*) AS customers,
SUM(lifetime_revenue) AS total_revenue
FROM marts.dim_customers
GROUP BY 1, 2 ORDER BY total_revenue DESC;SELECT product_name, category,
SUM(line_profit) AS total_profit,
ROUND(SUM(line_profit)/NULLIF(SUM(net_amount),0)*100, 2) AS margin_pct
FROM marts.fact_orders
WHERE is_completed = true
GROUP BY 1, 2 ORDER BY total_profit DESC LIMIT 10;- 📖 Setup Guide — Step-by-step from Snowflake signup to dbt docs serve
- 📊 Data Dictionary — Every table, column, and expected result documented
- 📐 Architecture Diagram — Medallion architecture flow
- ⭐ Star Schema Diagram — Kimball star schema design
- 🔄 Data Flow Diagram — Transformation details at each layer
- Processes 53 order line items across 30 orders and 20 customers
- 35+ automated tests — all passing
- Complete lineage from raw CSV to analytics-ready star schema
- Less than 30 second full build time
Built as a portfolio project demonstrating analytics engineering best practices.