Skip to content

charan3129/dbt-Analytics-Engineering-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🛒 E-Commerce Analytics Data Warehouse (dbt + Snowflake)

Architecture

See Architecture Diagram for the full medallion architecture flow.

Star Schema

See Star Schema Diagram for the Kimball star schema design.

Data Flow

See Data Flow Diagram for transformation details at each layer.

Problem

E-commerce businesses generate massive transaction data across orders, customers, products, and payments. Without a structured analytical warehouse, teams waste hours writing ad-hoc queries, get inconsistent numbers across reports, and can't trust the data for decision-making.

Solution

A production-ready dbt project that transforms raw e-commerce data into a Kimball star schema warehouse with:

  • Medallion architecture: Raw → Staging → Intermediate → Marts
  • Kimball dimensional modeling: Star schema with fact and dimension tables
  • SCD Type 2: Track customer changes over time with snapshots
  • 100% test coverage: Every model tested for quality
  • Full documentation: Every column described and lineage tracked
  • CI/CD: Automated testing on every pull request

Tech Stack

Tool Purpose
dbt SQL transformation, testing, documentation
Snowflake Cloud data warehouse
SQL Data transformation language
Jinja Templating in dbt
GitHub Actions CI/CD automation

Project Structure

DBT_Ecommerce_Warehouse/
├── models/
│   ├── staging/          # Clean & standardize (4 views)
│   ├── intermediate/     # Join & enrich (2 ephemeral CTEs)
│   └── marts/            # Star schema (5 tables)
├── seeds/                # Raw CSV data (4 files)
├── snapshots/            # SCD Type 2 (1 snapshot)
├── tests/                # Custom singular tests (2)
├── macros/               # Reusable SQL functions (3)
├── analyses/             # Sample analytical queries (6)
├── docs/                 # Architecture diagrams & guides
└── .github/workflows/    # CI/CD pipeline

Key Features

  • ✅ Kimball Star Schema (fact_orders, dim_customers, dim_products, dim_dates)
  • ✅ SCD Type 2 via dbt snapshots for customer dimension
  • ✅ 35+ data quality tests (unique, not_null, relationships, custom)
  • ✅ Customer segmentation (RFM-inspired segments + revenue tiers)
  • ✅ Product performance tiers and price tiers
  • ✅ Pre-aggregated fact_daily_revenue for dashboard performance
  • ✅ CI/CD pipeline with GitHub Actions
  • ✅ Seed data included — fully reproducible

Quick Start

Prerequisites

  • Snowflake account (free 30-day trial: https://signup.snowflake.com/)
  • dbt-core installed locally (pip install dbt-snowflake)
  • Git and GitHub account

Setup

git clone https://github.com/YOUR_USERNAME/DBT_Ecommerce_Warehouse.git
cd DBT_Ecommerce_Warehouse
dbt deps       # Install packages
dbt seed       # Load sample data
dbt snapshot   # Create SCD Type 2 snapshots
dbt run        # Build all models
dbt test       # Run all 35+ tests
dbt docs generate && dbt docs serve  # View documentation

Sample Queries

Monthly Revenue Trend

SELECT date_trunc('month', order_date) AS month,
       SUM(net_amount) AS revenue
FROM marts.fact_orders
WHERE is_completed = true
GROUP BY 1 ORDER BY 1;

Customer Segmentation Distribution

SELECT customer_segment, revenue_tier,
       COUNT(*) AS customers,
       SUM(lifetime_revenue) AS total_revenue
FROM marts.dim_customers
GROUP BY 1, 2 ORDER BY total_revenue DESC;

Top Products by Profit

SELECT product_name, category,
       SUM(line_profit) AS total_profit,
       ROUND(SUM(line_profit)/NULLIF(SUM(net_amount),0)*100, 2) AS margin_pct
FROM marts.fact_orders
WHERE is_completed = true
GROUP BY 1, 2 ORDER BY total_profit DESC LIMIT 10;

Documentation

Results

  • Processes 53 order line items across 30 orders and 20 customers
  • 35+ automated tests — all passing
  • Complete lineage from raw CSV to analytics-ready star schema
  • Less than 30 second full build time

Author

Built as a portfolio project demonstrating analytics engineering best practices.

About

End-to-end analytics engineering project using dbt and Snowflake. Transforms raw e-commerce data through staging, intermediate, and marts layers into a Kimball star schema with fact and dimension tables. Includes automated data quality testing, surrogate key generation, and CI/CD deployment.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors