This project mirrors a real enterprise warehouse data platform and showcases depth across:
* Data Engineering / Data Analytics (DE/DA)
* Data Modeling & Power BI Development
* Operational KPI/SLA Frameworks
* End-to-end FMCG, Warehouse, Logistics and QC Domain Knowledge
Interact with the Online Dashboard
Watch the YouTube Video
WIAP is a full-stack data engineering + analytics + warehouse operations intelligence platform designed to simulate and analyze real FMCG/3PL warehouse environments.
This project is basically a warehouse digital twin:
- Generates complex synthetic operational data (Inbound, Outbound, Inventory, Transport, Quality, HR).
- Loads everything into a PostgreSQL warehouse with a fully normalized schema.
- Creates views powered by heavy SQL, data cleaning, imputations, and logic transformations.
- Builds an analytic semantic layer in Power BI with M-ETL pipelines and optimized schemas.
- Delivers 70+ operational KPIs used in real warehouse management.
Everything mirrors real industry workflows learned over 9+ years of working in food FMCG QA, warehousing, and logistics.
- Build a realistic, scalable warehouse data ecosystem.
- Showcase analytics engineering + data engineering workflow end-to-end.
- Create a modular platform that supports future ML and forecasting.
- Demonstrate strong BI concepts: modeling, KPI governance, DAX standards.
- Highlight my operational intelligence from QA + FMCG + logistics.
- Design enterprise-grade schema (normalized + views).
- Create multi-domain synthetic data with natural randomness.
- Build reproducible ETL logic.
- Develop BI dashboards that mimic real 3PL/warehouse KPIs and SLAs.
- Create a documented KPI dictionary for governance.
A Kanban board is included to track:
- Data generation tasks
- Schema iterations
- Loader fixes
- View redesigns
- Power BI modeling
- KPI validation
- Future roadmap (Phase II Ops)
Progress tracking with GitHub Projects - Kanban board
| Category | Tools |
|---|---|
| Python & Data Generation | |
| LLM Integration | |
| Database Connectivity | |
| SQL Database Management | |
| Power Platform & Visualization | |
| Version Control, Project Tracking & Documentation | |
| AI Assistance & Creative Tools | |
Data_Analytics_Projects_Warehouse_Process_Analysis_Pipeline/
├── data/
│   └── raw/                              # Raw data generated (Python libraries + LLM)
│
├── src/                                  # Production-ready Python code
│   ├── data_generator.py                 # LLM functions and DataFrame creation logic
│   └── data_loader.py                    # Loads data from CSVs into the PostgreSQL database
│
├── sql/                                  # PostgreSQL scripts
│   └── schema.sql                        # CREATE TABLE statements for the database schema
│
├── KPI_docs/                             # Extensions to the main README.md that expand the KPIs
│   ├── KPI_COO.md                        # KPIs from the COO's view
│   ├── KPI_Inbound.md                    # KPIs on the Inbound & Returns page
│   └── KPI_Outbound.md                   # KPIs on the Outbound page
│
├── reports/                              # Final reports and visualizations
│   ├── project_doc.docx                  # Project report
│   ├── project_video.mp4                 # Dashboard/report walkthrough
│   └── Operations_Dashboard_P01.pbix     # Data cleaning, modeling, analysis, visualization, and publishing
│
├── images/                               # All relevant image files
│
├── LICENSE.md                            # MIT License
├── .gitignore                            # Files and folders for Git to ignore
└── README.md                             # Project documentation
Idea → Design → ETL → Analyze → Dashboard → Results
Python + VS Code - data_generator.py
- Python-generated synthetic datasets
- SQL-first normalized schema (PK/FK, indexes)
- Data cleaning via SQL views
- ETL pipeline using SQLAlchemy
- Power BI data modeling & measure tables
- Department-wise KPI models
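As a sketch of what this generation step can look like (all table and column names here are hypothetical illustrations, not the actual `data_generator.py` contents):

```python
# Illustrative sketch only: table and column names are hypothetical, not the
# actual data_generator.py implementation.
from pathlib import Path

import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)   # fixed seed -> reproducible datasets
N = 1_000                              # number of synthetic inbound records

inbound = pd.DataFrame({
    "grn_id": np.arange(1, N + 1),                          # surrogate key
    "supplier_id": rng.integers(1, 51, size=N),             # FK into a supplier dimension
    "received_at": pd.Timestamp("2024-01-01")
        + pd.to_timedelta(rng.integers(0, 365 * 24 * 60, size=N), unit="m"),
    "cartons": rng.poisson(lam=120, size=N),                # natural randomness in volumes
    "rejected_cartons": rng.binomial(n=5, p=0.05, size=N),  # occasional QC rejections
})

# Put-away finishes 0.5-8 hours after receipt; feeds the cycle-time KPIs later
inbound["putaway_done_at"] = inbound["received_at"] + pd.to_timedelta(
    rng.uniform(0.5, 8.0, size=N), unit="h"
)

Path("data/raw").mkdir(parents=True, exist_ok=True)
inbound.to_csv("data/raw/inbound.csv", index=False)
```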
PostgreSQL + VS Code - schema.sql
- 4 standalone dimension tables
- 10 dependent operational tables
- 2 monitoring/incident tables
- 16 analytics-ready views
- Full PK/FK relationships
- Indexes for query performance
The schema follows a Raw → Clean Views → PBI ETL → BI Model architecture.
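A minimal sketch of the PK/FK + index pattern, assuming hypothetical `dim_supplier`/`fact_inbound` tables and placeholder credentials (the real `schema.sql` is far larger):

```python
# A hypothetical slice of schema.sql, executed through SQLAlchemy. The real
# schema has 4 dimensions, 10 operational tables, 2 monitoring tables, and
# 16 views; these two tables only illustrate the PK/FK + index pattern.
from sqlalchemy import create_engine, text

DDL = """
CREATE TABLE IF NOT EXISTS dim_supplier (
    supplier_id   SERIAL PRIMARY KEY,
    supplier_name TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS fact_inbound (
    grn_id           SERIAL PRIMARY KEY,
    supplier_id      INT NOT NULL REFERENCES dim_supplier (supplier_id),
    received_at      TIMESTAMP NOT NULL,
    putaway_done_at  TIMESTAMP,
    cartons          INT,
    rejected_cartons INT DEFAULT 0,
    remarks          TEXT,   -- free-text notes, messy by design
    qc_passed        TEXT    -- arrives as 'true' / 'false' / 'NaN'
);

-- Index the FK used by most supplier-performance queries
CREATE INDEX IF NOT EXISTS idx_fact_inbound_supplier
    ON fact_inbound (supplier_id);
"""

engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/wiap")  # placeholder
with engine.begin() as conn:   # begin() commits on success
    conn.execute(text(DDL))
```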
Python + VS Code - data_loader.py
- FK-safe load sequence
- UPSERT logic (ON CONFLICT)
- Automated logging
- Idempotent reruns
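A sketch of that loader pattern under the same hypothetical schema; dimensions would be loaded before facts so FK checks pass, and `ON CONFLICT` keeps reruns idempotent:

```python
# Minimal sketch of the loader pattern (table and column names are the same
# hypothetical ones as above, not the real data_loader.py internals).
import logging

import pandas as pd
from sqlalchemy import create_engine, text

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("data_loader")

# Dimension tables would be loaded before this fact table so FK checks pass.
UPSERT = text("""
    INSERT INTO fact_inbound (grn_id, supplier_id, received_at, cartons, rejected_cartons)
    VALUES (:grn_id, :supplier_id, :received_at, :cartons, :rejected_cartons)
    ON CONFLICT (grn_id) DO UPDATE SET
        supplier_id      = EXCLUDED.supplier_id,
        received_at      = EXCLUDED.received_at,
        cartons          = EXCLUDED.cartons,
        rejected_cartons = EXCLUDED.rejected_cartons
""")

engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/wiap")
df = pd.read_csv("data/raw/inbound.csv", parse_dates=["received_at"])

# Cast numpy types to native Python so the DB driver can adapt them
records = [
    {
        "grn_id": int(r.grn_id),
        "supplier_id": int(r.supplier_id),
        "received_at": r.received_at.to_pydatetime(),
        "cartons": int(r.cartons),
        "rejected_cartons": int(r.rejected_cartons),
    }
    for r in df.itertuples(index=False)
]

with engine.begin() as conn:        # one transaction; commits on success
    conn.execute(UPSERT, records)   # executemany-style batch upsert
log.info("Upserted %d inbound rows", len(records))
```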
PostgreSQL + DBeaver - views.sql
- LLM hallucination corrections
- Missing value imputation
- NaN → TRUE logic conversions
- Dimensional transformations
- Standardized naming conventions
- RegEx cleanup
- Time-casting, type standardization
- Derived KPIs (cycle times, severities, statuses)
- Normalization of messy logs
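As a sketch of this view-layer cleaning, again using the hypothetical `fact_inbound` columns from above: imputation with COALESCE, a NaN → TRUE conversion, RegEx cleanup, and a derived cycle-time KPI.

```python
# Hypothetical slice of the cleaning layer: one view that imputes missing
# values, fixes a NaN -> TRUE convention, strips stray characters with RegEx,
# and derives a put-away cycle-time KPI. Column names follow the sketches above.
from sqlalchemy import create_engine, text

CLEAN_VIEW = text("""
CREATE OR REPLACE VIEW vw_inbound_clean AS
SELECT
    grn_id,
    supplier_id,
    received_at,
    COALESCE(cartons, 0)                                  AS cartons,      -- impute, don't drop
    regexp_replace(remarks, '[^A-Za-z0-9 ,.-]', '', 'g')  AS remarks,      -- RegEx cleanup
    CASE WHEN qc_passed IS NULL OR qc_passed = 'NaN'
         THEN TRUE
         ELSE qc_passed::BOOLEAN
    END                                                   AS qc_passed,    -- NaN actually meant TRUE
    EXTRACT(EPOCH FROM (putaway_done_at - received_at)) / 3600.0
                                                          AS putaway_hours -- derived cycle time
FROM fact_inbound;
""")

engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/wiap")
with engine.begin() as conn:
    conn.execute(CLEAN_VIEW)
```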
Power Query Editor:
- Data profiling
- Quality checks
- Column-level lineage
- Conditional transformations
- Metadata management
- Governance patterns
- Versioned query groups
- Staging → Clean → Fact → Dim layering
Model highlights:
- Complex schema with clean relationship directions
- Row-level granularity by operation
- Model optimization:
  - Field parameter grouping
  - Surrogate keys
  - Removing high-cardinality clutter
  - Merged fact tables
- Department-wise measure tables
- KPI folders for governance
Each KPI includes:
- Business Question
- Formula
- Importance
- Operational Meaning (High vs Low)
- How to Improve
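An illustrative entry (paraphrased, not quoted from the KPI docs) for On-Time-Dispatch (OTD) % might read:

- Business Question: Are orders leaving the warehouse within the agreed SLA window?
- Formula: OTD % = (on-time dispatches ÷ total dispatches) × 100
- Importance: Late dispatches cascade into delivery failures and SLA penalties.
- Operational Meaning: High means picking, staging, and dock scheduling are in sync; low points to an upstream bottleneck.
- How to Improve: Pre-stage pallets by route, stagger wave picking, and monitor loading cycle times.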
COO's Dashboard (section-wise) - COO's KPI Dictionary
- Revenue, Profit, CBM flows
- Workforce demographics
- Warehouse utilization
- All operational KPIs summarized
Inbound/Returns KPIs - Inbound/Returns KPI Dictionary
- Labour efficiency
- Shift productivity (Inbound, Returns)
- Operational cycle times (Picking, Loading, Return handling)
- On-time put-away %
- Rejection % analyses
- Supplier performance
- Return behaviors
- Incident reporting
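For example, on-time put-away % can be computed straight off the cleaned view; this sketch assumes the hypothetical `vw_inbound_clean` from above and an illustrative 4-hour put-away SLA:

```python
# Sketch: on-time put-away % per supplier. Assumes the hypothetical
# vw_inbound_clean view above and an illustrative 4-hour put-away SLA.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/wiap")

ontime = pd.read_sql(
    """
    SELECT supplier_id,
           100.0 * AVG(CASE WHEN putaway_hours <= 4 THEN 1 ELSE 0 END)
               AS ontime_putaway_pct
    FROM vw_inbound_clean
    GROUP BY supplier_id
    ORDER BY ontime_putaway_pct
    """,
    engine,
)
print(ontime.head(10))  # worst-performing suppliers first
```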
Outbound KPIs - Outbound KPI Dictionary
- Labour efficiency
- Shift productivity
- Order fulfillment %
- Operational cycle time
- WH throughput (Cartons, CBM, Pallets)
- Failed-pick product analysis
- Lost GP due to failed picks
- Vehicle utilization
- On-Time-Dispatch (OTD) %
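A sketch of two of these KPIs in pandas; `vw_outbound_clean` and its columns are assumptions mirroring the inbound examples, not the actual view definitions:

```python
# Sketch: two outbound KPIs in pandas. vw_outbound_clean and its columns are
# assumptions mirroring the inbound examples, not the actual view definitions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/wiap")
df = pd.read_sql(
    "SELECT dispatched_at, scheduled_at, cartons FROM vw_outbound_clean",
    engine,
    parse_dates=["dispatched_at", "scheduled_at"],
)

# On-Time-Dispatch %: share of loads leaving at or before the scheduled slot
otd_pct = (df["dispatched_at"] <= df["scheduled_at"]).mean() * 100
print(f"OTD %: {otd_pct:.1f}")

# Throughput: cartons shipped per calendar day
daily_cartons = df.set_index("dispatched_at")["cartons"].resample("D").sum()
print(daily_cartons.describe())
```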
- Inventory Control analytics
- Quality Control analytics
- Transport/Logistics analytics
- Integrate a sales data model for financial analysis
- Predictive analytics with machine learning models
Shall we explore how to run WIAP?
# 1. Clone the repo
git clone https://github.com/<your-username>/wiap.git
cd wiap
# 2. Install dependencies
pip install -r requirements.txt
# 3. Start PostgreSQL (Docker)
docker-compose up -d
# 4. Generate synthetic datasets
python src/data_generator.py
# 5. Load data into the DW
python src/data_loader.py
# 6. Open Power BI Desktop and proceed with your own analysis and visualization
The .pbix file is not included.
Want to commit?
feat: added supplier rejection logic
fix: corrected on-time putaway calculation
docs: updated KPI dictionary
refactor: optimized SQL view joins
test: added loader unit tests
chore: updated requirements.txt
Want to explore how you can contribute?
1. Fork the repo
2. Create a feature branch
3. Follow commit conventions
4. Ensure tests pass
5. Submit PR with:
* What changed
* Why it was needed
* Any dependencies
* Screenshots (if Power BI)
Would you like to test it?
- Column issues
- Null handling
- Pattern consistency
- Business rule checks
- PK/FK constraints
- UPSERT validation
- Row counts
- Error handling
- Data cleaning logic
- COALESCE strategy
- Cycle time calculations
- SLA logic correctness
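A hypothetical pytest sketch for the loader checks above; `load_inbound` is an assumed entry point, and the tests expect a running dev database:

```python
# Hypothetical pytest sketch for the loader checks above. load_inbound is an
# assumed entry point, and the tests expect a running dev database.
import pytest
from sqlalchemy import create_engine, text

@pytest.fixture(scope="module")
def engine():
    return create_engine("postgresql+psycopg2://user:pass@localhost:5432/wiap")

def count(engine, sql):
    with engine.connect() as conn:
        return conn.execute(text(sql)).scalar()

def test_upsert_is_idempotent(engine):
    from src.data_loader import load_inbound  # assumed loader API

    load_inbound(engine)
    first = count(engine, "SELECT COUNT(*) FROM fact_inbound")
    load_inbound(engine)  # rerun must take the ON CONFLICT path, not duplicate
    assert count(engine, "SELECT COUNT(*) FROM fact_inbound") == first

def test_no_orphan_foreign_keys(engine):
    orphans = count(engine, """
        SELECT COUNT(*) FROM fact_inbound f
        LEFT JOIN dim_supplier d USING (supplier_id)
        WHERE d.supplier_id IS NULL
    """)
    assert orphans == 0
```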
WIAP isn't a toy project. It's a full-fledged warehouse intelligence platform demonstrating:
- Data engineering abilities
- Analytics engineering discipline
- Business logic modeling
- Dashboard design
- KPI governance
- Operational domain knowledge
My sincere thanks to the communities and resources that supported this learning journey:
- eLearning.lk: I was fortunate to find this online education platform at the start of my learning path. Special thanks to Mr. Sanjaya Elvitigala, the platform owner, and Mr. Asanka Senarath, my first Power BI mentor.
- YouTube Communities: For exploring best practices in KPI representation and drawing inspiration for user interface design.
- AI Assistants (Grok, ChatGPT, DeepSeek): For researching concepts, validating ideas, developing KPI/SLA frameworks, and assisting with debugging and code optimization.
Thilina Perera | Data with TP
Data Science / Data Analytics D-Technosavant
Machine Learning, Deep Learning, LLM/LMM, NLP, and Automated Data Pipelines inquisitive
This project is licensed under the MIT License.
Free to use and extend.