The goal of this project was to clean and analyze e-commerce data from multiple CSV files. Key objectives were to clean the raw data, create reusable views that leave the original data untouched, and answer business questions such as top-selling products, visitor preferences, and revenue by geography.
- Imported 5 CSV files into PostgreSQL
- Created cleaned views
- Identified key relationships
- Created ERD
- Used SQL queries to answer business questions
- Cities such as "US Other", "San Francisco", "Sunnyvale", and "Atlanta" had the highest revenue
- Countries such as "United States" and "United Kingdom" had a clear preference for "Home/Shop by Brand/YouTube/"
- The top-selling products were "17oz Stainless Steel Sport Bottle", "Ballpoint LED Light Pen", and "22 oz Bottle Infuser"
- Missing or null values
- Duplicate data in the CSV files
- Scaling amount values by dividing them by 1,000,000
- Partial matches in the joins
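
The cleaning steps listed above (duplicates, missing values, and scaling by 1,000,000) can be combined in a single view so the raw table is never modified. The following is only a minimal sketch, not the project's actual schema: SQLite stands in for PostgreSQL so it runs without a server, and the table, view, and column names (`raw_sales`, `clean_sales`, `product_name`, `city`, `revenue_micros`) are hypothetical. Dropping rows with a missing amount and labelling a missing city as `'unknown'` are assumptions about how nulls might be handled.

```python
import sqlite3

# SQLite is used as a stand-in for PostgreSQL so the sketch is self-contained.
# All table, view, and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_sales (
    product_name   TEXT,
    city           TEXT,
    revenue_micros INTEGER
);

INSERT INTO raw_sales VALUES
    ('Sport Bottle', 'Sunnyvale', 20000000),
    ('Sport Bottle', 'Sunnyvale', 20000000),  -- exact duplicate row
    ('LED Pen',      NULL,         5000000),  -- missing city
    ('LED Pen',      'Atlanta',        NULL); -- missing amount

-- The view exposes a cleaned projection and leaves raw_sales untouched.
CREATE VIEW clean_sales AS
SELECT DISTINCT                                 -- collapse exact duplicates
    product_name,
    COALESCE(city, 'unknown')  AS city,         -- assumed label for missing cities
    revenue_micros / 1000000.0 AS revenue       -- scale the stored amount
FROM raw_sales
WHERE revenue_micros IS NOT NULL;               -- assumed: drop rows with no amount
""")

# Query the cleaned view, highest revenue first.
rows = conn.execute(
    "SELECT product_name, city, revenue FROM clean_sales ORDER BY revenue DESC"
).fetchall()

# The raw table still holds every imported row.
raw_count = conn.execute("SELECT COUNT(*) FROM raw_sales").fetchone()[0]

for row in rows:
    print(row)
print("raw rows kept:", raw_count)
```

Because the cleaning logic lives in the view, every later business query can read from `clean_sales` and the rules only need to be changed in one place.
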
Creating visual outputs