Machine Learning Series

A progressive series of machine learning projects in Python covering supervised and unsupervised learning workflows, from exploratory data analysis through classification, regression, and clustering.

Author: J. Wong

Projects

1. Exploratory Data Analysis

Dataset: Heart Disease - Kaggle (Neurocipher)

Comprehensive EDA on a 270-patient heart disease dataset. Covers data cleaning, correlation analysis, feature engineering, PCA, and frequentist tests of four hypotheses.

Topics: Correlation heatmaps · Polynomial & deviation features · PCA (11 components → 95% variance) · t-tests · ANOVA · Chi-square
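As an illustration of the PCA step, here is a minimal sketch of keeping enough principal components to explain 95% of the variance. It runs on synthetic data, not the project's dataset: the 270-row size matches the description above, but the 13 columns, the random values, and all variable names are assumptions made for the example. The component count it selects is therefore not expected to match the 11 reported above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a numeric feature table (270 rows, 13 hypothetical columns).
X = rng.normal(size=(270, 13))
# Make one column nearly redundant so PCA has something to compress.
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=270)

# PCA is scale-sensitive, so standardize each column first.
X_scaled = StandardScaler().fit_transform(X)

# A float n_components in (0, 1) asks for the smallest number of components
# whose cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X_scaled)

cumulative = np.cumsum(pca.explained_variance_ratio_)
n_kept = pca.n_components_
print(f"kept {n_kept} components, cumulative variance {cumulative[-1]:.3f}")
```

Passing a fraction instead of a fixed count lets the data decide how many components are needed, which is one common way to arrive at a figure like the one quoted in the topic list.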

2. Classification

Dataset: Mobile Price Classification - Kaggle (A. Sharma)

Benchmarks 10+ classification algorithms on a 4-class mobile pricing problem, with full hyperparameter tuning and model interpretability analysis.

Topics: Logistic Regression · KNN · SVM · Decision Trees · Random Forest · Gradient Boosting · XGBoost · LIME · Permutation Importance
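The benchmarking and interpretability ideas can be sketched roughly as follows. This is not the project's notebook code: it uses a synthetic 4-class problem from make_classification in place of the mobile pricing data, only three of the listed algorithms, and default-ish hyperparameters chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 4-class dataset (assumption: stands in for the real features).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Scale-sensitive models get a scaler inside the pipeline so that
# scaling is refit on each cross-validation training fold.
models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold accuracy per model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

# Permutation importance: shuffle one feature at a time on held-out data
# and measure how much the score drops.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
forest = models["random_forest"].fit(X_train, y_train)
result = permutation_importance(forest, X_test, y_test, n_repeats=5, random_state=0)
ranking = result.importances_mean.argsort()[::-1]  # most important first

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("feature ranking:", ranking)
```

Computing the importances on a held-out split rather than the training data keeps the ranking from rewarding features the model has merely memorized.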

3. Regression

Dataset: Student Performance - Kaggle (N. Narayan)

Validates regression assumptions and systematically evaluates regression techniques with cross-validation and regularization. Achieves R² ≈ 0.989.

Topics: Linear & Polynomial Regression · Lasso (L1) · Ridge (L2) · K-Fold Cross-Validation · GridSearchCV · Pipeline
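A minimal sketch of how Pipeline, K-fold cross-validation, and GridSearchCV fit together for regularized polynomial regression is shown below. The data is synthetic and the step names, parameter grid, and target function are assumptions for illustration, so the score it prints has no relation to the R² reported above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)

# Synthetic regression problem with one quadratic term (hypothetical data).
X = rng.uniform(-1, 1, size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

# Feature expansion, scaling, and the model live in one estimator, so every
# cross-validation fold refits all three steps on its own training portion.
pipe = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("scale", StandardScaler()),
    ("model", Ridge()),
])

# Parameters are addressed as <step name>__<parameter name>.
param_grid = {
    "poly__degree": [1, 2, 3],
    "model__alpha": [0.01, 0.1, 1.0, 10.0],
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="r2")
search.fit(X, y)

best_params = search.best_params_
best_r2 = search.best_score_  # mean cross-validated R^2 of the best setting
print(best_params, round(best_r2, 3))
```

Swapping Ridge for Lasso in the "model" step gives the L1 variant with the same grid keys.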

4. Unsupervised Learning

Datasets: Spotify Tracks - Kaggle (MaharshiPandya) · Sign Language MNIST - Kaggle (tecperson)

Applies dimensionality reduction and clustering to both tabular and image data, including pixel-level image segmentation and t-SNE visualization of hand sign embeddings.

Topics: PCA · KernelPCA · t-SNE · KMeans · DBSCAN · MeanShift · Agglomerative Clustering · Gaussian Mixture Models
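Pixel-level segmentation with KMeans can be sketched as below: each pixel is treated as a point in RGB space, clustered, and then repainted with its cluster's mean color. The two-tone synthetic image and the choice of two clusters are assumptions for the example; the project itself may load real images (for instance with OpenCV) and use different settings.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic 40x60 RGB image: reddish left half, bluish right half, light noise.
image = np.zeros((40, 60, 3), dtype=float)
image[:, :30] = [0.9, 0.1, 0.1]
image[:, 30:] = [0.1, 0.1, 0.9]
image = np.clip(image + rng.normal(scale=0.02, size=image.shape), 0.0, 1.0)

# Flatten to one row per pixel, one column per color channel.
pixels = image.reshape(-1, 3)

# Cluster pixel colors; n_init is set explicitly for consistent behavior
# across scikit-learn versions.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# Label map with the same height and width as the image.
labels = kmeans.labels_.reshape(image.shape[:2])

# Replace every pixel with the center color of its cluster.
segmented = kmeans.cluster_centers_[kmeans.labels_].reshape(image.shape)
print(labels.shape, segmented.shape)
```

Because only color is clustered, spatially distant pixels with similar colors land in the same segment; adding pixel coordinates as extra columns is one way to encourage spatially compact regions.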

Acknowledgments

Methods and workflow inspired by the IBM Machine Learning Professional Certificate on Coursera.

Technologies

Python · scikit-learn · pandas · NumPy · Matplotlib · Seaborn · Plotly · XGBoost · SciPy · Statsmodels · OpenCV · KaggleHub
