juni37/CSCA5622


Enhancing Customer Loyalty through Predictive Churn Analysis

Course Overview

This project was completed for CSCA 5622 (Introduction to Machine Learning: Supervised Learning), a course offered by CU Boulder, in which I earned an A- grade and 3 quarter credits. The course covered:

  • Theoretical and practical foundations of supervised learning using Python and Jupyter Notebook.
  • Machine learning models such as Linear Regression, Logistic Regression, Decision Trees, KNN, Ensembles, and Support Vector Machines (SVM).

Key Learning Outcomes:

  • Proficient use of modern machine learning tools and Python libraries.
  • Understanding of methods to address linearly inseparable data.
  • Comparison of the strengths and weaknesses of various supervised learning models.
  • Insights into ensemble methods and kernel techniques.

Project Summary

Objective: Develop a predictive churn analysis model for banking institutions to enhance customer loyalty by analyzing customer behavior and identifying churn risks.

Significance: Customer churn is a critical concern for financial institutions: each departing customer means lost revenue, and replacing that customer carries a high acquisition cost. Because retaining customers is more cost-effective than acquiring new ones, predictive churn analysis is a vital business tool.

Project Components

Introduction

The project aims to address customer retention challenges in banking by leveraging supervised learning techniques. Using real-world customer data, the analysis identifies key churn predictors and builds models to classify customers based on churn risk.

Methodology

  1. Data Preparation:

    • Data cleaning and exploratory data analysis (EDA) to identify trends and outliers.
    • Feature engineering to enhance model performance.
  2. Model Development:

    • Multiple machine learning models were implemented, including:
      • Logistic Regression
      • Decision Trees
      • K-Nearest Neighbors (KNN)
      • Ensemble methods such as Random Forest and Gradient Boosting
      • Support Vector Machines (SVM)
    • Hyperparameter tuning for model optimization.
  3. Evaluation Metrics:

    • Accuracy, Precision, Recall, and F1-Score were used to assess model performance.
    • Confusion matrix analysis provided insights into classification errors.
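As a rough illustration of how the evaluation metrics above relate to the confusion matrix, here is a minimal sketch in plain Python. The labels are made up for illustration (1 = churned, 0 = stayed) and are not results from this project; the helper names `confusion_counts` and `classification_metrics` are hypothetical, and in a notebook these values would more typically come from a library such as scikit-learn.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of the customers flagged as churners, how many really churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the customers who really churned, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels, for illustration only: 1 = churned, 0 = stayed.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

For churn data, where churners are usually the minority class, accuracy alone can look good even when most churners are missed, which is why precision, recall, and F1 are reported alongside it.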

Conclusion

The project demonstrated the potential of machine learning to address customer churn effectively. The optimized models can assist banking institutions in predicting churn risks and implementing proactive retention strategies, ultimately reducing revenue loss and improving customer satisfaction.

Models Used

  • Logistic Regression: To model the probability of customer churn.
  • Decision Trees: For intuitive classification and insight into feature importance.
  • K-Nearest Neighbors: To classify based on similarity metrics.
  • Ensemble Methods: Leveraging Random Forest and Gradient Boosting for robust predictions.
  • Support Vector Machines: For handling non-linear decision boundaries using kernel methods.
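The model families listed above could be compared side by side along the following lines. This is only a sketch under stated assumptions: it uses scikit-learn with a synthetic, imbalanced dataset from `make_classification` as a stand-in for the bank's customer data, and the hyperparameters shown are illustrative defaults rather than the tuned values from the project notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for churn data (assumption): about 20% positive class.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)

# Illustrative settings only; not the project's tuned hyperparameters.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
}

# Stratified folds keep the churn ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, model in models.items():
    # Scaling matters for KNN, SVM, and logistic regression; it is harmless
    # for the tree-based models, so one pipeline shape is used for all.
    pipeline = make_pipeline(StandardScaler(), model)
    scores[name] = cross_val_score(pipeline, X, y, cv=cv, scoring="f1").mean()

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} mean F1 = {score:.3f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold, so no information from the held-out fold leaks into preprocessing.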

Acknowledgments

I would like to thank Dr. Geena Kim, the instructor for CSCA 5622, for her guidance throughout the course. This project benefited from the theoretical and practical insights gained from the course material and assignments.

Future Work

  • Extend the analysis to include additional features such as customer demographics and transaction histories.
  • Explore deep learning approaches for churn prediction.
  • Integrate the model into a real-time banking application for predictive analytics.

Repository Contents
