Loan Eligibility Prediction using Machine Learning

Machine Learning

Tech Stack

Python
Machine Learning
NumPy
Pandas
Matplotlib
Scikit-learn

Description

This project focuses on predicting loan approval status using applicant financial and demographic data. A comprehensive machine learning pipeline was developed to preprocess data, train multiple ML models, and evaluate their performance.

  • Designed a data preprocessing pipeline using Pandas: handled missing values, encoded categorical variables, applied log transformations, and normalized numerical features.
  • Implemented and compared multiple ML models: Logistic Regression, Random Forest, Decision Tree, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM).
  • Evaluated model performance using accuracy, F1-score, and confusion matrices with Scikit-learn.
  • Conducted feature analysis and visualized the results using Matplotlib and Seaborn: feature importance (CreditHistory was identified as the most critical predictor), model metrics, feature distributions, and relationships between variables.
  • Achieved 78% accuracy with the Random Forest model, giving financial institutions a data-driven aid for loan approval decisions.
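
The steps above can be sketched end to end as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names other than CreditHistory (ApplicantIncome, LoanAmount, Education, LoanStatus) are assumed for the example, and StandardScaler is one possible reading of "normalized". Hyperparameters are illustrative, so the scores it prints say nothing about the 78% result reported above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_loans(n=400, seed=0):
    """Hypothetical stand-in for the applicant data (not the project's dataset)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "ApplicantIncome": rng.lognormal(mean=8.0, sigma=0.6, size=n),
        "LoanAmount": rng.lognormal(mean=4.8, sigma=0.5, size=n),
        "CreditHistory": rng.choice([0.0, 1.0], size=n, p=[0.2, 0.8]),
        "Education": rng.choice(["Graduate", "Not Graduate"], size=n),
    })
    # Synthetic label: approval follows credit history, with 15% of labels flipped
    flip = rng.random(n) < 0.15
    df["LoanStatus"] = np.where(df["CreditHistory"].eq(1.0) ^ flip, "Y", "N")
    # Blank out a few values to mimic missing data
    df.loc[rng.random(n) < 0.05, "LoanAmount"] = np.nan
    df.loc[rng.random(n) < 0.05, "Education"] = np.nan
    return df


def preprocess(df):
    df = df.copy()
    # Missing values: median for the numeric column, mode for the categorical one
    df["LoanAmount"] = df["LoanAmount"].fillna(df["LoanAmount"].median())
    df["Education"] = df["Education"].fillna(df["Education"].mode()[0])
    # Log-transform the right-skewed monetary features
    for col in ["ApplicantIncome", "LoanAmount"]:
        df[col] = np.log1p(df[col])
    # One-hot encode the categorical column and map the target to 0/1
    X = pd.get_dummies(
        df.drop(columns="LoanStatus"), columns=["Education"], drop_first=True
    ).astype(float)
    y = df["LoanStatus"].map({"N": 0, "Y": 1})
    return X, y


def compare_models(X, y, seed=0):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Fit the scaler on the training split only, so test statistics do not leak
    scaler = StandardScaler().fit(X_train)
    X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=200, random_state=seed),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train_s, y_train)
        pred = model.predict(X_test_s)
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "f1": f1_score(y_test, pred),
            "confusion_matrix": confusion_matrix(y_test, pred),
        }
    # Impurity-based feature importances from the fitted Random Forest
    importances = pd.Series(
        models["Random Forest"].feature_importances_, index=X.columns
    ).sort_values(ascending=False)
    return results, importances


if __name__ == "__main__":
    X, y = preprocess(make_synthetic_loans())
    results, importances = compare_models(X, y)
    for name, m in results.items():
        print(f"{name}: accuracy={m['accuracy']:.2f}, f1={m['f1']:.2f}")
    print(importances)
```

Scaling matters mainly for KNN, SVM, and Logistic Regression; the tree-based models are insensitive to it, so applying one scaled feature matrix to all five models is a simplification that keeps the comparison short.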