Abstract:
Banks and financial institutions collect public deposits and disburse funds through various loan 
products, which makes efficient management of loan portfolios essential. Non-performing Loans 
(NPLs), characterized by balances that remain continuously outstanding for over four installments 
or 120 days, pose significant risks to financial stability. This study focuses exclusively on 
customer-related factors influencing NPLs within ABC Finance PLC, a leading non-banking 
financial institution in Sri Lanka, and analyzes vehicle loan data from the previous fiscal year. 
The objective of this research is to use Machine Learning techniques to predict the likelihood 
that a loan becomes an NPL, based on customer, loan, vehicle, and account-related details. The 
study uses secondary data sourced from the lending database, split in an 80:20 ratio into training 
and test sets for model development and evaluation. Key steps include preprocessing, exploratory 
data analysis, feature engineering, and model selection. 
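As an illustration of the 80:20 split described above, the following is a minimal sketch using only the Python standard library. The function name `stratified_split`, the seed, and the toy labels are hypothetical, and stratifying by class is an assumption made here so that the NPL share is the same in both parts; the study does not state how its split was drawn.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up labels for illustration only: 1 = NPL, 0 = performing.
labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With these toy labels, the test part receives 20 of the 100 rows and keeps the 30% NPL share of the full set.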
Several Machine Learning models, including Naïve Bayes, K-Nearest Neighbors, Logistic 
Regression, Random Forest, Gradient Boosting, XGBoost, and Decision Tree classifiers, are 
trained and evaluated using the Confusion Matrix, Accuracy, Precision, Recall, and F1 score. 
The findings are intended to help responsible stakeholders, including recovery managers and 
branch heads, manage loan portfolios proactively and mitigate NPL risks. 
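To make the evaluation metrics named above concrete, here is a minimal sketch of how they follow from the four cells of a binary confusion matrix. The function names and the example labels are hypothetical and are not taken from the study's data; NPL is treated as the positive class (1) by assumption.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = NPL)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels and predictions for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For NPL prediction, recall on the NPL class measures how many loans that actually defaulted were flagged, while precision measures how many flagged loans actually defaulted.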
The models were compared on the vehicle loan dataset from ABC Finance PLC. Among the 
evaluated models, XGBoost emerged as the top-performing algorithm, particularly after 
hyperparameter tuning. After tuning, XGBoost achieved the highest test accuracy, 0.7221, and 
an Area Under the ROC Curve (AUC) of 0.8, indicating stronger discriminatory power than the 
other models. 
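For readers unfamiliar with AUC, the sketch below computes it through its rank interpretation: the probability that a randomly chosen NPL receives a higher predicted score than a randomly chosen performing loan, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical; this pairwise version is meant only to show the definition, not the procedure used in the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up labels and model scores for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.8, 0.6, 0.3, 0.2]
auc = roc_auc(y_true, scores)
print(round(auc, 4))
```

Under this reading, an AUC of 0.8 means that in about 80% of such pairs the model scores the non-performing loan above the performing one.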
The improvement in XGBoost's performance after tuning highlights the value of hyperparameter 
tuning for predictive performance. This finding supports the use of advanced machine learning 
techniques, such as XGBoost, to proactively identify and manage non-performing loans within 
financial institutions. 
In conclusion, the study's findings provide actionable insights for stakeholders at ABC Finance 
PLC to implement data-driven strategies for loan portfolio management and risk mitigation. By 
leveraging Machine Learning models like XGBoost, financial institutions can enhance 
decision-making processes, optimize resource allocation, and minimize the impact of non-performing 
loans on financial stability. Future research directions may focus on incorporating additional data 
sources and advanced modeling techniques to further improve predictive performance and 
address evolving challenges in loan portfolio management.