Turkish Journal of Electrical Engineering and Computer Sciences
DOI
10.3906/elk-1809-190
Abstract
A default loan (also called a nonperforming loan) occurs when the borrower fails to meet the bank's conditions and repayment cannot be made in accordance with the terms of a loan that has reached its maturity. In this study, we provide a predictive analysis of consumer behavior concerning a loan's first payment default (FPD) using a real dataset of consumer loans with approximately 600,000 records from a bank. We use logistic regression, naive Bayes, support vector machine, and random forest on oversampled and undersampled data to build eight different models to predict FPD loans. A two-class random forest using undersampling yielded more than 86% on all performance measures: accuracy, precision, recall, and F1-score. The corresponding scores are as high as 96% for oversampling. However, when tested on the real and balanced dataset, the performance of oversampling deteriorates, as generating synthetic data for an extremely imbalanced dataset harms the training procedure of the algorithms. The study also provides an understanding of the reasons for nonperforming loans and helps to manage credit risks more consciously.
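The resampling and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `undersample`, `oversample`, and `scores` are hypothetical, the oversampler simply duplicates minority rows at random rather than generating synthetic records, and no classifier is trained. It only shows how a two-class dataset might be balanced and how accuracy, precision, recall, and F1-score are computed from predictions.

```python
import random


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(y, []).append(row)
    return groups


def undersample(rows, labels, seed=0):
    """Randomly drop rows so every class matches the smallest class size."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    n = min(len(g) for g in groups.values())
    pairs = [(row, y) for y, g in groups.items() for row in rng.sample(g, n)]
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [y for _, y in pairs]


def oversample(rows, labels, seed=0):
    """Randomly duplicate rows so every class matches the largest class size.

    Assumption: plain duplication stands in for synthetic data generation.
    """
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    n = max(len(g) for g in groups.values())
    pairs = []
    for y, g in groups.items():
        extra = [rng.choice(g) for _ in range(n - len(g))]
        pairs.extend((row, y) for row in g + extra)
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [y for _, y in pairs]


def scores(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In a setup like the one described, resampling would typically be applied to the training split only, with the scores computed on held-out data.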
Keywords
machine learning, default loan, first payment default, imbalanced class problem, oversampling, undersampling
First Page
167
Last Page
181
Recommended Citation
KOÇ, UTKU and SEVGİLİ, TÜRKAN (2020) "Consumer loans' first payment default detection: a predictive model," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 28: No. 1, Article 12. https://doi.org/10.3906/elk-1809-190
Available at:
https://journals.tubitak.gov.tr/elektrik/vol28/iss1/12
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons