Unleashing class imbalance problem in loan dataset through a novel oversampling approach based on FCM
dc.contributor.author | Akter, Subrina | |
dc.date.accessioned | 2025-01-04T08:21:51Z | |
dc.date.issued | 2023-12 | |
dc.description | Vol.-1, Issue-1, December 2023, pp. 61-82 | |
dc.description.abstract | Commercial companies depend heavily on loan approval models, trained with machine learning and statistical methods, to predict loan status. However, imbalanced datasets pose a key challenge in this sector. To address this issue, this paper proposes a new oversampling method based on Fuzzy C-means (FCM) clustering, an algorithm that assigns each instance to several clusters with a graded degree of membership. K-Nearest Neighbor (KNN) and Decision Tree served as the base classifiers in an extensive test on the Kaggle loan dataset. Three imbalance ratios (2.2, 4.2, and 8.45) were used in the experiment. The proposed method was compared with SMOTE and WBOT using 5-fold cross-validation. The results showed that the proposed method outperformed both SMOTE and WBOT, obtaining higher average F-measure and G-mean values across both classifiers. These findings suggest that the proposed method can mitigate class imbalance while also improving prediction accuracy for loan approval. | |
dc.description.sponsorship | Department of Computer Science and Engineering, International Islamic University Chittagong | |
dc.identifier.issn | 3005-5873 | |
dc.identifier.uri | http://dspace.iiuc.ac.bd/handle/123456789/8479 | |
dc.language.iso | en | |
dc.publisher | CRP, International Islamic University Chittagong | |
dc.subject | Decision tree | |
dc.subject | F-measure | |
dc.subject | G-mean | |
dc.subject | Imbalanced dataset | |
dc.subject | KNN | |
dc.subject | Loan approval | |
dc.subject | Machine learning | |
dc.title | Unleashing class imbalance problem in loan dataset through a novel oversampling approach based on FCM | |
dc.type | Article |
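
The abstract reports results in terms of imbalance ratio, F-measure, and G-mean. As a rough illustration of how these quantities are commonly defined (not the paper's own code), the sketch below computes them from class counts and a binary confusion matrix. The function names and all counts are hypothetical, chosen only so that the ratio matches one of the values quoted in the abstract; the minority class is assumed to be the positive class.

```python
import math


def imbalance_ratio(n_majority, n_minority):
    """Majority-to-minority class size ratio (common definition, assumed here)."""
    return n_majority / n_minority


def f_measure_and_g_mean(tp, fn, fp, tn):
    """Return (F-measure, G-mean) for a binary confusion matrix.

    The minority class is treated as the positive class (an assumption).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean


# Hypothetical counts, for illustration only (not taken from the paper).
ratio = imbalance_ratio(n_majority=845, n_minority=100)
f_measure, g_mean = f_measure_and_g_mean(tp=80, fn=20, fp=40, tn=160)
print(f"IR={ratio:.2f}  F-measure={f_measure:.3f}  G-mean={g_mean:.3f}")
```

Both metrics fall when the minority class is predicted poorly, which is why they are often preferred over plain accuracy for evaluating oversampling methods on imbalanced data.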