An Ensemble Learning Approach for Improved Loan Fraud Detection: Comparing and Combining Machine Learning Models

Anuradha, Anuradha
MSc in Data Analytics
Dublin Business School
This thesis investigated loan fraud detection using machine learning techniques, focusing on Logistic Regression, Random Forest, AdaBoost, Long Short-Term Memory (LSTM) networks, and Convolutional Neural Networks (CNN). The study emphasized the importance of feature selection and explored forward, backward, and automatic selection methods to improve model performance. Comparative analysis showed that Random Forest consistently outperformed the other models in accuracy and efficiency, regardless of the feature selection technique. AdaBoost produced consistent results but at a higher computational cost, while the performance of LSTM and CNN was highly sensitive to the choice of feature selection method. The thesis concluded that feature selection was vital for optimizing machine learning models for fraud detection, with its impact varying considerably across algorithms. Random Forest emerged as a robust and efficient model for fraud detection, adaptable to various applications. The findings underscored the potential of machine learning to strengthen financial security and trust.
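
The forward feature selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the nearest-centroid scorer is a deliberately simple stand-in for models such as Random Forest, the toy data and the names `centroid_accuracy` and `forward_select` are hypothetical, and training accuracy is used only for brevity where a real study would typically use cross-validation.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]


def centroid_accuracy(X: Sequence[Row], y: Sequence[int], features: List[int]) -> float:
    """Training accuracy of a nearest-centroid classifier using only `features`.

    Stand-in scorer (assumption): any model-based score could be used instead.
    """
    if not features:
        return 0.0
    # Mean of each selected feature, computed per class.
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, t in zip(X, y):
        # Predict the class whose centroid is closest (squared Euclidean distance).
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )
        correct += pred == t
    return correct / len(y)


def forward_select(
    X: Sequence[Row],
    y: Sequence[int],
    score: Callable[[Sequence[Row], Sequence[int], List[int]], float],
) -> Tuple[List[int], float]:
    """Greedy forward selection: add the best feature until the score stops improving."""
    selected: List[int] = []
    best_score = 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        candidate = max(remaining, key=lambda f: score(X, y, selected + [f]))
        candidate_score = score(X, y, selected + [candidate])
        if candidate_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Hypothetical toy data: feature 0 is uninformative, feature 1 separates the
# classes, feature 2 is only partly informative.
X = [
    [1, 0.1, 0], [2, 0.2, 0], [1, 0.1, 0], [2, 0.3, 1],
    [1, 0.9, 1], [2, 0.8, 1], [1, 0.7, 1], [2, 0.9, 0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, best = forward_select(X, y, centroid_accuracy)
print(selected, best)
```

Backward selection follows the same pattern in reverse: start from all features and repeatedly drop the one whose removal hurts the score least.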