Abstract
The growing accessibility of technology has made it easier for customers to engage with financial services, but it has also generated vast amounts of data in which criminals can conceal illicit activity. Detecting money laundering has consequently become more difficult, as fraudulent transactions often blend in with legitimate ones. Traditional machine learning models struggle with these evolving patterns, particularly on highly imbalanced datasets where fraudulent transactions are rare. In this paper, we propose a machine learning approach for detecting money laundering. We conducted three experiments. The first addressed class imbalance by comparing several resampling methods; notably, SMOTE combined with Edited Nearest Neighbors (ENN) significantly outperformed the no-resampling baseline. The subsequent experiments applied this top-performing technique to K-Nearest Neighbors (KNN), XGBoost, and Random Forest models, leading to the development of an Optimized Ensemble Model (OE-XRK). The proposed model, which combines XGBoost, Random Forest, and KNN, outperformed both traditional methods and non-optimized ensemble approaches on F1 score, accuracy, precision, recall, and ROC AUC. Our findings highlight the critical role of effective data preprocessing and model optimization in financial fraud detection, and they offer practitioners and researchers evidence for pairing resampling with optimized ensemble methods. Future work may focus on further optimizing ensemble fraud detection models for scalability and adaptability in real-world scenarios.
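To make the resampling step concrete, the following is a minimal, standard-library sketch of the idea behind SMOTE + ENN. It is an illustration under stated assumptions, not the implementation used in the experiments: the toy data, the function names (`k_nearest`, `smote`, `edited_nearest_neighbours`), the neighbourhood size `k=3`, and the majority-vote editing rule are all assumptions introduced here. In practice a library implementation such as `SMOTEENN` from `imblearn.combine` would normally be used instead.

```python
import random
from math import dist


def k_nearest(point, pool, k):
    """Return indices of the k points in `pool` closest to `point` (Euclidean)."""
    return sorted(range(len(pool)), key=lambda i: dist(point, pool[i]))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        neighbour = others[rng.choice(k_nearest(base, others, k))]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def edited_nearest_neighbours(X, y, k=3):
    """Drop every sample whose label disagrees with the majority of its
    k nearest neighbours (assumed editing rule for this sketch)."""
    kept_X, kept_y = [], []
    for i, (xi, yi) in enumerate(zip(X, y)):
        others_X = X[:i] + X[i + 1:]
        others_y = y[:i] + y[i + 1:]
        agree = sum(others_y[j] == yi for j in k_nearest(xi, others_X, k))
        if 2 * agree > k:
            kept_X.append(xi)
            kept_y.append(yi)
    return kept_X, kept_y


# Made-up toy data: 40 "legitimate" points (label 0), 5 "illicit" points (label 1).
rng = random.Random(42)
legit = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(40)]
illicit = [(rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)) for _ in range(5)]

# Step 1 (SMOTE): oversample the minority class until the classes are balanced.
synthetic = smote(illicit, n_new=len(legit) - len(illicit))
X = legit + illicit + synthetic
y = [0] * len(legit) + [1] * (len(illicit) + len(synthetic))

# Step 2 (ENN): remove samples that look mislabelled relative to their neighbours.
X_clean, y_clean = edited_nearest_neighbours(X, y)
print("before cleaning:", y.count(0), "legitimate /", y.count(1), "illicit")
print("after cleaning: ", y_clean.count(0), "legitimate /", y_clean.count(1), "illicit")
```

The oversampling step balances the classes by interpolation rather than duplication, and the editing step then removes points from either class that sit in a neighbourhood dominated by the other class. The brute-force neighbour search here is quadratic and only suitable for toy inputs.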