Fraud detection mechanism for mobile money transactions using machine learning techniques

Midigaspege, Thimali

URI: http://dlib.iit.ac.lk/xmlui/handle/123456789/1461

Date: 2022

Abstract:

"Mobile money transactions have grown rapidly alongside the near-universal presence of mobile phones. Using a mobile phone, people can perform transactions without a bank account, quickly and easily. With the overwhelming growth in mobile transactions, fraudulent activity has also grown at a significant rate. Although the popularity of mobile transactions has exceeded predictions, mobile devices, apps and service providers still do not offer fully secure systems. It is therefore crucial to construct a highly effective fraud detection mechanism for mobile money transactions. However, very few studies address mobile money fraud detection, whereas other forms of financial fraud detection are well studied. This may be due to the novelty of the technology, its rapid growth and the limited data sets available in the field. To address this gap, this research was conducted with the objective of constructing a highly effective fraud detection mechanism using machine learning techniques. The publicly available synthetic PaySim data set was used for the research; it was generated in 2016 using data from a mobile service provider in Africa, with transactions collected over one month. The data set includes 6,320,620 transactions, of which only 8,213 are fraudulent. This makes it a highly imbalanced data set, which is a major barrier when using classification algorithms, because they tend to be biased towards the majority class when the data are imbalanced. Resampling techniques and ensemble methods were used in the research to mitigate this limitation. The Synthetic Minority Oversampling Technique (SMOTE) and the hybrid resampling technique SMOTE with Tomek links removal were used. 
Ninety-six models were designed by combining three data sets (imbalanced, SMOTE, SMOTE-Tomek links removal), three feature scaling methods and eight classification algorithms, namely: logistic regression, random forest, support vector machine, decision trees, naïve Bayes, gradient boosting and AdaBoost. Model performance was evaluated using the confusion matrix, precision, recall, F1 score, ROC AUC and execution time. The experimental results showed that the random forest classifier is a highly effective model in combination with SMOTE-Tomek links removal and the min-max scaler."
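
The class imbalance described in the abstract can be illustrated with a short calculation. The sketch below is not taken from the thesis; it only uses the two counts quoted above (6,320,620 transactions, 8,213 fraudulent) and the standard definitions of precision, recall and F1. The helper names (confusion_counts, precision_recall_f1) are hypothetical. It shows that a trivial model which labels every transaction as legitimate reaches an accuracy above 99.8% while detecting no fraud at all, which is presumably why the abstract reports precision, recall and F1 rather than accuracy.

```python
from typing import Sequence, Tuple

# Counts as quoted in the abstract (not re-derived from the data set itself).
TOTAL = 6_320_620
FRAUD = 8_213


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means fraud."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard definitions; a zero denominator is reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Share of fraudulent transactions in the data set (roughly 0.13%).
fraud_rate = FRAUD / TOTAL

# Majority-class baseline: predict "legitimate" for every transaction.
tp, fp, fn, tn = 0, 0, FRAUD, TOTAL - FRAUD
baseline_accuracy = (tp + tn) / TOTAL
baseline_precision, baseline_recall, baseline_f1 = precision_recall_f1(tp, fp, fn)

if __name__ == "__main__":
    print(f"fraud rate:        {fraud_rate:.4%}")
    print(f"baseline accuracy: {baseline_accuracy:.4%}")
    print(f"baseline recall:   {baseline_recall:.2f}")
    print(f"baseline F1:       {baseline_f1:.2f}")
```

Resampling methods such as SMOTE change the training distribution so that a classifier cannot score well simply by ignoring the minority class; the metrics above are then computed on predictions from the trained model.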
