Actual vs. Fake Job Posting Prediction: A Machine Learning Approach

Mubarak, Muzni

dc.contributor.author	Mubarak, Muzni
dc.date.accessioned	2025-07-02T04:15:34Z
dc.date.available	2025-07-02T04:15:34Z
dc.date.issued	2024
dc.identifier.citation	Mubarak, Muzni (2024) Actual vs. Fake Job Posting Prediction: A Machine Learning Approach. MSc. Dissertation, Informatics Institute of Technology	en_US
dc.identifier.issn	20221860
dc.identifier.uri	http://dlib.iit.ac.lk/xmlui/handle/123456789/2846
dc.description.abstract	"This study examines the complexities of detecting fraudulent job postings in imbalanced datasets, a common problem in machine learning (ML) applications. Using a large dataset of fake and genuine job advertisements, the study evaluates the effectiveness of several ML algorithms in identifying fake job advertisements. ML classifiers such as Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbors (KNN), Logistic Regression (LR), and Neural Networks (NN) are rigorously examined, with particular emphasis on performance criteria such as accuracy, precision, recall, and F1 score. Furthermore, the study uses Receiver Operating Characteristic (ROC) curve analysis to assess the models' performance and capabilities. To address the imbalance between fake and legitimate cases in the dataset, the study investigates the use of oversampling strategies to reduce bias and improve the classifiers' predictive capability. Neural Networks emerge as the most promising classifier, with higher accuracy rates amidst class imbalance. Notably, the use of oversampling approaches, such as the Synthetic Minority Over-sampling Technique (SMOTE) or the Adaptive Synthetic Sampling Method (ADASYN), results in significant improvements in classifier performance measures. Despite advances in detection accuracy, precision, recall, and F1 score, the study recognises the limits of working with imbalanced datasets. Challenges remain in ensuring optimal performance across all classes, especially when the minority class is substantially underrepresented. Furthermore, relying solely on traditional evaluation criteria such as accuracy and precision may fail to convey the intricacies of classifier performance.
To address these constraints, the report proposes several directions for future research. These include developing advanced oversampling strategies, improving assessment metrics to better reflect model performance on imbalanced classes, and exploring ensemble learning methodologies. By adopting these directions, research methods will be better able to mitigate the difficulties of imbalanced datasets, paving the way for future fraud detection systems that are more robust and efficient."	en_US
dc.language.iso	en	en_US
dc.subject	Machine Learning	en_US
dc.subject	Neural networks	en_US
dc.subject	Fraud detection	en_US
dc.title	Actual vs. Fake Job Posting Prediction: A Machine Learning Approach	en_US
dc.type	Thesis	en_US
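
The abstract's remark that accuracy alone may not convey classifier performance on imbalanced data can be illustrated with a small worked example. The sketch below is not taken from the dissertation: the label counts (100 postings, 5 of them fake), the two sets of predictions, and the helper name classification_metrics are all hypothetical, chosen only to show how accuracy, precision, recall, and F1 can diverge when the fake class is rare.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; labels are 1 (fake) and 0 (genuine).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Assumed toy data: 95 genuine postings (0) followed by 5 fake ones (1).
y_true = [0] * 95 + [1] * 5

# A trivial model that labels every posting as genuine.
always_genuine = [0] * 100

# A model that flags 4 genuine postings by mistake and catches 4 of the 5 fakes.
detector = [0] * 91 + [1] * 4 + [1] * 4 + [0] * 1

baseline_scores = classification_metrics(y_true, always_genuine)
detector_scores = classification_metrics(y_true, detector)
```

In this toy setting both models reach the same accuracy, yet the trivial model has zero recall on fake postings while the detector recovers most of them. This is the kind of gap that motivates reporting precision, recall, and F1 alongside accuracy.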