Digital Repository

Machine Learning Based Fraud Detection In Health Insurance

dc.contributor.author Dharmavijaya, Thilanka
dc.date.accessioned 2025-07-01T03:23:34Z
dc.date.available 2025-07-01T03:23:34Z
dc.date.issued 2024
dc.identifier.citation Dharmavijaya, Thilanka (2024) Machine Learning Based Fraud Detection In Health Insurance. MSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 2018561
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2796
dc.description.abstract "Insurance fraud is a growing problem in the economic sector, distinguished by its constantly changing nature and increasing complexity. As fraudsters continue to develop new strategies, conventional detection methods are proving insufficient, and more efficient and creative alternatives are required. This study enhances the precision and adaptability of insurance fraud detection by applying an approach based on machine learning. The study proposes a new model that takes advantage of the natural data imbalance in fraud detection scenarios. This model not only improves fraud identification rates but also offers valuable insights into how fraud processes work. The study aims to address current limitations in technique and to make a significant theoretical and practical contribution to the field of fraud prevention by combining domain-specific knowledge with advanced algorithms. The main objective of this study is to employ machine learning techniques to detect fraudulent activities in the insurance industry. First, an Exploratory Data Analysis (EDA) was performed to gain insight into the properties of the dataset. To tackle the issue of imbalanced data, where the number of fraudulent claims is substantially lower than the number of legitimate ones, oversampling techniques were used to balance the dataset. Data preprocessing included transforming categorical variables with one-hot encoding. Additionally, the dataset was split into distinct training and testing sets. Subsequently, a range of machine learning techniques was employed, including Random Forest, Decision Tree, and Support Vector Machine (SVM). Among them, the highest accuracy achieved was 0.87. The study improved the models' performance by fine-tuning hyperparameters and addressing the imbalance between fraudulent and authentic claims." en_US
dc.language.iso en en_US
dc.subject Insurance Fraud Detection en_US
dc.subject Machine Learning Techniques en_US
dc.subject Anomaly Detection en_US
dc.title Machine Learning Based Fraud Detection In Health Insurance en_US
dc.type Thesis en_US
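
The preprocessing steps named in the abstract (one-hot encoding, a train/test split, and oversampling of the minority class) can be illustrated with a minimal sketch. This is not the dissertation's code: the toy records, the column names `claim_type`, `amount` and `is_fraud`, and the helper functions are all hypothetical, and plain random oversampling by duplication is assumed because the abstract does not say which oversampling technique was used. Only the Python standard library is used, so the classifiers (Random Forest, Decision Tree, SVM) are left out.

```python
import random
from collections import Counter


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per observed category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


def train_test_split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and cut off the first test_fraction as the test set."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def random_oversample(rows, label, rng):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: roughly 10% of claims are labelled fraudulent.
rng = random.Random(0)
claims = [
    {
        "claim_type": rng.choice(["inpatient", "outpatient", "pharmacy"]),
        "amount": rng.randint(100, 5000),
        "is_fraud": 1 if i % 10 == 0 else 0,
    }
    for i in range(100)
]

encoded = one_hot(claims, "claim_type")
train, test = train_test_split(encoded, 0.2, rng)
train_balanced = random_oversample(train, "is_fraud", rng)

print("train before:", Counter(r["is_fraud"] for r in train))
print("train after: ", Counter(r["is_fraud"] for r in train_balanced))
```

In this sketch the split happens before oversampling, so duplicated fraud rows never appear in the test set. That ordering is a design choice of the example, not a statement about the order used in the dissertation.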

