Bone fractures are a critical medical condition requiring early and accurate detection to ensure
timely treatment, yet conventional analysis relies on manual interpretation, which is time-
consuming and prone to error (Sharma et al., 2025). While deep learning models have been
applied to this problem, they often struggle with the small and complex datasets typical in
medical imaging, leading to unreliable results (Alwzwazy et al., 2025). The primary limitations
hindering their clinical adoption are a heavy dependence on large, expert-annotated datasets,
which are difficult to acquire, and a lack of model transparency, which erodes clinical trust
(Alwzwazy et al., 2025).
To address this gap, this research designed and developed a novel, end-to-end fracture detection
system built upon a Vision Transformer (ViT) architecture. This approach diverges from
traditional supervised methods by leveraging a domain-specific Self-Supervised Learning (SSL)
strategy, a technique that has shown significant potential to enhance clinical diagnostics (Wang
and Siddiqui, 2024). The core of the solution is a two-stage process. First, a standard ViT-
Base/16 encoder was pre-trained on a large corpus of unlabeled musculoskeletal radiographs
from the MURA dataset using a Masked Autoencoder (MAE) framework. This forces the
model to learn rich, high-level semantic features of radiographic anatomy without requiring
any human-provided labels. Subsequently, this pre-trained encoder was fine-tuned for fracture
classification using a smaller, labeled subset of the MURA dataset.
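For illustration, the random patch masking at the heart of the MAE pre-training stage can be sketched as follows. This is a minimal sketch, not the thesis implementation: the patch count assumes a ViT-Base/16 encoder on 224×224 radiographs (a 14×14 grid of 196 patches), the 75% mask ratio follows common MAE practice, and the function name is illustrative.

```python
import random

def random_patch_mask(num_patches=196, mask_ratio=0.75, seed=None):
    """Split patch indices into visible and masked sets, as in MAE
    pre-training: the encoder sees only the visible patches, and the
    decoder must reconstruct the masked ones from that context."""
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked

visible, masked = random_patch_mask(seed=0)
print(len(visible), len(masked))  # 49 visible patches, 147 masked
```

Because the reconstruction target is derived from the image itself, no human labels are needed at this stage, which is what allows the full unlabeled MURA corpus to be used.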
The developed SSL-ViT model was systematically evaluated on a hold-out validation set, demonstrating the viability and effectiveness of the proposed approach. The system achieved a validation accuracy of 86.90% and an area under the receiver operating characteristic curve (AUC-ROC) of 0.8686, indicating a strong capability to distinguish fractured from non-fractured cases. Analysis
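The reported AUC-ROC has a direct probabilistic reading: it is the probability that a randomly chosen fractured case receives a higher model score than a randomly chosen non-fractured case. A minimal sketch of that pairwise formulation (function name illustrative; ties receive half credit):

```python
def auc_roc(labels, scores):
    """AUC via the Mann-Whitney formulation: fraction of
    positive/negative pairs ranked correctly by the scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```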
of the training dynamics confirmed that the SSL pre-training provided a robust foundation,
enabling the model to learn effectively from limited labeled data. These results validate that the
combination of domain-adaptive self-supervised learning with Vision Transformers presents a
promising pathway toward creating more data-efficient, accurate, and trustworthy AI tools for
clinical diagnostics.