| dc.description.abstract |
Diabetic retinopathy (DR) is a chronic eye disease that, if left untreated, leads to
irreversible vision loss. Visual grading of DR stages from retinal fundus images is
time-consuming, subjective, and error-prone because of the fine gradations between stages.
Early and accurate detection is therefore crucial for timely clinical intervention. The core
problem addressed in this project is the automatic staging of DR into its five grades using
deep learning, while tackling the challenges of class imbalance, model interpretability, and
computational efficiency.
This was achieved with a hybrid deep learning model built on EfficientNetB3 as the backbone
architecture, augmented with a custom Convolutional Block Attention Module (CBAM). The
CBAM layer was inserted after the convolutional feature extractor to refine feature maps through
channel-wise and spatial attention mechanisms. To address class imbalance, the dataset was
rebalanced with targeted oversampling. In addition, a focal loss function was used to
down-weight the loss contribution of easy examples and focus learning on harder-to-classify
samples. The model was implemented in TensorFlow and trained in two phases: first with the
convolutional layers frozen, then with fine-tuning enabled. Preprocessing involved
resizing, normalization, and data augmentation to improve generalization.
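The two custom components above can be illustrated with a minimal NumPy sketch: the channel-attention half of CBAM (the spatial half, a 7×7 convolution over channel-pooled maps, is omitted for brevity) and the standard multi-class focal loss formula. The reduction ratio, α, and γ values here are illustrative assumptions, not the settings used in the project.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Channel-attention half of CBAM.

    features: (H, W, C) feature map from the backbone.
    w1 (C, C//r) and w2 (C//r, C) form the shared two-layer MLP;
    r is the reduction ratio (an illustrative choice here).
    """
    avg_desc = features.mean(axis=(0, 1))            # (C,) average-pooled descriptor
    max_desc = features.max(axis=(0, 1))             # (C,) max-pooled descriptor
    mlp = lambda v: np.maximum(v @ w1, 0.0) @ w2     # shared MLP with ReLU hidden layer
    weights = 1.0 / (1.0 + np.exp(-(mlp(avg_desc) + mlp(max_desc))))  # sigmoid gate per channel
    return features * weights                        # rescale each channel

def focal_loss(probs, labels, alpha=0.25, gamma=2.0):
    """Per-sample multi-class focal loss.

    probs: (N, C) softmax outputs; labels: (N,) integer class indices.
    With gamma=0 and alpha=1 this reduces to plain cross-entropy.
    """
    p_t = np.clip(probs[np.arange(len(labels)), labels], 1e-7, 1.0)
    return -alpha * (1.0 - p_t) ** gamma * np.log(p_t)  # (1-p_t)^gamma down-weights easy samples
```

For a confidently correct prediction (p_t = 0.9), the modulating factor (1 − p_t)^γ shrinks the loss by roughly two orders of magnitude relative to cross-entropy, which is what shifts the gradient signal toward the harder, minority-class samples.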
The model was evaluated with standard classification metrics. On the test set, it achieved an
overall accuracy of 77.42% and a macro-averaged ROC AUC of 0.9414, alongside 84% training
accuracy, indicating strong multi-class classification performance. Per-class accuracy
exceeded 98% for the most severe DR stages (classes 3 and 4), confirming the model's
robustness in identifying critical cases. These results support the effectiveness of combining
EfficientNetB3 with attention mechanisms and focal loss for medical imaging applications. |
en_US |