| dc.description.abstract |
Medication misidentification continues to have a serious effect on healthcare, aggravated by language barriers and look-alike medications. Current identification systems are unusable in diverse environments. To improve accessibility and safety, this project addresses that need by building an accurate and reliable pill recognition web application.
The proposed solution uses a hybrid machine learning model (feature fusion) that combines a Vision Transformer (ViT) and EfficientNet (a CNN) to capture both detailed and contextual features from pill images. It adapts to varied lighting conditions and backgrounds through different techniques. Model reliability is improved through a structured training process consisting of transfer learning, hyperparameter tuning, and finally hybrid feature fusion.
A user-friendly interface provides multilingual access and community engagement features for broad usability. In testing, the binary ResNet model achieved 99.81% accuracy for pill vs. non-pill detection, while the hybrid ViT-EfficientNetB3 multi-class model reached 99.16% accuracy, 0.99 precision, 0.99 recall, and a 0.99 F1-score across 20 pill types. These results, supported by confusion matrices and benchmarking against standalone CNNs and Transformers, confirm robust performance. |
en_US |