| dc.contributor.author | Haupe Liyanage, Saveen Dinuwara | |
| dc.date.accessioned | 2026-04-08T06:25:57Z | |
| dc.date.available | 2026-04-08T06:25:57Z | |
| dc.date.issued | 2025 | |
| dc.identifier.citation | Haupe Liyanage, Saveen Dinuwara (2025) IntegraBNER: Integrating Multimodal Data Sources for Enhanced Biological Named Entity Recognition. BSc. Dissertation, Informatics Institute of Technology | en_US |
| dc.identifier.issn | 20210180 | |
| dc.identifier.uri | http://dlib.iit.ac.lk/xmlui/handle/123456789/3143 | |
| dc.description.abstract | Biomedical studies and clinical reporting often involve enormous amounts of unstructured text and diagnostic images, which makes it challenging to extract informative biological named entities such as diseases, anatomical sites, and clinical findings. Traditional Named Entity Recognition (NER) models operate on textual data only and therefore miss potentially crucial visual information embedded in radiologic imaging. This research, IntegraBNER, bridges this gap by proposing a multimodal NER solution that integrates radiology reports and chest X-ray images for more accurate and context-aware entity recognition in the biomedical field. IntegraBNER uses Bio_ClinicalBERT to extract text features and ResNet-50 to build visual representations of the chest X-rays. A fusion model with an attention-based MLP classifier combines the two modalities to perform multi-label biomedical entity prediction. The system is trained and evaluated on the OpenI Indiana Chest X-ray dataset, which is labeled automatically using a hybrid autolabeling strategy that combines synonym dictionary matching with the output of two fine-tuned BioBERT NER models for better coverage. To test usability and interpretability, IntegraBNER offers a Gradio-based GUI in which users can enter a radiology report and/or upload an X-ray image. The UI returns highlighted entities and a tabular output with predicted terms, confidence values, and entity types from a carefully curated vocabulary. IntegraBNER has been successfully hosted on Hugging Face Spaces to facilitate public accessibility and expert validation. Evaluation shows strong performance in text-based prediction (micro F1-score: 0.94) and illustrates the complementary nature of visual information. While fusion-based accuracy is still being improved (F1-score: 0.77), high recall is achieved on common medical conditions such as pneumothorax and pleural effusion. In-depth error analysis revealed that predictions for rare entities and cases with weak image features call for improved fusion techniques and greater labeling depth. Expert evaluation conducted through a guided form validated the novelty and utility of the solution, highlighting its promise for practical applications in radiology-based decision support and biomedical text mining. Subject Descriptors (from ACM): Computing methodologies → Machine learning → Neural networks → Deep learning; Information systems → Information retrieval → Text mining → Named entity recognition; Applied computing → Life and Medical sciences → Computational Biology → Bioinformatics | en_US |
| dc.language.iso | en | en_US |
| dc.subject | Biological Named Entity Recognition | en_US |
| dc.subject | Multimodal Data Fusion | en_US |
| dc.subject | Radiology Reports | en_US |
| dc.title | IntegraBNER: Integrating Multimodal Data Sources for Enhanced Biological Named Entity Recognition | en_US |
| dc.type | Thesis | en_US |
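
The abstract reports micro-averaged F1 scores for multi-label entity prediction. As a reading aid, the following is a minimal sketch of how a micro F1 score is commonly computed for multi-label outputs: true positives, false positives, and false negatives are pooled over all examples before the score is taken. The function name `micro_f1` and the toy labels are illustrative assumptions, not the thesis's evaluation code or data.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over multi-label examples.

    Each element of true_sets / pred_sets is the set of entity labels
    for one example. Counts are pooled across all examples first.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # labels predicted and present
        fp += len(pred - truth)   # labels predicted but absent
        fn += len(truth - pred)   # labels present but missed
    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there is nothing to score.
    return 2 * tp / denom if denom else 0.0


# Toy, made-up labels for three reports (not results from the thesis).
gold = [{"pneumothorax"}, {"pleural effusion", "cardiomegaly"}, set()]
pred = [{"pneumothorax"}, {"pleural effusion"}, {"cardiomegaly"}]
score = micro_f1(gold, pred)
print(round(score, 3))
```

Because counts are pooled before averaging, frequent entities dominate a micro F1 score, which is consistent with the abstract's note that rare entities remain harder to predict.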