Digital Repository

Disease Synonym Generalization in BioNER


dc.contributor.author Sivanesathurai, Tharagan
dc.date.accessioned 2026-04-10T07:24:21Z
dc.date.available 2026-04-10T07:24:21Z
dc.date.issued 2025
dc.identifier.citation Sivanesathurai, Tharagan (2025) Disease Synonym Generalization in BioNER. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20210273
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/3157
dc.description.abstract Problem: Biomedical Named Entity Recognition (BioNER) is essential for tasks such as information extraction, clinical decision-making, and literature mining. However, most current BioNER systems rely on surface-form matching, which limits their ability to identify disease terms that are synonymous but expressed in varied ways across the biomedical literature. This weakness leads to missed entities and reduced usefulness in downstream applications. To improve semantic understanding and span-level synonym retrieval, this study introduces a collaborative learning architecture that combines BioNER with contrastive learning for synonym generalization. Methodology: The proposed approach integrates a fine-tuned BioBERT model for disease entity recognition with a contrastive learning module that embeds and retrieves disease synonyms using span-level representations. After preprocessing steps such as sentence segmentation, BIO tagging, and synonym-based data augmentation, the model is trained on two publicly accessible datasets, the NCBI Disease Corpus and BC5CDR. The model is served through a FastAPI backend and deployed as a Chrome plugin that performs real-time disease recognition and synonym suggestion. Evaluation uses precision, recall, F1-score, Recall@5, and Mean Reciprocal Rank (MRR). Results: The BioNER model obtained a precision of 76%, a recall of 59%, and an F1-score of 67% for disease recognition, while the contrastive head obtained a Recall@5 of 0.5506 and an MRR of 0.5208 for synonym retrieval. These results indicate that the integrated strategy supports synonym-aware entity recognition. The browser-based deployment improves real-world usability by making biomedical material easier to access for researchers, students, and medical professionals. en_US
dc.language.iso en en_US
dc.subject Biomedical Machine Learning en_US
dc.subject Real Time Disease Recognition en_US
dc.title Disease Synonym Generalization in BioNER en_US
dc.type Thesis en_US
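
The abstract reports Recall@5 and Mean Reciprocal Rank (MRR) for synonym retrieval without defining them. The sketch below shows one common way these metrics are computed; it is not taken from the dissertation. It assumes Recall@k is the fraction of gold synonyms found in the top k retrieved candidates, averaged over queries (some works instead count a hit if any gold synonym appears), and that MRR averages the reciprocal rank of the first correct candidate. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], gold: Set[str], k: int = 5) -> float:
    """Fraction of gold synonyms that appear among the top-k candidates."""
    if not gold:
        return 0.0
    return len(set(ranked[:k]) & gold) / len(gold)


def reciprocal_rank(ranked: Sequence[str], gold: Set[str]) -> float:
    """1 / rank of the first correct candidate, or 0.0 if none is retrieved."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            return 1.0 / rank
    return 0.0


def evaluate(queries: List[Tuple[Sequence[str], Set[str]]], k: int = 5) -> Tuple[float, float]:
    """Return (mean Recall@k, MRR) over (ranked candidates, gold synonyms) pairs."""
    if not queries:
        return 0.0, 0.0
    mean_recall = sum(recall_at_k(r, g, k) for r, g in queries) / len(queries)
    mrr = sum(reciprocal_rank(r, g) for r, g in queries) / len(queries)
    return mean_recall, mrr


# Hypothetical toy data: each entry is (ranked retrieval output, gold synonym set).
toy_queries = [
    (["heart attack", "myocardial infarction", "stroke"], {"myocardial infarction"}),
    (["flu", "cold", "influenza"], {"influenza", "grippe"}),
]

if __name__ == "__main__":
    mean_recall, mrr = evaluate(toy_queries, k=5)
    print(f"Recall@5 = {mean_recall:.4f}, MRR = {mrr:.4f}")
```

Under these assumptions, a Recall@5 of 0.5506 would mean that on average about 55% of the gold synonyms appear in the top five candidates, and an MRR of 0.5208 would mean the first correct synonym typically appears around rank two.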

