Digital Repository

Hybrid Solution for Low Resource Neural Machine Translations From English to Sinhala

dc.contributor.author Rajapaksha, Venura
dc.date.accessioned 2025-06-17T05:52:29Z
dc.date.available 2025-06-17T05:52:29Z
dc.date.issued 2024
dc.identifier.citation Rajapaksha, Venura (2024) Hybrid Solution for Low Resource Neural Machine Translations From English to Sinhala. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 2019294
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2611
dc.description.abstract "Neural Machine Translation (NMT) has experienced rapid advancements in recent years, particularly with the advent of large language models such as BERT and GPT. These developments have streamlined the translation process. However, large language models (LLMs) depend heavily on extensive datasets to achieve optimal performance, which presents challenges for low-resource languages such as Sinhala. Moreover, translating special phrases such as idioms poses additional difficulties, as existing methods typically provide only word-for-word translations of idioms. In this study, the author aims to create a specialized dataset to train an NMT model capable of accurately translating English idioms into their corresponding Sinhala meanings. The proposed system will incorporate a primary model to handle idiom translations and a fail-safe mechanism that leverages existing OpenAI models and the Google Translate API. The fail-safe mechanism will activate if the primary model's output does not meet a predefined accuracy threshold: the BLEU (Bilingual Evaluation Understudy) score of the primary model's translation is compared against this threshold, and if the score falls below it, the fail-safe mechanism takes over the translation task. In the tests conducted and evaluated with the BLEU score, the proposed system scored 72.83, considerably better than the 14.09 scored by Google Translate, which is currently the best translator for special phrases in low-resource settings." en_US
dc.language.iso en en_US
dc.subject Machine translations (MT) en_US
dc.subject NMT (Neural Machine translations) en_US
dc.subject English as second language (ESL) en_US
dc.title Hybrid Solution for Low Resource Neural Machine Translations From English to Sinhala en_US
dc.type Thesis en_US
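
The threshold-based fail-safe described in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the dissertation's implementation: the function names (`simple_bleu`, `translate_with_fallback`), the threshold value of 50.0, and the assumption that a reference translation is available for scoring are all hypothetical. The BLEU function is a simplified sentence-level variant (clipped unigram and bigram precision with a brevity penalty), and the translators are passed in as plain callables rather than real OpenAI or Google Translate clients.

```python
import math
from collections import Counter
from typing import Callable, Tuple


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 2) -> float:
    """Simplified sentence-level BLEU on a 0-100 scale.

    Assumption: whitespace tokenization and n-grams up to max_n only,
    so scores are not comparable to standard BLEU tooling.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = ngram_counts(cand, n)
        ref_ngrams = ngram_counts(ref, n)
        total = sum(cand_ngrams.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        if total == 0 or overlap == 0:
            return 0.0
        log_precision += math.log(overlap / total) / max_n
    # Brevity penalty for candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100.0 * bp * math.exp(log_precision)


def translate_with_fallback(
    text: str,
    reference: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    threshold: float = 50.0,  # hypothetical value; the abstract gives none
) -> Tuple[str, str]:
    """Return (translation, source), using the fallback translator when the
    primary model's score falls below the threshold."""
    output = primary(text)
    if simple_bleu(output, reference) >= threshold:
        return output, "primary"
    return fallback(text), "fallback"
```

In this sketch, `primary` would wrap the idiom-trained NMT model and `fallback` would wrap an external translation service; both are left as callables so the control flow can be read and tested without network access.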

