Digital Repository

Solving Ambiguities in Romanized Sinhala to Sinhala Transliteration

dc.contributor.author Dharmasiri, Thudawattage
dc.date.accessioned 2024-04-02T05:17:18Z
dc.date.available 2024-04-02T05:17:18Z
dc.date.issued 2023
dc.identifier.citation Dharmasiri, Thudawattage (2023) Solving Ambiguities in Romanized Sinhala to Sinhala Transliteration. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 2019099
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/1955
dc.description.abstract The widespread adoption of social media and instant messaging has made it essential to communicate in one's native tongue. Sinhala speakers frequently write in both Romanized Sinhala and native Sinhala, but machine transliteration from Romanized Sinhala to native Sinhala can produce inaccurate output. This is due to the informal text shorthand known as "Singlish-based shorthand words". Rule-based transliteration systems may not be compatible with the ad hoc transliterations used in Singlish. To address this issue, a novel hybrid approach combining rule-based machine translation and neural machine translation has been proposed. By combining the strengths of both approaches, the proposed transliterator has the potential to considerably enhance reverse transliteration and improve communication in native Sinhala. The implemented neural machine translation model has been evaluated using multiple metrics. The BLEU score of 0.84 indicates that Sinhala transliterations generated from Romanized Sinhala text are accurate. In addition, the WER score of 0.16 demonstrates the model's ability to transcribe Sinhala text from its Romanized form accurately. The accuracy score of 0.84 indicates that the model predicted the correct Sinhala transliteration in 84% of the test cases. Precision and recall scores of 0.861 and 0.862, respectively, indicate that the model accurately identified Sinhala words and their transliterations. The F1-score of 0.723 indicates that the model is well-balanced regarding precision and recall. The ROUGE-L score of 1.00 indicates that the model obtained perfect overlap between the generated and reference Sinhala transliterations in every test case. Because the rule-based approach achieved a low BLEU score, it was used as part of the suggestion component rather than as the primary transliteration component.
Even though the preliminary test results are promising, additional testing and refinement are necessary to improve the overall performance of the machine translation models. The proposed hybrid approach has the potential to considerably enhance communication in native Sinhala and reverse transliteration. en_US
dc.language.iso en en_US
dc.publisher en_US
dc.subject Romanized Sinhala en_US
dc.subject Singlish en_US
dc.subject Machine transliteration en_US
dc.title Solving Ambiguities in Romanized Sinhala to Sinhala Transliteration en_US
dc.type Thesis en_US
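
The abstract reports a WER score of 0.16. As a rough illustration of what that metric measures, the sketch below computes word error rate as the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. The function name `word_error_rate`, the whitespace tokenization, and the example sentences are assumptions made for illustration; the dissertation's actual evaluation code is not part of this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Illustrative sketch only: tokens are split on whitespace, and
    substitutions, insertions, and deletions each cost 1.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of three words differs from the reference.
reference = "මම ගෙදර යනවා"
hypothesis = "මම ගෙදර යනව"
print(word_error_rate(reference, hypothesis))
```

Under this definition, a WER of 0.16 would correspond to roughly 16 word-level edits per 100 reference words, assuming the reported score was computed at the word level.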