Abstract:
This research addresses the challenge of low-resource Neural Machine Translation (NMT) between Sinhala and Tamil, two major languages of Sri Lanka for which parallel corpora are critically scarce; the resulting shortage of direct training data yields poor translation quality. While conventional pivot-based translation (Sinhala → English → Tamil) is a common workaround, it suffers from error propagation, semantic loss, and computational inefficiency, ultimately diminishing translation accuracy.
The study proposes an innovative solution: **Pivot-Based Transfer Learning**. The core aim is to improve Sinhala-Tamil NMT accuracy by leveraging the linguistic knowledge embedded in large pre-trained multilingual models, specifically **mBART**, through **Low-Rank Adaptation (LoRA)**, an adapter-style parameter-efficient fine-tuning method. This approach preserves the base model's knowledge while adapting it efficiently to the low-resource language pair, mitigating the error accumulation inherent in traditional pivoting.
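To make the parameter-efficiency claim concrete, the following is a minimal NumPy sketch of the LoRA idea applied to a single frozen weight matrix. All dimensions and hyperparameters here are illustrative, not the actual mBART configuration: the pretrained weight `W` stays frozen, and only two small low-rank factors `A` and `B` are trained.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 4   # illustrative sizes; LoRA rank r << d
W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight

# LoRA trains only the low-rank factors B (d_out x r) and A (r x d_in).
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))  # B starts at zero, so the update is initially zero
alpha = 8.0               # scaling hyperparameter

def adapted_forward(x):
    # Effective weight is W + (alpha / r) * B @ A,
    # applied without ever materializing the full updated matrix.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)

# At initialization the adapted model matches the frozen base model exactly,
# which is why LoRA "retains the base model's knowledge".
assert np.allclose(adapted_forward(x), W @ x)

trainable = A.size + B.size   # 512 parameters
total = W.size                # 4096 parameters
print(f"trainable fraction: {trainable / total:.3%}")
```

At realistic model scales (e.g. mBART's hidden size of 1024 with a small rank) the trainable fraction is far below the 12.5% of this toy example, which is what makes the fine-tuning fast and memory-efficient.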