Digital Repository

TamilSpeech: Advancing TTS with Cutting-Edge Deep Learning Technologies

dc.contributor.author Siritharran, Kullaleeni
dc.date.accessioned 2025-07-01T06:30:55Z
dc.date.available 2025-07-01T06:30:55Z
dc.date.issued 2024
dc.identifier.citation Siritharran, Kullaleeni (2024) TamilSpeech: Advancing TTS with Cutting-Edge Deep Learning Technologies. MSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20191240
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2814
dc.description.abstract "This research develops a text-to-speech (TTS) system for the Tamil language using cutting-edge deep learning technologies. The goal is to synthesize Tamil speech from a given Tamil text. TTS systems have demonstrated their value by transforming written text into audible speech, benefiting diverse user groups. The emphasis on Tamil arises from the need to embrace linguistic diversity. Most significantly, TTS technology plays a vital role in helping visually impaired individuals navigate digital devices and manage their daily lives. Tamil is an ancient classical language spoken by 15% of Sri Lanka's population, and the worldwide Tamil-speaking population is estimated at around 80 million, making it one of the most widely spoken languages in the world. It poses challenges for automated speech synthesis because of its unique characters and diverse phonetic sounds. This study is motivated by the lack of research on Tamil TTS applications that use cutting-edge deep learning technologies. The research implements advanced TTS models, namely Parallel Tacotron 2 and HiFi-GAN, to improve the synthesis of natural-sounding Tamil speech. The models were built with the PyTorch framework after a thorough analysis of the relevant literature. Two models were trained on the data to perform two different roles: a text encoder model based on Parallel Tacotron 2, and a speech decoder model based on HiFi-GAN that generates the speech waveform. A user interface was created as a Flask application to deliver the solution. Unit testing was carried out, and the report presents training and validation loss as part of the evaluation." en_US
dc.language.iso en en_US
dc.subject Tamil Text-to-Speech (TTS) en_US
dc.subject Parallel Tacotron 2 en_US
dc.subject HiFi-GAN en_US
dc.title TamilSpeech: Advancing TTS with Cutting-Edge Deep Learning Technologies en_US
dc.type Thesis en_US

