Digital Repository

AudioSangraha - An Approach Transforming Sinhala Audio into Summaries

dc.contributor.author Mohamed Shafy, Mohamed Aqeel
dc.date.accessioned 2025-06-19T07:39:31Z
dc.date.available 2025-06-19T07:39:31Z
dc.date.issued 2024
dc.identifier.citation Mohamed Shafy, Mohamed Aqeel (2024) AudioSangraha - An Approach Transforming Sinhala Audio into Summaries. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20200705
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2692
dc.description.abstract "Language is a unique form of communication between human beings and their environment, and the voice is the most natural way of communicating with others. Many speech technologies are now available for a range of tasks, but speech recognition and summarization for low-resource languages remains a prominent research area. Sinhala is one such low-resource language, as there are not enough resources for it available on the internet. Nowadays, people are not inclined to listen to continuous audio content. Even when they do listen, they tend to skip through the audio to find the information they want, and as a result they may get the wrong picture of that information. As a solution, the author proposes a system for summarizing continuous Sinhala audio content, so that people can save valuable time while still getting the correct information from the audio easily. The system takes continuous audio files as input and generates a summary as output. For the proposed system, the author trained a model using a transfer learning approach, fine-tuning the pre-trained Whisper AI model for Sinhala. On the test set it obtained a CER of 0.3. The transcripts generated from the continuous audio files are then combined into a paragraph and sent to the summarization model, which summarizes by scoring sentences using a word-frequency approach, and the summary output is generated. Keywords - Natural Language Processing, Speech Recognition, Extractive Summarization, Audio Summarization Subject Descriptors: Computing methodologies → Artificial Intelligence → Natural Language Processing → Speech Recognition Computing methodologies → Artificial Intelligence → Natural Language Processing → Text Summarization" en_US
dc.language.iso en en_US
dc.subject Sinhala Audio Summarizer en_US
dc.title AudioSangraha - An Approach Transforming Sinhala Audio into Summaries en_US
dc.type Thesis en_US
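
The abstract names two techniques that a short example may make more concrete: the character error rate (CER) used to evaluate the transcription model, and extractive summarization by word-frequency sentence scoring. The sketch below is a minimal illustration under stated assumptions, not the dissertation's implementation. All function names, the punctuation set, the whitespace tokenizer, the sentence-splitting rule, and the max_sentences parameter are hypothetical. CER is computed over Unicode code points, which may differ from a count over Sinhala grapheme clusters, and no stop-word removal is applied.

```python
import re
from collections import Counter

# Assumption: this punctuation set and whitespace tokenization are
# illustrative choices, not taken from the dissertation.
PUNCT = ".,!?;:\"'()"


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein edit distance divided by the reference length.

    Works on Unicode code points, not grapheme clusters (an assumption).
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]
        for j, h in enumerate(hypothesis, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


def split_sentences(text: str) -> list[str]:
    """Split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence: str) -> list[str]:
    """Split on whitespace and strip surrounding punctuation."""
    words = (t.strip(PUNCT).lower() for t in sentence.split())
    return [w for w in words if w]


def summarize(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, in their original order.

    A sentence's score is the sum of the document-wide frequencies
    of its words, so longer sentences tend to score higher.
    """
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    freq = Counter(w for words in tokenized for w in words)
    scores = [sum(freq[w] for w in words) for words in tokenized]
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return " ".join(sentences[i] for i in sorted(ranked[:max_sentences]))
```

In a pipeline like the one the abstract describes, the transcripts of the audio segments would be joined into one paragraph and passed to summarize. Whitespace tokenization is used here instead of a regular-expression word pattern because Python's \w does not match Sinhala combining vowel signs, which would split words apart.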

