Digital Repository

An Abstractive Approach to Podcast Text Summarization

Show simple item record

dc.contributor.author Thassim, Areefa
dc.date.accessioned 2023-01-13T03:59:02Z
dc.date.available 2023-01-13T03:59:02Z
dc.date.issued 2022
dc.identifier.citation Thassim, Areefa (2022) An Abstractive Approach to Podcast Text Summarization. MSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20200385
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/1411
dc.description.abstract "Podcasts are a medium of entertainment becoming increasingly popular. When transcribed they are a rich source of data for natural language processing tasks. Podcasts are diverse in structure, are of varying length and tend to explore topics in a conversational manner. Text summarization is a field of NLP that is complex due to the difficulty in generating summaries that are concise and grammatically correct. These factors cause generating abstractive summaries more difficult. This research aims to generate abstractive summaries with minimal grammatical errors and minimal redundant data for podcast transcripts using deep learning models. The state-of-the-art model, GPT-Neo has been finetuned to achieve this. The research solution includes a Flask API to use the GPT-Neo finetuned model and a ReactJS based web application for efficient summarization. Evaluation of the research model consisted of manual evaluation of generated summaries and the ROUGE score. Manual evaluation concluded that most summaries were concise and grammatically correct. However, around 20% of resulting summaries faced faults such as repetition of phrases. The ROUGE score analysis showcased that a satisfactory ROUGE-1 metric was achieved exceeding existing research. Overall, the research provides a novel design and approach for summarization of podcast transcripts, with interesting results which could be used to further extend research in this domain." en_US
dc.language.iso en en_US
dc.subject Podcasts en_US
dc.subject Text summarization en_US
dc.subject GPT-Neo en_US
dc.title An Abstractive Approach to Podcast Text Summarization en_US
dc.type Thesis en_US


Files in this item

This item appears in the following Collection(s)

Show simple item record

Search


Advanced Search

Browse

My Account