dc.contributor.author |
Thassim, Areefa |
|
dc.date.accessioned |
2023-01-13T03:59:02Z |
|
dc.date.available |
2023-01-13T03:59:02Z |
|
dc.date.issued |
2022 |
|
dc.identifier.citation |
Thassim, Areefa (2022) An Abstractive Approach to Podcast Text Summarization. MSc. Dissertation, Informatics Institute of Technology |
en_US |
dc.identifier.issn |
20200385 |
|
dc.identifier.uri |
http://dlib.iit.ac.lk/xmlui/handle/123456789/1411 |
|
dc.description.abstract |
"Podcasts are an increasingly popular medium of entertainment. When
transcribed, they are a rich source of data for natural language processing (NLP) tasks.
Podcasts are diverse in structure, vary in length, and tend to explore topics in a
conversational manner. Text summarization is a field of NLP that is complex due to
the difficulty of generating summaries that are concise and grammatically correct.
These factors make generating abstractive summaries of podcasts more difficult.
This research aims to generate abstractive summaries with minimal grammatical
errors and minimal redundant data for podcast transcripts using deep learning
models. The state-of-the-art model GPT-Neo has been fine-tuned to achieve this. The
research solution includes a Flask API that serves the fine-tuned GPT-Neo model and a
ReactJS-based web application for efficient summarization.
The research model was evaluated through manual evaluation of the generated
summaries and through ROUGE scores. Manual evaluation concluded that most summaries
were concise and grammatically correct. However, around 20% of the resulting
summaries had faults such as repeated phrases. The ROUGE analysis
showed that the model achieved a satisfactory ROUGE-1 score, exceeding that of existing
research.
Overall, the research provides a novel design and approach for summarizing
podcast transcripts, with interesting results that could be used to further extend
research in this domain." |
en_US |
dc.language.iso |
en |
en_US |
dc.subject |
Podcasts |
en_US |
dc.subject |
Text summarization |
en_US |
dc.subject |
GPT-Neo |
en_US |
dc.title |
An Abstractive Approach to Podcast Text Summarization |
en_US |
dc.type |
Thesis |
en_US |