Abstract:
A large number of news articles are created and published every day, and users cannot browse
through all of them to find the news that interests them. There is also a lack of
standardized datasets for Sinhala, which makes it challenging to train and evaluate NLP models effectively. This work proposes a combination of extractive and abstractive summarization using a domain-specific, fine-tuned pre-trained model. The model is trained on a custom-built dataset categorized by news domain so that it captures the characteristics of each domain.