
Parameter Efficient Diverse Paraphrase Generation Using Sequence-Level Knowledge Distillation


dc.contributor.author Jayawardena, Lasal
dc.contributor.author Yapa, Prasan
dc.date.accessioned 2025-04-23T07:14:58Z
dc.date.available 2025-04-23T07:14:58Z
dc.date.issued 2024
dc.identifier.citation Jayawardena, L. and Yapa, P. (2024) ‘Parameter Efficient Diverse Paraphrase Generation Using Sequence-Level Knowledge Distillation’, in 2024 5th International Conference on Advancements in Computational Sciences (ICACS), pp. 1–12. Available at: https://doi.org/10.1109/ICACS60934.2024.10473289. en_US
dc.identifier.uri https://ieeexplore.ieee.org/document/10473289
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2264
dc.description.abstract Over the past year, the field of Natural Language Generation (NLG) has experienced a rapid surge, largely due to the introduction of Large Language Models (LLMs). These models deliver strong performance across a wide range of Natural Language Processing and Generation tasks. However, applying them to domain-specific tasks such as paraphrasing presents significant challenges: their large parameter counts make them difficult to run on commodity hardware, and their slow inference leads to high costs in a production setting. In this study, we address these obstacles by employing LLMs to develop three distinct paraphrasing models through a method referred to as sequence-level knowledge distillation. The distilled models maintain the quality of paraphrases generated by the LLM while offering faster inference and the ability to generate diverse paraphrases of comparable quality. A notable characteristic of these models is that they exhibit syntactic diversity while also preserving lexical diversity, a combination previously uncommon owing to data quality issues in existing datasets and rarely observed in neural approaches. Human evaluation shows only a 4% drop in performance compared to the LLM teacher used in the distillation process, despite the distilled models being 1000 times smaller. This research contributes a more efficient and cost-effective solution for paraphrasing tasks to the NLG field. en_US
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.subject Natural Language Processing en_US
dc.subject Knowledge Distillation en_US
dc.subject Large Language Models en_US
dc.subject Deep Learning en_US
dc.title Parameter Efficient Diverse Paraphrase Generation Using Sequence-Level Knowledge Distillation en_US
dc.type Article en_US
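
Note: the abstract above describes sequence-level knowledge distillation, in which a large teacher model generates paraphrases that then serve as training targets for a much smaller student model. The Python sketch below illustrates that general pipeline only; the checkpoints (google/flan-t5-xl, google/flan-t5-small), the "paraphrase:" prompt, and all hyperparameters are illustrative assumptions and are not taken from the paper.

    # Minimal sketch of sequence-level knowledge distillation for paraphrasing.
    # Assumptions (not from the paper): Hugging Face seq2seq checkpoints for the
    # teacher and student; names and hyperparameters are placeholders.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    teacher_name = "google/flan-t5-xl"      # hypothetical teacher checkpoint
    student_name = "google/flan-t5-small"   # hypothetical student checkpoint

    teacher_tok = AutoTokenizer.from_pretrained(teacher_name)
    teacher = AutoModelForSeq2SeqLM.from_pretrained(teacher_name).eval()
    student_tok = AutoTokenizer.from_pretrained(student_name)
    student = AutoModelForSeq2SeqLM.from_pretrained(student_name)

    sources = [
        "The weather turned cold overnight.",
        "She finished the report before the deadline.",
    ]

    # 1) The teacher samples diverse paraphrases; these output sequences become
    #    the distillation targets (sequence-level KD trains the student on the
    #    teacher's generated text, not on its token-level distributions).
    pairs = []
    with torch.no_grad():
        for src in sources:
            inputs = teacher_tok(f"paraphrase: {src}", return_tensors="pt")
            outs = teacher.generate(**inputs, do_sample=True, top_p=0.95,
                                    num_return_sequences=3, max_new_tokens=64)
            for out in outs:
                tgt = teacher_tok.decode(out, skip_special_tokens=True)
                pairs.append((src, tgt))

    # 2) Fine-tune the small student on the teacher-generated (source, paraphrase) pairs.
    optimizer = torch.optim.AdamW(student.parameters(), lr=3e-4)
    student.train()
    for epoch in range(3):
        for src, tgt in pairs:
            enc = student_tok(f"paraphrase: {src}", return_tensors="pt")
            labels = student_tok(tgt, return_tensors="pt").input_ids
            loss = student(**enc, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

In practice the teacher's sampled outputs can be filtered for quality and diversity before student training, and the student can be fine-tuned with parameter-efficient methods; the paper's exact setup is described in the full text available at the DOI above.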

