Digital Repository

Intrinsic Plagiarism Detection in Sinhala language documents

dc.contributor.author Amarasinghe, Charith Thiwanka
dc.date.accessioned 2022-02-28T05:58:26Z
dc.date.available 2022-02-28T05:58:26Z
dc.date.issued 2021
dc.identifier.citation Amarasinghe, Charith Thiwanka (2021) Intrinsic Plagiarism Detection in Sinhala language documents. MSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 2019574
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/789
dc.description.abstract "This research study addresses intrinsic plagiarism detection in Sinhala language documents. Considerably few studies have been done on plagiarism detection and authorship verification for the Sinhala language. This research proposes an anomaly detection-based approach that classifies text portions according to how anomalous they are compared with the neighboring context, using features extracted with a word embedding-based approach. In the study, multiple feature extraction methods, anomaly detection algorithms, and a supervised algorithm were used in a series of experiments to identify the combination that performs best for the Sinhala language. The study uses paragraph-level features to distinguish segments with anomalous behavior. The proposed solution was able to classify plagiarized content with an accuracy of 85% and an F1-score of 0.40." en_US
dc.language.iso en en_US
dc.subject Plagiarism detection en_US
dc.subject Anomaly detection en_US
dc.subject Text analytics en_US
dc.subject Data mining en_US
dc.title Intrinsic Plagiarism Detection in Sinhala language documents en_US
dc.type Thesis en_US
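
The abstract's core idea, flagging paragraphs whose features look anomalous relative to the neighboring context, can be illustrated with a minimal sketch. This is not the thesis's method: the record does not name the feature extractors or anomaly detection algorithms used, so the cosine-distance score, the z-score rule, the function names, and the `window` and `z_threshold` parameters below are all assumptions chosen for illustration. The input is assumed to be one embedding vector per paragraph.

```python
import math
from statistics import mean, pstdev


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors (1.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def flag_anomalous_paragraphs(vectors, window=2, z_threshold=1.5):
    """Return indices of paragraphs that deviate from their neighbors.

    Hypothetical scoring rule (an assumption, not taken from the thesis):
    each paragraph is scored by its cosine distance to the centroid of up
    to `window` paragraphs on each side, and flagged when that score's
    z-score across the document exceeds `z_threshold`.
    """
    distances = []
    for i, vec in enumerate(vectors):
        lo, hi = max(0, i - window), min(len(vectors), i + window + 1)
        neighbors = [vectors[j] for j in range(lo, hi) if j != i]
        if not neighbors:
            distances.append(0.0)
            continue
        # Centroid of the neighboring paragraph vectors, column by column.
        centroid = [mean(col) for col in zip(*neighbors)]
        distances.append(cosine_distance(vec, centroid))

    mu, sigma = mean(distances), pstdev(distances)
    if sigma == 0:
        return []  # every paragraph looks the same; nothing stands out
    return [i for i, d in enumerate(distances) if (d - mu) / sigma > z_threshold]


# Toy two-dimensional "embeddings"; the fourth paragraph points elsewhere.
toy_vectors = [
    [1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.1, 1.0],
    [0.95, 0.1], [1.0, 0.15], [0.9, 0.05],
]
flagged = flag_anomalous_paragraphs(toy_vectors)
```

Scoring each paragraph only against its own document is what makes the detection intrinsic: no external reference corpus is consulted.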

