HateChecker – Hate Speech Detection System in Sinhala - English code - mixed language

Liyanage, U. L. O. G

dc.contributor.author	Liyanage, U. L. O. G
dc.date.accessioned	2022-03-16T09:11:55Z
dc.date.available	2022-03-16T09:11:55Z
dc.date.issued	2021
dc.identifier.citation	Liyanage, U. L. O. G (2021) HateChecker – Hate Speech Detection System in Sinhala - English code - mixed language. BSc. Dissertation, Informatics Institute of Technology	en_US
dc.identifier.issn	2017594
dc.identifier.uri	http://dlib.iit.ac.lk/xmlui/handle/123456789/1028
dc.description.abstract	"With the steady increase in internet usage, the propagation of hate speech on the internet is also steadily increasing. Social media sites, review forums, and microblogging sites encourage users to express their thoughts with minimal restrictions. This leads some users to express hate towards others who do not share their beliefs. This study focuses on identifying hate speech in text written in Sinhala-English code-mixed language (Singlish), which is mostly used by Sri Lankans on the internet. Due to the unavailability of Sinhala-English code-mixed datasets, a dataset was created from comments on YouTube and Facebook. Eight machine learning algorithms and three ensemble approaches were developed to detect hate speech in Singlish, and their performance was evaluated in terms of accuracy, F1-score, precision, and recall. Support Vector Machine (SVM), Multinomial Naïve Bayes (MNB), Bernoulli Naïve Bayes (BNB), and Logistic Regression classifiers were used to build the ensemble approaches, which were developed using soft voting, hard voting, and stacking. The stacking approach outperformed the other baseline algorithms with 85.71% accuracy and an 83.78% F1-score."	en_US
dc.language.iso	en	en_US
dc.subject	Singlish	en_US
dc.subject	Stacking	en_US
dc.subject	Ensemble Approach	en_US
dc.subject	Sinhala-English Code-Mixed Language	en_US
dc.subject	Hate Speech Detection	en_US
dc.title	HateChecker – Hate Speech Detection System in Sinhala - English code - mixed language	en_US
dc.type	Thesis	en_US
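
The abstract names hard voting and soft voting as two of the ensemble approaches. The following is a minimal sketch of how the two rules can differ, not the thesis implementation: the probability values are hypothetical placeholders, and the function names hard_vote and soft_vote and the labels "hate" and "not_hate" are assumptions made for illustration. Stacking is left out because it needs a trained meta-classifier.

```python
from collections import Counter


def hard_vote(label_votes):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(label_votes).most_common(1)[0][0]


def soft_vote(prob_votes):
    """Average the per-class probabilities and return the highest-scoring class."""
    classes = list(prob_votes[0].keys())
    averaged = {
        c: sum(p[c] for p in prob_votes) / len(prob_votes) for c in classes
    }
    return max(averaged, key=averaged.get)


# Hypothetical class probabilities for one comment from four base classifiers.
# These numbers are illustrative placeholders, not results from the thesis.
probs = [
    {"hate": 0.45, "not_hate": 0.55},  # e.g. SVM
    {"hate": 0.40, "not_hate": 0.60},  # e.g. Multinomial Naive Bayes
    {"hate": 0.48, "not_hate": 0.52},  # e.g. Bernoulli Naive Bayes
    {"hate": 0.95, "not_hate": 0.05},  # e.g. Logistic Regression
]

# Each classifier's hard label is its most probable class.
labels = [max(p, key=p.get) for p in probs]

print("hard voting:", hard_vote(labels))
print("soft voting:", soft_vote(probs))
```

In this made-up case, three of the four classifiers lean slightly towards "not_hate", so hard voting returns "not_hate". The fourth is very confident in "hate", which lifts the averaged "hate" probability to 0.57, so soft voting returns "hate".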