Digital Repository

“Pulse Lanka” - An NLP model with Opinion Mining to Classify the Code-Mixed Code-Switched Singlish Reviews in Sri Lankan E-Commerce Platforms with Emoji Interpretation

dc.contributor.author Gunawardana, Adeesha
dc.date.accessioned 2025-06-04T04:49:23Z
dc.date.available 2025-06-04T04:49:23Z
dc.date.issued 2024
dc.identifier.citation Gunawardana, Adeesha (2024) “Pulse Lanka” - An NLP model with Opinion Mining to Classify the Code-Mixed Code-Switched Singlish Reviews in Sri Lankan E-Commerce Platforms with Emoji Interpretation. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20210348
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2415
dc.description.abstract "As Sri Lankan E-Commerce platforms rapidly evolve, understanding customer sentiment through reviews has become a critical challenge, particularly due to the prevalent use of code-mixed and code-switched language that combines English with Sinhalese, known as Singlish. This linguistic complexity significantly hinders automated opinion mining tools, which are primarily designed for monolingual text. As a result, businesses and researchers alike must rely on manual labour to decipher these reviews. This process is not only time-consuming but also requires a nuanced understanding of both English and Sinhalese. The dependency on human analysis limits the scalability of sentiment extraction and poses a significant bottleneck in harnessing the full potential of consumer feedback. This study aims to address the gap in automated Sentiment Analysis for Singlish reviews on Sri Lankan E-Commerce platforms, highlighting the need for advanced solutions that can navigate the intricacies of code-switching and code-mixing to extract sentiments efficiently and accurately. To tackle the challenge of analysing sentiment in Singlish reviews on Sri Lankan E-Commerce platforms, the author proposed an innovative solution that circumvents the complexities of code-mixed and code-switched language. The core of this solution involves taking the raw, code-mixed, code-switched reviews and transforming them into pure Sinhala Unicode through a sophisticated transliteration process. This critical step ensures that the linguistic nuances and cultural context embedded in the original reviews are preserved and made more accessible for computational analysis. Following transliteration, the Sinhala Unicode texts are translated into English, creating a uniform dataset that is more amenable to analysis with conventional NLP techniques.
By applying a variety of NLP methodologies to this translated corpus, the study efficiently extracts and interprets the sentiment expressed in the original reviews. In the evaluation, the study compared several ML models, of which Naïve Bayes and Random Forest gave the best results. Building on these results, an ensemble of the two models further improved the key performance metrics: F1 Score, Precision and Recall. This approach underscored the potential of combining ML techniques to interpret consumer sentiments more accurately." en_US
dc.language.iso en en_US
dc.subject Natural Language Processing en_US
dc.subject Opinion Mining en_US
dc.subject Code-Mixing en_US
dc.title “Pulse Lanka” - An NLP model with Opinion Mining to Classify the Code-Mixed Code-Switched Singlish Reviews in Sri Lankan E-Commerce Platforms with Emoji Interpretation en_US
dc.type Thesis en_US
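
Illustrative note: the abstract reports that an ensemble of Naïve Bayes and Random Forest improved the metrics, but it does not say how the two models were combined. The sketch below shows one common option, soft voting, in which each model's class probabilities are averaged and the highest-scoring class is chosen. This is an assumption for illustration, not the dissertation's confirmed method. The function names (soft_vote, predict) and the probability values are hypothetical and do not come from the study.

```python
from typing import Dict, List

# Maps a sentiment label to the probability a model assigns to it.
Probs = Dict[str, float]


def soft_vote(model_outputs: List[Probs]) -> Probs:
    """Average per-class probabilities across models (soft voting)."""
    if not model_outputs:
        raise ValueError("need at least one model output")
    labels = list(model_outputs[0].keys())
    n = len(model_outputs)
    return {label: sum(p[label] for p in model_outputs) / n for label in labels}


def predict(model_outputs: List[Probs]) -> str:
    """Return the label with the highest averaged probability."""
    averaged = soft_vote(model_outputs)
    return max(averaged, key=averaged.get)


# Hypothetical probabilities for a single translated review.
# These numbers are made up for the example, not results from the study.
naive_bayes_probs = {"positive": 0.55, "negative": 0.45}
random_forest_probs = {"positive": 0.30, "negative": 0.70}

averaged = soft_vote([naive_bayes_probs, random_forest_probs])
label = predict([naive_bayes_probs, random_forest_probs])
print(averaged, label)
```

In this made-up case the two models disagree, and averaging lets the more confident model decide the final label. A hard majority vote over only two models would tie whenever they disagree, which is one reason soft voting is a plausible choice for a two-model ensemble.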

