<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>2018</title>
<link>http://dlib.iit.ac.lk/xmlui/handle/123456789/19</link>
<description/>
<pubDate>Tue, 14 Apr 2026 11:44:54 GMT</pubDate>
<dc:date>2026-04-14T11:44:54Z</dc:date>
<item>
<title>Effective Pre-auction sample selection through a machine learning model for Sri Lanka Tea Board</title>
<link>http://dlib.iit.ac.lk/xmlui/handle/123456789/31</link>
<description>Effective Pre-auction sample selection through a machine learning model for Sri Lanka Tea Board
Wijesinghe, W M I H
A scientific method of selecting all potentially fraudulent tea samples from the weekly pre-auction, in which tea samples are bid on by tea exporting companies for export purposes, is critical to mitigating fraud in tea exports from Sri Lanka.&#13;
No system had previously been built to predict fraudulent tea samples in the tea industry, although research has been conducted on predicting anomalies in other domains. Building a proper mechanism to predict fraudulent samples received by the tea tasting unit of the Sri Lanka Tea Board is therefore a challenge for the government’s regulatory/testing body. The focus of this research is to implement a pre-auction sample selection system using a machine learning approach, developing a self-learning sampling system that picks the samples most likely to be fraudulent for physical testing.&#13;
This report aims to produce an optimized Automated Sample Selection Model using a machine learning algorithm. The training dataset was obtained from the Tea Board’s main MS SQL database with a customized query. The core algorithm of the system is the Random Forest (RF) algorithm, optimized with values obtained from the Receiver Operating Characteristic (ROC) curve and the Precision-Recall curve. The model is implemented with a well-known web-based microservice architecture to maintain the efficiency and scalability of the product. Flask was used as a RESTful API offering web services in Python. In addition, Jinja2, JavaScript, and open-source machine learning libraries and technologies were used to finalize the product.&#13;
Results obtained through the system were impressive: the system offered high accuracy and could, with modifications, be used to predict fraud in other domains. Furthermore, the researcher introduced modifications to the current architecture and an IoT-based intelligent fraud detection mechanism using the Google Machine Learning Platform (GMLP) as future enhancements.
</description>
<pubDate>Mon, 01 Jan 2018 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://dlib.iit.ac.lk/xmlui/handle/123456789/31</guid>
<dc:date>2018-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Automatic Summarization of Privacy policies using Deep Learning</title>
<link>http://dlib.iit.ac.lk/xmlui/handle/123456789/30</link>
<description>Automatic Summarization of Privacy policies using Deep Learning
Wannakuwatte, Devmie
With the increase in web users and websites, the safety of users’ privacy online is a major concern. When accessing a website or using computer software, users are presented with a privacy statement that details what information is collected from them and how. Most users do not take the time to read this policy, and even those who do often find the language hard for an average user to grasp.&#13;
To assist with this task, this research aims to create a tool that summarizes the content of a privacy policy, identifying what information is being captured under it. The proposed application takes a full-text privacy policy as input and generates a visual representation of what information is being collected.&#13;
The model used for this information retrieval is a GRU-based Recurrent Neural Network text classifier that can identify 8 different categories of collected information. In the testing carried out, this model yielded an accuracy of 0.81 on test data. The average wait time for an output was about 497 ms for a policy of around 2,000 words. Based on these results, the research can be considered a successful attempt at summarizing privacy policies, which may give end users an advantage in understanding the content of privacy policies much more easily.
</description>
<pubDate>Mon, 01 Jan 2018 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://dlib.iit.ac.lk/xmlui/handle/123456789/30</guid>
<dc:date>2018-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Automatic Sarcasm Detection by leveraging Conversational Context</title>
<link>http://dlib.iit.ac.lk/xmlui/handle/123456789/29</link>
<description>Automatic Sarcasm Detection by leveraging Conversational Context
Shariq, Mohomed Shafeek
The dawn of Web 2.0 has seen massive growth in user-generated content on the web. Almost all websites and applications today provide some form of interaction with their users, allowing them to share their experiences and opinions. Furthermore, the popularity of social media has provided a platform for users to voice their opinions freely. The growth of content shared by users paved the path to the study of sentiment analysis and opinion mining. While many approaches are being studied in the field of sentiment analysis, one of the main challenges in identifying sentiment accurately is the presence of sarcasm on social media and online forums. Automatic sarcasm detection is paramount for improving the outcomes of sentiment analysis. The popularity of deep learning in NLP has prompted researchers to investigate the use of deep learning for sarcasm detection, and recent studies have examined the ability to leverage contextual information in predicting sarcasm.&#13;
This research strives to leverage conversational context to detect sarcasm in a particular domain. Topics such as politics have proved to elicit sarcasm more than other topics, so this research primarily focuses on predicting sarcasm in discussions of politics. DBpedia Spotlight was used to identify statements related to politics. Several deep learning architectures were investigated for detecting sarcasm by leveraging information from the conversational context, and the study shows that very good results were obtained using this approach. Moreover, transfer learning methods were investigated for predicting sarcasm in conversations related to sports. The study shows that transfer learning can be used effectively to train models with a comparatively smaller training dataset while still achieving good results.
</description>
<pubDate>Mon, 01 Jan 2018 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://dlib.iit.ac.lk/xmlui/handle/123456789/29</guid>
<dc:date>2018-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Serverless Computing Architecture for Fingerprint-based Audio Duplicate Identification</title>
<link>http://dlib.iit.ac.lk/xmlui/handle/123456789/28</link>
<description>Serverless Computing Architecture for Fingerprint-based Audio Duplicate Identification
Randeniya, Ishan
This research explores a highly scalable distributed algorithm for similarity search in large audio databases. Even as the database scales, processing time remains close to a constant average across all levels of scaling.&#13;
Based on the background and previous work, the rationale is to find the best-matching audio; since the comparison seeks the best-matching sequence of frequencies, using audio fingerprints is the most suitable method.&#13;
Instead of comparing complete clips, the approach using fingerprints and hashes is more efficient, because the comparison involves only the relevant hashes rather than traversing every file or hash.&#13;
The research describes a serverless architecture for identifying audio using this fingerprint-based approach. For the fingerprinting, MinHash was identified as the most suitable hashing algorithm.
</description>
<pubDate>Mon, 01 Jan 2018 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://dlib.iit.ac.lk/xmlui/handle/123456789/28</guid>
<dc:date>2018-01-01T00:00:00Z</dc:date>
</item>
</channel>
</rss>
