Abstract:
A scientific method for selecting all potentially fraudulent combinations from the weekly pre-auction, in which tea exporting companies bid on tea samples to be purchased for export, is critical to mitigating fraud in tea exports from Sri Lanka. Building a proper mechanism to predict fraudulent samples received by the tea tasting unit of the Sri Lanka Tea Board is therefore a challenge for the government's regulatory and testing body. The main focus of this paper is to identify a suitable pre-auction sample selection algorithm, using a machine learning approach, in order to develop a self-learning sampling system that picks the most likely fraudulent samples for physical testing. The performance impact of the tea tasting data feature set is demonstrated on Locally Weighted Learning, Logistic Regression, Random Forest, C4.5, Naïve Bayes and Bayesian Network. Though classification accuracy is similar across algorithms, computational performance can differ significantly. We therefore show that it is worthwhile to differentiate algorithms based on computational performance rather than on classification accuracy alone.