Crypto and Forex Trading Scam Tweets Detection System

Seyed, Ruzaik

dc.contributor.author	Seyed, Ruzaik
dc.date.accessioned	2024-03-04T04:02:06Z
dc.date.available	2024-03-04T04:02:06Z
dc.date.issued	2023
dc.identifier.citation	Seyed, Ruzaik (2023) Crypto and Forex Trading Scam Tweets Detection System. BSc. Dissertation, Informatics Institute of Technology	en_US
dc.identifier.issn	2018439
dc.identifier.uri	http://dlib.iit.ac.lk/xmlui/handle/123456789/1809
dc.description.abstract	Spam detection on Twitter using text classification is a well-developed area of NLP. A considerable body of research already exists, and its outcomes have brought the field to maturity within existing resource constraints. However, because of the complexity and constraints of many steps in systematic NLP techniques, detecting crypto/forex trading scams is a specialist topic that has not been addressed previously. This research provides an overview of how to detect potential crypto/forex trading scam tweets using an ensemble technique for a binary classification model. First, the tweets were pre-processed to clean and format them for modeling. Then, the dataset was divided into training and testing sets using a train-test split, and a Random Forest Classifier model was fitted on the training set. For each tweet in the dataset, the predictions of the individual trees are aggregated into a final prediction. A benchmark analysis was performed using multiple algorithms, including Support Vector Machine (SVM), Logistic Regression (LR), KNeighborsClassifier (KN), and Random Forest Classifier (RF), among others, to identify the most suitable classification method, with the lowest possible error rate, for the implementation of the prototype. According to the findings of the benchmark analysis, the Random Forest Classifier surpasses the other approaches, giving an overall accuracy of 98.7% on unseen data. Therefore, the Random Forest Classifier was selected on the strength of its predictive performance.	en_US
dc.language.iso	en	en_US
dc.publisher	IIT	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	Binary Classification	en_US
dc.subject	Random Forest Classifier	en_US
dc.title	Crypto and Forex Trading Scam Tweets Detection System	en_US
dc.type	Thesis	en_US
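
Illustrative sketch (not part of the deposited record): the abstract describes cleaning tweets before modeling and aggregating the predictions of individual trees into a final label. The snippet below is a minimal sketch of those two steps under stated assumptions. The function names clean_tweet and majority_vote, the specific cleaning rules (lowercasing, stripping URLs, mentions, and punctuation), and the 0/1 label encoding are all hypothetical; the dissertation's actual pre-processing steps are not given in the abstract. Hard majority voting is the classic random forest aggregation rule; some libraries, scikit-learn among them, average per-tree class probabilities instead.

```python
import re
from collections import Counter

# Assumed cleaning rules; the abstract does not list the actual ones.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALNUM_RE.sub(" ", text)
    # Collapse the runs of whitespace left behind by the substitutions.
    return " ".join(text.split())


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees.

    Expects a non-empty sequence of labels, one per tree
    (hypothetical encoding: 1 = scam, 0 = not scam).
    """
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(clean_tweet("FREE Bitcoin!!! DM @trader_x https://t.co/abc"))
    print(majority_vote([1, 1, 0]))
```

With an odd number of trees and two classes, a tie cannot occur; with an even number, a tie-breaking rule would have to be chosen explicitly.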