Custom NER model for pandemic outbreak surveillance using Twitter

Jayathilake, Hansinie Madushika

Home
→
Dissertations & Thesis
→
MSc Bigdata Analytics
→
2021
→
View Item

Custom NER model for pandemic outbreak surveillance using Twitter

Jayathilake, Hansinie Madushika

URI: http://dlib.iit.ac.lk/xmlui/handle/123456789/764

Date: 2021

Abstract:

Pandemic outbreaks have become the most significant discussion all over the world at the time of writing this report. Pandemics can cause serious public health consequences if no action was taken in advance, since outbreaks are out of control once stepped out. Hence, getting early awareness about outbreaks is essential. Therefore, outbreak surveillance is highly important. Nowadays people are used to rely more on social media for receiving and reporting urgent public health matters especially pandemics outbreaks. Twitter is playing a huge role in this regard. Therefore, this study was conducted to develop a pandemic outbreak surveillance system with aid of natural language processing and deep learning techniques by using Twitter data. In this study Twitter data was extracted and applied text preprocessing techniques for model training. Training and test data were annotated with label “Pandemic” by a custom annotator written by the author. This study trained two custom named entity recognition (NER) models to predict pandemic named entities using Python spaCy library which is based on deep neural network architecture. These models were trained with two optimizers such as Stochastic Gradient Descent (SGD) and Adaptive Moment Estimation (Adam) algorithms. Confusion matrix was used to evaluate the model performance. Based on the results, Adam optimizer based NER model with 0.9691 precision was selected as the best model for the study. Using python web framework FastAPI, the model was deployed for a web-based surveillance dashboard to visualize the country wise results in map and chart views based on user input parameters.

Show full item record