Abstract:
"Undergraduate dropout is one of the biggest concerns in higher education institutes, both locally and globally. Student retention has gained increasing attention from university administrators, especially in private-sector higher education institutes, where competition is high. The main objective of this research is to predict undergraduate dropouts in the Information Technology degree program of a non-state higher education institution in Sri Lanka.
Logistic Regression, Random Forest, Naïve Bayes, Artificial Neural Network (ANN), Decision Tree, and Support Vector Machine (SVM) classification techniques were used for the prediction. According to the results, SVM achieved the best F1 score at 90%; ANN, Decision Tree, and Logistic Regression each scored 88%; and Random Forest and Naïve Bayes each scored 87%. Dropout was also found to be high among students who completed their Advanced Level in the Art Stream and among those under the Other category.
Therefore, if the faculty gives applicants from those categories an aptitude test before they register and selects the suitable candidates, this could help reduce dropouts. Data mining techniques can improve the quality of education in non-state higher education institutes because they help identify hidden patterns in education-related data."