Abstract:
This dissertation proposes and implements Explainable AI (XAI) techniques for spam email and SMS detection using deep learning models. The primary goal is to enhance the transparency, interpretability, and trustworthiness of these models while maintaining high accuracy and compliance with ethical and regulatory standards. By integrating attention mechanisms, the research aims to provide clear, understandable explanations for the decisions made by complex AI systems.
The study begins with an extensive literature review to identify the current challenges in spam detection and the limitations of existing AI models in terms of explainability. The proposed solution involves the use of Bidirectional Long Short-Term Memory (BiLSTM) networks, enhanced with attention mechanisms, to focus on the most relevant parts of the text that contribute to the classification decisions.
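The attention step described above can be sketched in a few lines. This is a minimal, illustrative NumPy mock-up of additive (Bahdanau-style) attention pooling over BiLSTM hidden states, not the dissertation's actual model: the hidden states and the learned parameters `W` and `v` are random stand-ins, and the shapes are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for BiLSTM outputs: T time steps (tokens), each a 2*d-dim
# hidden state (forward and backward directions concatenated).
T, d = 5, 8
H = rng.normal(size=(T, 2 * d))

# Learned attention parameters (random here, for illustration only).
W = rng.normal(size=(2 * d, 2 * d))   # projection matrix
v = rng.normal(size=(2 * d,))         # scoring vector

# Score each time step, then normalise with a softmax.
scores = np.tanh(H @ W) @ v           # shape (T,): per-token relevance
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                  # attention weights, sum to 1

# Weighted sum of hidden states: the context vector fed to the classifier.
context = alpha @ H                   # shape (2*d,)
```

The weights `alpha` are what makes the model interpretable: each entry says how strongly one token influenced the classification, so highlighting high-weight tokens yields a per-message explanation.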
A detailed methodology section outlines the process of data collection, preprocessing, and model training. Comprehensive testing and evaluation of the model are conducted using various performance metrics such as accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). The results show significant improvements in these metrics, demonstrating the model's effectiveness in accurately classifying spam messages.
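For concreteness, the threshold-based metrics above can be computed directly from a confusion matrix. The labels below are made-up toy values, not results from the study; AUC-ROC is omitted here because it is computed from ranked prediction scores rather than hard labels.

```python
# Toy ground truth and predictions: 1 = spam, 0 = ham (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Confusion-matrix counts.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)            # overall correctness
precision = tp / (tp + fp)                    # flagged spam that is spam
recall = tp / (tp + fn)                       # spam that was caught
f1 = 2 * precision * recall / (precision + recall)
```

In spam filtering, precision and recall pull in opposite directions: high precision keeps legitimate mail out of the spam folder, while high recall keeps spam out of the inbox, which is why the F1-score and AUC-ROC are reported alongside raw accuracy.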
Additionally, the attention mechanism provides valuable insights into the decision-making
process, making the model's predictions more transparent and understandable.
Overall, this dissertation makes significant contributions to the field of Explainable AI in
cybersecurity, particularly in the context of spam detection. The integration of attention
mechanisms with deep learning models not only improves classification accuracy but also
enhances the interpretability of AI systems, fostering trust and adoption in real-world applications. The findings offer valuable insights for researchers and practitioners aiming to develop more transparent and reliable AI systems for spam detection and beyond.