Detecting AI-Generated Text with User-Centric Explainability

Gunarathne, Rusini

dc.contributor.author	Gunarathne, Rusini
dc.date.accessioned	2026-03-24T09:07:26Z
dc.date.available	2026-03-24T09:07:26Z
dc.date.issued	2025
dc.identifier.citation	Gunarathne, Rusini (2025) Detecting AI-Generated Text with User-Centric Explainability. BSc. Dissertation, Informatics Institute of Technology	en_US
dc.identifier.issn	20200205
dc.identifier.uri	http://dlib.iit.ac.lk/xmlui/handle/123456789/3054
dc.description.abstract	The rapid advancement of large language models has made AI-generated text increasingly difficult to distinguish from human-written content, raising concerns around content authenticity, misuse, and trust. Existing detection systems often function as “black boxes,” offering limited interpretability and minimal insight into how predictions are made. This lack of transparency undermines user confidence, especially in domains where explainability is essential. To address these challenges, this study introduces AletheiaAI, a novel AI-generated text detection system designed not only to classify text as human- or AI-generated but also to provide clear, user-centric explanations that enhance interpretability and trust. AletheiaAI leverages the OPT-1.3B model with Low-Rank Adaptation (LoRA), which enables efficient fine-tuning by freezing most parameters during training. Text is preprocessed through tokenization, stopword filtering, and cleaning before being transformed into embeddings for binary classification. To support explainability, the system incorporates a dual-explanation framework: LIME provides token-level visual explanations, while Mistral-7B-Instruct generates natural-language justifications to clarify the reasoning behind each prediction. The model was trained on monolingual English data from the M4 dataset, covering sources such as Wikipedia, Reddit, WikiHow, and arXiv abstracts, using a 70/20/10 train-validation-test split. Performance was evaluated using standard metrics: accuracy, precision, recall, and F1 score. Initial results show that AletheiaAI achieves strong detection performance: 94.49% accuracy, with precision, recall, and F1 score each at 94%. The combined use of visual and textual explanations proved effective in helping users understand model decisions, thereby strengthening transparency and trust.
These findings highlight AletheiaAI’s potential as a practical, responsible solution for AI-generated text detection, particularly in contexts requiring both high accuracy and interpretability.	en_US
dc.language.iso	en	en_US
dc.subject	AI Text Detection	en_US
dc.subject	Explainable AI	en_US
dc.subject	Free Text Explanations	en_US
dc.subject	User Trust	en_US
dc.title	Detecting AI-Generated Text with User-Centric Explainability	en_US
dc.type	Thesis	en_US