Digital Repository

Enhancing Academic Assistance with a Novel Retrieval-Augmented Generation Chatbot: A Solution to Hallucination in Large Language Models


dc.contributor.author Nanayakkara, Shadheera
dc.date.accessioned 2026-03-24T06:46:48Z
dc.date.available 2026-03-24T06:46:48Z
dc.date.issued 2025
dc.identifier.citation Nanayakkara, Shadheera (2025) Enhancing Academic Assistance with a Novel Retrieval-Augmented Generation Chatbot: A Solution to Hallucination in Large Language Models. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20200147
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/3046
dc.description.abstract Large Language Models have reshaped natural language processing through their ability to produce fluent, human-like answers. However, their tendency to hallucinate (producing plausible but factually inaccurate content) poses a serious risk in the academic sphere, where accuracy and reliability are essential. Moreover, traditional rule-based or intent-based educational chatbots lack the necessary flexibility and fail to support complex, context-dependent student requests. This deficit leaves a wide gap in reliable, dynamic academic support for university learners. This project proposes EdRAG, a chatbot based on Retrieval-Augmented Generation that employs a hybrid retrieval system with two channels, combining structured knowledge triples with unstructured lecture content. Dense vector search and sparse BM25 retrieval run in parallel to maximise semantic and lexical coverage. The retrieved material is then passed to a memory-aware prompt construction module, which encodes recent conversational context to maintain coherence. To ensure academic relevance, two domain-specific datasets were developed: EdCyberQ, an open-ended question-answering dataset, and the EdRAG knowledge triples. The evaluation measures are BERTScore (Precision: 0.8920, Recall: 0.9168, F1: 0.9042) and RAGAS metrics (Faithfulness: 0.8576, Answer Relevancy: 0.8486, Context Precision: 0.8147, Context Recall: 0.8641). Comparisons with GPT-4o in the same retrieval pipeline show that LLaMA 3.3 70B outperforms GPT-4o on both factual grounding and answer relevance, despite being an open-source model. These findings indicate that EdRAG is an effective way to reduce hallucinations and provide precise academic assistance, offering a domain-generalizable solution for AI-enhanced learning environments. en_US
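The hybrid retrieval the abstract describes (sparse BM25 and dense vector search run in parallel, with the two rankings merged) can be sketched minimally as follows. This is an illustrative assumption, not the dissertation's actual pipeline: the toy corpus, the bag-of-words cosine stand-in for dense embeddings, and the choice of reciprocal rank fusion as the merging step are all hypothetical.

```python
import math
from collections import Counter

# Toy corpus standing in for lecture-note chunks (illustrative data only).
DOCS = [
    "retrieval augmented generation grounds answers in retrieved context",
    "bm25 is a sparse lexical ranking function over term frequencies",
    "dense vector search compares embeddings with cosine similarity",
]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [d.split() for d in docs]
    n = len(docs)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter()                      # document frequency per term
    for t in tokenized:
        df.update(set(t))
    scores = []
    for t in tokenized:
        tf = Counter(t)
        s = 0.0
        for term in query.split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(t) / avgdl))
        scores.append(s)
    return scores

def dense_scores(query, docs):
    """Stand-in for embedding search: cosine over bag-of-words vectors.
    A real system would use a neural embedding model here."""
    def cos(a, b):
        dot = sum(a[w] * b[w] for w in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0
    q = Counter(query.split())
    return [cos(q, Counter(d.split())) for d in docs]

def hybrid_rank(query, docs, k=60):
    """Merge the two channels with reciprocal rank fusion (RRF)."""
    def ranks(scores):
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        return {i: r for r, i in enumerate(order)}
    sparse_r = ranks(bm25_scores(query, docs))
    dense_r = ranks(dense_scores(query, docs))
    fused = {i: 1 / (k + sparse_r[i]) + 1 / (k + dense_r[i])
             for i in range(len(docs))}
    return sorted(fused, key=fused.get, reverse=True)

print(hybrid_rank("dense cosine similarity search", DOCS))  # → [2, 0, 1]
```

Running both channels and fusing their ranks lets a lexical match rescue queries the embedding misses and vice versa, which is the coverage argument the abstract makes for combining semantic and lexical retrieval.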
dc.language.iso en en_US
dc.subject Retrieval Augmented Generation en_US
dc.subject Search Mechanisms en_US
dc.subject Language Models en_US
dc.title Enhancing Academic Assistance with a Novel Retrieval-Augmented Generation Chatbot: A Solution to Hallucination in Large Language Models en_US
dc.type Thesis en_US

