Implementation of Retrieval-Augmented Generation Method on Large Language Model for Development of Campus Service and Information Chatbot
DOI:
https://doi.org/10.35314/3y9hy151
Keywords:
Chatbot, Hallucination, Retrieval-Augmented Generation, Large Language Model, Hybrid Retrieval
Abstract
Large Language Models (LLMs) can improve the quality of information services in higher education through responsive, natural interactions. However, because of their knowledge cut-off, LLMs are prone to generating answers that are not supported by valid knowledge sources. This study implements Retrieval-Augmented Generation (RAG) on an LLM to build an information service chatbot for Universitas Sains dan Teknologi Indonesia (USTI). The RAG pipeline uses a hybrid retrieval mechanism that combines dense retrieval (FAISS) and sparse retrieval (BM25) through Reciprocal Rank Fusion (RRF), followed by cross-encoder reranking. The knowledge base is compiled from official public documents obtained from the USTI website. The evaluation used 13 test queries and compared several configurations to analyze the contribution of each component. The results show that the hybrid retrieval configuration achieves the best retrieval performance, with Precision@3 of 71.7%, Recall@3 of 87.5%, and NDCG@3 of 96.3%. In addition, applying RAG improved answer quality compared to the LLM without retrieval: BERTScore-F1 increased from 84.8% to 89.4%, and the faithfulness score reached 88.8%. These findings indicate that RAG integration makes LLM answers more relevant to the source documents, with the hybrid configuration providing the best balance between retrieval quality and faithfulness.
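To illustrate the fusion step named in the abstract, the following is a minimal sketch of Reciprocal Rank Fusion in Python. It is not the authors' implementation: the function name, the document IDs, and the two example rankings are hypothetical, and the constant k = 60 is a commonly used default that the abstract does not specify. Each document receives a score of 1 / (k + rank) from every ranked list in which it appears, and the scores are summed across lists.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=3):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: list of ranked lists, best document first (hypothetical input).
    k: smoothing constant; 60 is a common default, assumed here.
    top_n: number of fused results to return (3 mirrors the @3 metrics).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top document contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical rankings standing in for a dense (FAISS) and a sparse (BM25) retriever.
dense_ranking = ["doc_a", "doc_b", "doc_c", "doc_d"]
sparse_ranking = ["doc_c", "doc_a", "doc_e", "doc_b"]

fused_top3 = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused_top3)
```

In this toy example, documents ranked highly by both retrievers rise to the top of the fused list, while documents found by only one retriever fall behind. In a pipeline like the one described, the fused candidates would then be passed to the cross-encoder for reranking.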
License
Copyright (c) 2026 INOVTEK Polbeng - Seri Informatika

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

