An Indonesian Chatbot for Disease Diagnosis Using Retrieval-Augmented Generation

Muhammad Adrinta Abdurrazzaq; Edwin Lesmana Tjiong; Aulia Fasya; Michelle Hiu; Joses Tanuwidjaya

doi:10.35314/9nnkn955

Authors

Muhammad Adrinta Abdurrazzaq, Universitas Kalbis
Edwin Lesmana Tjiong, Universitas Kalbis
Aulia Fasya, Universitas Kalbis
Michelle Hiu, Universitas Kalbis
Joses Tanuwidjaya, Universitas Kalbis

Keywords:

Retrieval-Augmented Generation, GPT-OSS, Medical Chatbot, Information Retrieval, Hybrid Ranking

Abstract

The rapid advancement of Large Language Models (LLMs) has enabled their use in medical information systems, but hallucinations, domain mismatch, and the lack of a verified knowledge base remain significant challenges, particularly in low-resource languages such as Indonesian. This study introduces an Indonesian-language medical chatbot built on the open-source GPT-OSS-20B model and enhanced with a Retrieval-Augmented Generation (RAG) pipeline. The system combines semantic retrieval using jina-embeddings-v3, lexical re-ranking with the BM25 algorithm, and a lightweight Logistic Regression domain filter that screens queries first so that out-of-domain questions are not passed to the LLM. Evaluation on Indonesian medical articles and annotated patient-doctor conversations shows that the domain filter performs well on synthetic data but produces misclassifications on natural queries. A hybrid weighted reranker (FAISS L2 + BM25) performed best, with a Top-30 accuracy of 0.699. Black-box testing indicates that the system flow functions as designed, although response quality has not been validated by clinical experts. These findings suggest that RAG-based open-source LLMs can improve access to Indonesian-language medical information, but important limitations remain: the lack of clinical validation, potential errors in scraped data, and limited robustness of the domain filter.
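
To make the idea of a hybrid weighted reranker concrete, the following is a minimal sketch in plain Python of how L2 distances from a vector index and BM25 scores could be fused into one ranking. It is not the authors' implementation: the abstract does not state the weighting or normalization used, so the weight `alpha`, the min-max normalization, the BM25 parameters `k1` and `b`, and all function names are assumptions made for illustration. The hard-coded distances stand in for values that a FAISS L2 index would return.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document against the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rerank(l2_distances, lexical_scores, alpha=0.5):
    """Return candidate indices sorted by a weighted semantic + lexical score.

    alpha is a hypothetical weight: 1.0 uses only the semantic signal,
    0.0 uses only BM25.
    """
    # A smaller L2 distance means a closer match, so invert after scaling.
    semantic = [1.0 - d for d in min_max(l2_distances)]
    lexical = min_max(lexical_scores)
    combined = [alpha * s + (1 - alpha) * l for s, l in zip(semantic, lexical)]
    return sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)


# Toy candidates (already tokenized) and assumed L2 distances to the query.
docs = [
    ["demam", "batuk", "pilek"],
    ["nyeri", "dada", "sesak"],
    ["demam", "tinggi", "ruam", "kulit"],
]
query = ["demam", "ruam"]
l2_distances = [0.9, 1.4, 0.6]

ranking = hybrid_rerank(l2_distances, bm25_scores(query, docs), alpha=0.5)
print(ranking)
```

In this sketch both signals are rescaled to the same range before mixing, so that neither the distance scale nor the BM25 scale dominates by accident. Other fusion schemes, such as reciprocal rank fusion, would be equally plausible readings of "hybrid weighted".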