Implementation of a Retrieval-Augmented Generation (RAG)-Based Health Chatbot System with an Indonesian-Language Medical Dataset

Gusti Ayu Purna Savitri; Adie Wahyudi Oktavia Gama

doi:10.55606/juisik.v6i1.2240

Authors

Gusti Ayu Purna Savitri, Universitas Pendidikan Nasional
Adie Wahyudi Oktavia Gama, Universitas Pendidikan Nasional

Keywords:

Digital Health, Expert System, Health Chatbot, Indonesian, Retrieval-Augmented Generation

Abstract

Accurate, easy-to-understand health information in Indonesian remains hard to find in the digital era. People tend to rely on unverified sources, which can fuel the spread of health misinformation. This research develops a Retrieval-Augmented Generation (RAG)-based health chatbot that provides structured medical responses with traceable references. The system is implemented using the Qwen 2.5-7B-Instruct large language model, a FAISS vector index, and a dataset of health questions and answers in Indonesian. The architecture is designed to understand natural-language health queries, generate evidence-based responses, and include source links for independent verification. Testing shows that the system answers common health questions by drawing on trusted sources, applies guardrails in the form of clinical disclaimers and filters for out-of-domain queries, and achieves response times adequate for its role as an initial health information assistant. The system has been deployed as a web application and could be developed further as a component of Indonesia's digital health ecosystem to improve public health literacy and reduce reliance on non-medical information sources.
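
The retrieve-then-answer flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a bag-of-words cosine similarity stands in for the embedding model and FAISS index, the generation step by Qwen 2.5-7B-Instruct is replaced by a placeholder that returns the retrieved answer, and the knowledge base entries, the `example.org` links, the `MIN_SCORE` threshold, and all function names are hypothetical. The out-of-domain filter is assumed here to be a simple similarity cutoff, which the source does not confirm.

```python
import math
from collections import Counter

# Hypothetical mini knowledge base (Indonesian Q&A pairs with source links).
# The paper's actual dataset and sources are not reproduced here.
KNOWLEDGE_BASE = [
    {"question": "apa gejala demam berdarah",
     "answer": "Gejala umum meliputi demam tinggi, nyeri otot, dan ruam.",
     "source": "https://example.org/demam-berdarah"},
    {"question": "bagaimana cara mencegah diabetes",
     "answer": "Jaga pola makan, rutin berolahraga, dan periksa gula darah.",
     "source": "https://example.org/diabetes"},
]

# Guardrail texts (assumed wording).
DISCLAIMER = "Informasi ini bukan pengganti konsultasi dengan tenaga medis."
OUT_OF_DOMAIN = "Maaf, pertanyaan ini berada di luar domain kesehatan."
MIN_SCORE = 0.2  # assumed similarity cutoff for the out-of-domain filter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, k=1):
    """Return the k most similar entries as (score, entry) pairs.

    A real system would query a vector index such as FAISS instead of
    scanning every entry.
    """
    q = embed(query)
    scored = [(cosine(q, embed(e["question"])), e) for e in KNOWLEDGE_BASE]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def answer(query):
    """Apply the out-of-domain filter, then build a response with sources."""
    hits = retrieve(query)
    if not hits or hits[0][0] < MIN_SCORE:
        return {"text": OUT_OF_DOMAIN, "sources": []}
    context = " ".join(entry["answer"] for _, entry in hits)
    # Placeholder for generation: a real system would pass the query and
    # retrieved context to the language model here.
    return {"text": f"{context}\n\n{DISCLAIMER}",
            "sources": [entry["source"] for _, entry in hits]}
```

Calling `answer("gejala demam berdarah")` returns the retrieved answer followed by the disclaimer, together with its source link, while an unrelated query such as `answer("harga saham hari ini")` falls below the cutoff and yields the out-of-domain message with no sources.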
