Performance Analysis of XGBoost and Gaussian Naive Bayes for Early Classification of Hypertension
DOI:
https://doi.org/10.55606/juitik.v6i1.2119
Keywords:
Early Detection, Gaussian Naive Bayes, Hypertension, Machine Learning, XGBoost
Abstract
Hypertension is one of the leading causes of premature death worldwide and often goes undetected because it produces few clinical symptoms, earning it the nickname “silent killer.” Applying artificial intelligence (AI), particularly Machine Learning, is a strategic approach to early detection, but the main challenge lies in balancing overall diagnostic accuracy with detection sensitivity so that no at-risk patients are overlooked. This study analyzes and compares the performance of the Extreme Gradient Boosting (XGBoost) algorithm combined with a Cost-Sensitive strategy against Gaussian Naive Bayes (GNB), used as a baseline, in hypertension risk classification. The dataset comprised 1,985 electronic medical records with 9 clinical attributes, and the models were evaluated using 10-Fold Cross-Validation to assess model validity. The results showed that XGBoost consistently outperformed GNB across all evaluation metrics. XGBoost achieved an Accuracy of 92.19% and an AUC of 0.9752, well above GNB, which obtained an Accuracy of 84.13%. Cost-Sensitive Learning in XGBoost proved effective in mitigating the trade-off between accuracy and sensitivity, yielding a Recall of 91.26% and a Precision of 93.53%. Furthermore, Feature Importance analysis identified Blood Pressure History, Smoking Status, and Family History as the most dominant risk factors, which is in line with global medical guidelines. Based on these results, the study concludes that XGBoost is more reliable and accurate than the classical probabilistic approach for use in early detection systems for hypertension.
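The abstract reports Accuracy, Precision, and Recall and refers to a Cost-Sensitive strategy without stating how the class costs were set. The following is a minimal sketch of how these metrics follow from a binary confusion matrix, together with one common heuristic for a positive-class weight (the ratio of negatives to positives, often used as a starting value for XGBoost's `scale_pos_weight` parameter). The function names and the toy labels are illustrative assumptions. They are not the study's code or data, and the study may have used a different weighting scheme.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = hypertensive."""
    pairs = Counter(zip(y_true, y_pred))
    return pairs[(1, 1)], pairs[(0, 1)], pairs[(0, 0)], pairs[(1, 0)]


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (sensitivity): share of truly hypertensive patients that are caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def positive_class_weight(y_true):
    """Heuristic cost for the positive class: negatives divided by positives.

    Assumption: this is one common choice, not necessarily the one used
    in the study.
    """
    counts = Counter(y_true)
    if counts[1] == 0:
        raise ValueError("no positive samples in y_true")
    return counts[0] / counts[1]


# Toy labels (hypothetical, for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
weight = positive_class_weight(y_true)
print(metrics)
print(weight)
```

Under 10-Fold Cross-Validation, metrics of this kind would typically be computed on each held-out fold and then averaged. Raising the positive-class weight makes missed hypertensive cases more expensive during training, which tends to raise Recall at some cost to Precision.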
License
Copyright (c) 2026 Jurnal Ilmiah Teknik Informatika dan Komunikasi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.














