Performance Analysis of XGBoost and Gaussian Naive Bayes for Early Classification of Hypertension

Authors

  • Ni Made Ochiana Septhi Pratiwi, Universitas Pendidikan Nasional
  • Adie Wahyudi Oktavia Gama, Universitas Pendidikan Nasional

DOI:

https://doi.org/10.55606/juitik.v6i1.2119

Keywords:

Early Detection, Gaussian Naive Bayes, Hypertension, Machine Learning, XGBoost

Abstract

Hypertension is one of the leading causes of premature death worldwide and often goes undetected because it produces few clinical symptoms, earning it the nickname “silent killer.” Artificial intelligence (AI), particularly Machine Learning, offers a strategic approach to early detection, but the main challenge is balancing diagnostic accuracy against detection sensitivity so that no at-risk patient is overlooked. This study compares the Extreme Gradient Boosting (XGBoost) algorithm, trained with a Cost-Sensitive strategy, against Gaussian Naive Bayes (GNB) as a baseline for hypertension risk classification. The dataset comprised 1,985 electronic medical records with 9 clinical attributes, and both models were evaluated using 10-Fold Cross-Validation. XGBoost consistently outperformed GNB across all evaluation metrics: it achieved an Accuracy of 92.19% and an AUC of 0.9752, whereas GNB obtained an Accuracy of 84.13%. Cost-Sensitive Learning in XGBoost proved effective in managing the trade-off between Recall and Precision, yielding a Recall of 91.26% and a Precision of 93.53%. Furthermore, Feature Importance analysis identified Blood Pressure History, Smoking Status, and Family History as the most dominant risk factors, in line with global medical guidelines. These results indicate that XGBoost is a more reliable and accurate method than the classical probabilistic approach for early hypertension detection systems.
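
The metrics reported in the abstract can be illustrated with a minimal Python sketch. It shows how Accuracy, Precision, and Recall follow from confusion-matrix counts, and how a positive-class weight might be derived for a cost-sensitive model. The function names and toy labels are hypothetical and are not the study's data or code. The negatives-to-positives ratio is a common heuristic for XGBoost's `scale_pos_weight` parameter; the abstract does not state which weighting the authors actually used.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # hypertensive patients correctly flagged
    fp: int  # non-hypertensive patients incorrectly flagged
    fn: int  # hypertensive patients missed by the model
    tn: int  # non-hypertensive patients correctly cleared


def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels (1 = hypertensive, 0 = not)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def accuracy(c):
    """Share of all predictions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c):
    """Share of flagged patients who are truly hypertensive."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c):
    """Share of truly hypertensive patients who were flagged."""
    actual_pos = c.tp + c.fn
    return c.tp / actual_pos if actual_pos else 0.0


def positive_class_weight(y_true):
    """Negatives-to-positives ratio (assumed cost-sensitive heuristic)."""
    positives = sum(1 for label in y_true if label == 1)
    if positives == 0:
        raise ValueError("no positive samples in y_true")
    return (len(y_true) - positives) / positives


# Toy labels for illustration only; not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts)
print("accuracy:", accuracy(counts))
print("precision:", precision(counts))
print("recall:", recall(counts))
print("positive-class weight:", positive_class_weight(y_true))
```

A weight above 1 makes each missed positive case cost more during training, which tends to raise Recall at some expense of Precision. This is the trade-off the abstract describes.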

Published

2026-02-14

How to Cite

Ni Made Ochiana Septhi Pratiwi, & Adie Wahyudi Oktavia Gama. (2026). Analisis Performa XGBoost dan Gaussian Naive Bayes untuk Klasifikasi Dini Penyakit Hipertensi. Jurnal Ilmiah Teknik Informatika Dan Komunikasi, 6(1), 549–563. https://doi.org/10.55606/juitik.v6i1.2119
