K-Means-Based Patient Segmentation from Vital Signs and Demographics: An Unsupervised Learning Approach for Clinical Risk Profiling
DOI:
https://doi.org/10.55606/juisik.v5i3.1811
Keywords:
Clinical Risk Profile, K-Means Clustering, Patient Segmentation, Unsupervised Learning, Vital Signs
Abstract
Digital transformation of the health system demands more strategic use of clinical data to support evidence-based decision-making. This study explores the application of the K-Means clustering algorithm to patient segmentation based on a combination of vital signs (systolic and diastolic blood pressure) and demographic characteristics (age, weight, and gender). Data on 1,401 outpatients were obtained from the medical record system of a hospital in Indonesia and then processed through preprocessing, standardization, and dimensionality reduction using PCA. The elbow method indicated that the optimal number of clusters was 3 (k=3). Descriptive analysis showed that Cluster 0 consisted entirely (100%) of women with normal blood pressure (124/77 mmHg) and an average body weight of 55.6 kg; Cluster 1 consisted mostly of women with high blood pressure (160.8/98.8 mmHg); and Cluster 2 consisted entirely (100%) of men with blood pressure tending toward pre-hypertension (130.1/80.7 mmHg). PCA visualizations showed fairly clear cluster separation, with Cluster 1 having the most clinically distinct characteristics. The study concludes that the K-Means-based unsupervised learning approach is effective in identifying latent risk patterns in patient populations and has the potential to support clinical risk mapping and preventive health policies. Recommendations for future work include integrating this method into EMR systems and extending the study to national datasets.
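The standardization, K-Means, and elbow steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, omits the PCA step, and runs on synthetic data whose group centres are invented and only loosely echo the cluster means reported above. The function names (`synthetic_patients`, `standardize`, `kmeans`), the 0/1 gender encoding, and all spreads and sample sizes are assumptions made for illustration. In practice a library such as scikit-learn would typically be used instead.

```python
import random
from statistics import mean, pstdev


def synthetic_patients(n_per_group=60, seed=42):
    """Generate made-up rows: [systolic, diastolic, age, weight, gender]."""
    rng = random.Random(seed)
    # Hypothetical group centres; gender is encoded 0/1 as an assumption.
    centres = [(124, 77, 45, 56, 0), (161, 99, 60, 62, 0), (130, 81, 50, 68, 1)]
    rows = []
    for sys_c, dia_c, age_c, wt_c, sex in centres:
        for _ in range(n_per_group):
            rows.append([rng.gauss(sys_c, 6), rng.gauss(dia_c, 4),
                         rng.gauss(age_c, 8), rng.gauss(wt_c, 5), float(sex)])
    return rows


def standardize(rows):
    """Z-score each column: (x - column mean) / column standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(c) for c in zip(*members)]
    # Inertia: within-cluster sum of squared distances.
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


patients = standardize(synthetic_patients())
# Elbow method: compute inertia for several k and look for the bend.
inertias = {k: kmeans(patients, k)[2] for k in range(1, 7)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

Standardizing first matters because blood pressure, age, and weight are on different scales, and K-Means relies on Euclidean distance. The elbow is read from the printed inertias as the value of k beyond which further increases yield only small reductions.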
License
Copyright (c) 2025 Jurnal ilmiah Sistem Informasi dan Ilmu Komputer

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.







