Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

12-2023

Abstract

In the national battle against COVID-19, harnessing population-level big data is imperative, enabling authorities to devise effective care policies, allocate healthcare resources efficiently, and enact targeted interventions. Singapore adopted the Home Recovery Programme (HRP) in September 2021, diverting low-risk COVID-19 patients to home care to ease hospital burdens amid high vaccination rates and mild symptoms. While a patient's suitability for HRP could be assessed using broad-based criteria, integrating a machine learning (ML) model becomes invaluable for identifying high-risk patients prone to severe illness, facilitating early medical assessment. Most prior studies have depended on clinical and laboratory data, necessitating initial clinic or hospital evaluations. None of these studies incorporated vaccination status, a crucial variable in a well-vaccinated population. This paper proposes a machine learning approach to nationwide risk stratification, offering intervention recommendations by harnessing nationwide datasets. Our best-performing ML model, XGBoost, achieves an AUROC of 0.930 using data from multiple sources, including patients' demographic information, vaccination status and medical history. For broader applicability, we also propose a parsimonious XGBoost model that achieves an AUROC of 0.885 with a selection of five commonly collected variables, namely age, number of vaccine doses taken, and number of days since the first, second and booster doses. Importantly, both of our proposed models achieve robust predictive performance without requiring the collection of clinical or laboratory data from patients. We believe that the parsimonious model, leveraging easily attainable data, has the potential for broader adoption across diverse nations, ultimately delivering paramount value to their populations.
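
As an illustration of the five variables named in the abstract, the following minimal sketch derives them from a patient record. The record layout, the field names, the choice of reference date and the use of `None` for doses not taken are all assumptions made here for illustration; the abstract does not describe how the authors encode these inputs.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientRecord:
    # Hypothetical record layout; field names are illustrative, not from the paper.
    age: int
    first_dose: Optional[date] = None
    second_dose: Optional[date] = None
    booster_dose: Optional[date] = None


def days_since(dose: Optional[date], reference: date) -> Optional[int]:
    """Days from a dose to the reference date, or None if the dose was not taken."""
    return None if dose is None else (reference - dose).days


def build_features(patient: PatientRecord, reference: date) -> dict:
    """Build the five variables listed in the abstract for one patient."""
    doses = [patient.first_dose, patient.second_dose, patient.booster_dose]
    return {
        "age": patient.age,
        "num_doses": sum(d is not None for d in doses),
        "days_since_first": days_since(patient.first_dose, reference),
        "days_since_second": days_since(patient.second_dose, reference),
        # Assumption: a missing dose is left as None rather than imputed.
        "days_since_booster": days_since(patient.booster_dose, reference),
    }


# Example: a two-dose patient assessed on a hypothetical reference date.
patient = PatientRecord(age=67, first_dose=date(2021, 3, 1), second_dose=date(2021, 3, 22))
features = build_features(patient, reference=date(2021, 10, 1))
print(features)
```

Leaving a missing dose as `None` is only one possible encoding; a model trained on such features would need either native missing-value handling or an explicit imputation step.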

Keywords

Predictive analytics, Public health, Decision-support, Risk stratification, COVID-19 pandemic.

Discipline

Databases and Information Systems | Health Information Technology | Public Health

Research Areas

Data Science and Engineering

Publication

2023 IEEE International Conference on Big Data: Sorrento, Italy, December 15-18: Proceedings

First Page

1

Last Page

9

ISBN

9798350324457

Identifier

10.1109/BigData59044.2023.10386378

Publisher

IEEE

City or Country

Piscataway, NJ

Copyright Owner and License

Authors

Creative Commons License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

Comments

Paper accepted in the "6th Special Session on Healthcare Data".

Additional URL

https://doi.org/10.1109/BigData59044.2023.10386378
