Using applied machine learning to predict healthcare utilization based on socioeconomic determinants of care
Am J Manag Care
OBJECTIVES: To determine whether it is possible to risk-stratify avoidable utilization without clinical data and with limited patient-level data.
STUDY DESIGN: The aim of this study was to demonstrate the influence of socioeconomic determinants of health (SDH) on avoidable patient-level healthcare utilization. The study investigated the ability of machine learning models to predict risk using only publicly available and purchasable SDH data. A total of 138,115 patients were analyzed from a deidentified database representing 3 health systems in the United States.
METHODS: A hold-out methodology was used to ensure that the model's performance could be tested on a completely independent set of subjects. A proprietary decision tree methodology was used to make the predictions. Only the socioeconomic features, age group, gender, and race were used in the prediction of a patient's risk of admission.
RESULTS: The decision tree-based machine learning approach analyzed in this study was able to predict inpatient and emergency department utilization with a high degree of discrimination using only purchasable and publicly available data on SDH.
CONCLUSIONS: This study indicates that it is possible to stratify patients by risk of utilization without interacting with the patient or collecting information beyond the patient's age, gender, race, and address. This application has broad implications and could positively affect health systems by facilitating targeted patient outreach with specific, individualized interventions that address detrimental SDH at both the individual and the neighborhood level.
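As a rough illustration of the hold-out evaluation described under METHODS, the sketch below splits patients into disjoint training and test sets and scores discrimination on the held-out set. It is not the study's proprietary decision tree method. The function names `holdout_split` and `auc`, the 30% test fraction, and the use of the area under the ROC curve as the discrimination measure are all assumptions made for this example; the abstract does not state which metric or split ratio was used.

```python
import random


def holdout_split(patient_ids, test_fraction=0.3, seed=0):
    """Split unique patient IDs into disjoint (train, test) sets.

    Hypothetical helper: the 30% default is an assumption, not the study's ratio.
    """
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5).

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Synthetic IDs only; no real patient data is involved.
    train_ids, test_ids = holdout_split(range(1000))
    print(len(train_ids), len(test_ids), train_ids.isdisjoint(test_ids))
    # Toy labels (1 = admitted) and risk scores for four held-out patients.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A model would be fit on records belonging to `train_ids` only, and `auc` would then be computed on predictions for `test_ids`. Splitting by patient ID rather than by record keeps the same subject from appearing on both sides, which is what makes the test set independent.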
Chen S, Bergman D, Miller K, Kavanagh A, Frownfelter J, Showalter J. Using applied machine learning to predict healthcare utilization based on socioeconomic determinants of care. Am J Manag Care. 2020;26(1):26–31. PMID: 31951356.