Using natural language processing to study homelessness longitudinally with electronic health record data subject to irregular observations
AMIA Annu Symp Proc
The Electronic Health Record (EHR) contains information about social determinants of health (SDoH) such as homelessness. Much of this information is contained in clinical notes and can be extracted using natural language processing (NLP). These data can provide valuable information for researchers and policymakers studying long-term housing outcomes for individuals with a history of homelessness. However, studying homelessness longitudinally in the EHR is challenging due to irregular observation times. In this work, we applied an NLP system to extract housing status for a cohort of patients in the US Department of Veterans Affairs (VA) over a three-year period. We then applied inverse intensity weighting to adjust for the irregularity of observations and used the resulting weights in generalized estimating equations to estimate the probability of unstable housing on each day after entering a VA housing assistance program. Our methods generate unique insights into the long-term outcomes of individuals with a history of homelessness and demonstrate the potential for using EHR data for research and policymaking.
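The weighting idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each observation already carries an estimated relative visit intensity (in practice this would come from a fitted visit-process model, which is omitted here), and it replaces the generalized estimating equations with the simplest special case: a weighted proportion computed separately for each day. The function name `iiw_unstable_probability` and the toy data are hypothetical.

```python
from collections import defaultdict


def iiw_unstable_probability(observations):
    """Inverse-intensity-weighted proportion of unstable housing per day.

    observations: iterable of (day, unstable, relative_intensity) tuples, where
    unstable is 1 (unstably housed) or 0, and relative_intensity is an assumed,
    pre-estimated relative rate of being observed at that time.
    """
    weighted_unstable = defaultdict(float)
    total_weight = defaultdict(float)
    for day, unstable, relative_intensity in observations:
        if relative_intensity <= 0:
            raise ValueError("relative intensity must be positive")
        # Observations from frequently observed patients are down-weighted.
        weight = 1.0 / relative_intensity
        weighted_unstable[day] += weight * unstable
        total_weight[day] += weight
    return {
        day: weighted_unstable[day] / total_weight[day]
        for day in sorted(total_weight)
    }


# Hypothetical toy data: (days since program entry, unstable flag, relative intensity).
toy = [
    (30, 1, 4.0),
    (30, 1, 4.0),
    (30, 0, 1.0),
    (60, 0, 2.0),
    (60, 1, 2.0),
]
estimates = iiw_unstable_probability(toy)
```

In the toy data, the two unstable observations on day 30 come from a high-intensity visit pattern, so the unweighted proportion (2/3) is pulled down to 1/3 after weighting. On day 60 all intensities are equal, so weighting leaves the proportion at 0.5.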
Chapman AB, Scharfstein DO, Montgomery AE, et al. Using natural language processing to study homelessness longitudinally with electronic health record data subject to irregular observations. AMIA Annu Symp Proc. 2024;2023:894-903. PMID: 38222404