
Using natural language processing to study homelessness longitudinally with electronic health record data subject to irregular observations

Chapman AB, Scharfstein DO, Montgomery AE, Byrne T, Suo Y, Effiong A, Velasquez T, Pettey W, Nelson RE
AMIA Annu Symp Proc

The Electronic Health Record (EHR) contains information about social determinants of health (SDoH) such as homelessness. Much of this information is contained in clinical notes and can be extracted using natural language processing (NLP). These data can provide valuable information for researchers and policymakers studying long-term housing outcomes for individuals with a history of homelessness. However, studying homelessness longitudinally in the EHR is challenging due to irregular observation times. In this work, we applied an NLP system to extract housing status for a cohort of patients in the US Department of Veterans Affairs (VA) over a three-year period. We then applied inverse intensity weighting to adjust for the irregularity of observations, which was used with generalized estimating equations to estimate the probability of unstable housing each day after entering a VA housing assistance program. Our methods generate unique insights into the long-term outcomes of individuals with a history of homelessness and demonstrate the potential for using EHR data for research and policymaking.
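
As a rough illustration of the weighting idea described above, and not the authors' implementation, the sketch below computes an inverse-intensity-weighted estimate of the probability of unstable housing on each day. It assumes that a visit-intensity value has already been estimated for every observation (in practice this would come from a fitted model of visit times, which is not shown). The function name, the tuple layout, and the toy numbers are hypothetical. With only a separate mean per day and an independence working correlation, the weighted estimating equation reduces to a weighted average, which is what the code computes.

```python
from collections import defaultdict

def iiw_daily_probability(observations):
    """Weighted daily probability of unstable housing.

    observations: iterable of (day, unstable, intensity) tuples, where
    `unstable` is 0 or 1 and `intensity` is an assumed, pre-estimated
    visit intensity for that observation (hypothetical input).
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for day, unstable, intensity in observations:
        if intensity <= 0:
            raise ValueError("visit intensity must be positive")
        weight = 1.0 / intensity  # frequently observed patients count less
        weighted_sum[day] += weight * unstable
        weight_total[day] += weight
    # Solves sum_i w_i * (y_i - p_day) = 0 for each day.
    return {day: weighted_sum[day] / weight_total[day]
            for day in sorted(weight_total)}

# Made-up toy data: (day since program entry, unstable flag, intensity).
toy_observations = [
    (30, 1, 2.0),  # seen often, so down-weighted
    (30, 0, 0.5),  # seen rarely, so up-weighted
    (60, 1, 1.0),
    (60, 0, 1.0),
]
probabilities = iiw_daily_probability(toy_observations)
```

On the toy data, the unweighted proportion on day 30 would be one half, while the weighted estimate is lower because the patient observed as unstable is also the one observed most often.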

Chapman AB, Scharfstein DO, Montgomery AE, et al. Using natural language processing to study homelessness longitudinally with electronic health record data subject to irregular observations. AMIA Annu Symp Proc. 2024;2023:894-903. PMID: 38222404

Resource type: Peer Reviewed Research
Outcomes: Process
Population: Veterans
Social Determinant of Health: Housing Stability
Study design: Other Study Design