Assessing performance of ZCTA-level and Census Tract-level social and environmental risk factors in a model predicting hospital events





Citation of Original Publication

Goetschius, Leigh G., et al. "Assessing performance of ZCTA-level and Census Tract-level social and environmental risk factors in a model predicting hospital events." Social Science & Medicine 326, 115943 (06 May 2023).


This item is likely protected under Title 17 of the U.S. Copyright Law. Unless the item is under a Creative Commons license, contact the copyright holder or the author for uses protected by Copyright Law.
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Access to this item will begin on 05/06/2026.



Predictive analytics are used in primary care to efficiently direct health care resources to high-risk patients to prevent unnecessary health care utilization and improve health. Social determinants of health (SDOH) are important features in these models, but they are poorly measured in administrative claims data. Area-level SDOH can be proxies for unavailable individual-level indicators, but the extent to which the granularity of risk factors affects predictive models is unclear. We examined whether increasing the granularity of area-based SDOH features from ZIP code tabulation area (ZCTA) to Census Tract strengthened an existing clinical prediction model for avoidable hospitalizations (AH events) in Maryland Medicare fee-for-service beneficiaries. We created a person-month dataset for 465,749 beneficiaries (59.4% female; 69.8% White; 22.7% Black) with 144 features indexing medical history and demographics using Medicare claims (September 2018 through July 2021). Claims data were linked with 37 SDOH features associated with AH events from 11 publicly available sources (e.g., American Community Survey) based on the beneficiaries’ ZCTA and Census Tract of residence. Individual AH risk was estimated using six discrete-time survival models with different combinations of demographic, condition/utilization, and SDOH features. Each model used stepwise variable selection to retain only meaningful predictors. We compared model fit, predictive performance, and interpretation across models. Results showed that increasing the granularity of area-based risk factors did not dramatically improve model fit or predictive performance. However, it did affect model interpretation by altering which SDOH features were retained during variable selection. Further, the inclusion of SDOH at either granularity level meaningfully reduced the risk that was attributed to demographic predictors (e.g., race, dual eligibility for Medicaid).
Differences in interpretation are critical given that this model is used by primary care staff to inform the allocation of care management resources, including those available to address drivers of health beyond the bounds of traditional health care.
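
The person-month structure behind a discrete-time survival analysis can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy records, and the `poverty_rate` feature are hypothetical and invented for illustration. The sketch expands one record per person into one row per month at risk, attaches area-level features by ZCTA, and computes the unadjusted monthly hazard (events divided by persons at risk). A full model of the kind described above would instead regress the event indicator on covariates, commonly with logistic regression on the person-month rows, and would apply variable selection.

```python
from collections import defaultdict


def expand_person_months(people):
    """Turn one record per person into one row per month at risk.

    Each input record is (person_id, zcta, months_observed, had_event).
    The event flag is 1 only in the final observed month of a person
    who had an event; all other person-months are 0.
    """
    rows = []
    for person_id, zcta, months_observed, had_event in people:
        for month in range(1, months_observed + 1):
            event = int(had_event and month == months_observed)
            rows.append(
                {"person_id": person_id, "zcta": zcta, "month": month, "event": event}
            )
    return rows


def attach_area_features(rows, area_features):
    """Left-join area-level features onto person-month rows by ZCTA."""
    return [{**row, **area_features.get(row["zcta"], {})} for row in rows]


def discrete_hazard(rows):
    """Unadjusted hazard per month: events divided by persons at risk."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for row in rows:
        at_risk[row["month"]] += 1
        events[row["month"]] += row["event"]
    return {month: events[month] / at_risk[month] for month in sorted(at_risk)}


# Toy data, invented for illustration only (not drawn from the study).
people = [
    ("a", "21201", 3, True),   # event in month 3
    ("b", "21201", 3, False),  # censored after month 3
    ("c", "21044", 2, True),   # event in month 2
    ("d", "21044", 1, False),  # censored after month 1
]
area_features = {
    "21201": {"poverty_rate": 0.25},  # hypothetical SDOH feature
    "21044": {"poverty_rate": 0.05},
}

rows = attach_area_features(expand_person_months(people), area_features)
hazard = discrete_hazard(rows)
```

Switching the join key from ZCTA to Census Tract changes only the lookup in `attach_area_features`, which mirrors the comparison of granularity levels described above: the person-month rows stay the same while the linked area-level values differ.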