Utilizing Latent Space Representation for Disease Phenotyping and Patient Risk Stratification
Date
2025-04-25
Department
Hood College Computer Science and Information Technology
Program
Hood College Departmental Honors
Abstract
Obstructive sleep apnea (OSA) is a common sleep-related disorder characterized by intermittent pauses in breathing during sleep, which can significantly increase the risk of cardiovascular and metabolic diseases. Because OSA often goes undiagnosed and the patients most at risk for associated comorbidities are difficult to identify, personalized patient care has been suboptimal. While previous studies have established a correlation between OSA and various comorbidities, the complexity and inconsistency of clinical data in electronic health records (EHR) make it difficult to derive reliable results in healthcare studies. In this paper, we extracted and compared learned latent spaces (compressed representations of the input data used to uncover hidden patterns) using methods such as autoencoders, Uniform Manifold Approximation and Projection (UMAP), and Principal Component Analysis (PCA) to filter noise and irrelevant details out of the EHR data. We then deep phenotyped OSA patients through unsupervised clustering on the latent representations, identified patient subgroups, uncovered potential risk factors that drive subgroup differentiation, and developed a clinical tool that predicts patient group assignment via supervised learning. These findings enhance the understanding of OSA deep phenotyping and improve patient comorbidity risk assessment.
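The three stages named in the abstract (latent-space extraction, unsupervised clustering, supervised prediction of group assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only PCA for the latent space, a hand-written k-means for clustering, and a nearest-centroid classifier as a stand-in for the supervised model. The data are synthetic, and the three feature columns (an apnea index, BMI, and age) are hypothetical placeholders rather than variables taken from the study.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # guard against constant columns
    return (X - mu) / sigma

def pca_latent(Z, n_components=2):
    """Project standardized data onto its top principal components."""
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def assign(points, centroids):
    """Index of the nearest centroid for every row of points."""
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns cluster labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(points, centroids), centroids

def fit_nearest_centroid(Z, labels):
    """Supervised step: one mean vector per cluster, in feature space."""
    classes = np.unique(labels)
    means = np.array([Z[labels == c].mean(axis=0) for c in classes])
    return classes, means

def predict_group(Z, classes, means):
    """Predict the cluster whose feature-space mean is closest."""
    return classes[assign(Z, means)]

# Synthetic stand-in for EHR features (hypothetical columns: apnea index, BMI, age).
rng = np.random.default_rng(42)
group_a = rng.normal([5.0, 25.0, 45.0], [3.0, 3.0, 8.0], size=(100, 3))
group_b = rng.normal([35.0, 35.0, 60.0], [3.0, 3.0, 8.0], size=(100, 3))
X = np.vstack([group_a, group_b])
truth = np.repeat([0, 1], 100)

Z = standardize(X)
latent = pca_latent(Z, n_components=2)          # 1. latent representation
labels, _ = kmeans(latent, k=2)                 # 2. unsupervised subgroups
classes, means = fit_nearest_centroid(Z, labels)
predicted = predict_group(Z, classes, means)    # 3. supervised group assignment
agreement = (predicted == labels).mean()
print(f"classifier/cluster agreement: {agreement:.2f}")
```

Swapping `pca_latent` for a UMAP embedding or an autoencoder's bottleneck layer would leave the clustering and classification steps unchanged, which is what makes comparing latent spaces in a pipeline like this straightforward.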