Synthetic Time Series Data Generation for Healthcare Applications: A PCG Case Study

dc.contributor.author: Jamshidi, Ainaz
dc.contributor.author: Arif, Muhammad
dc.contributor.author: Kalhoro, Sabir Ali
dc.contributor.author: Gelbukh, Alexander
dc.date.accessioned: 2025-01-31T18:24:18Z
dc.date.available: 2025-01-31T18:24:18Z
dc.date.issued: 2024-12-17
dc.description.abstract: The generation of high-quality medical time series data is essential for advancing healthcare diagnostics and safeguarding patient privacy. Specifically, synthesizing realistic phonocardiogram (PCG) signals offers significant potential as a cost-effective and efficient tool for cardiac disease pre-screening. Despite this potential, the synthesis of PCG signals for this specific application has received limited attention in research. In this study, we employ and compare three state-of-the-art generative models from different categories (WaveNet, DoppelGANger, and DiffWave) to generate high-quality PCG data. We use data from the George B. Moody PhysioNet Challenge 2022. We evaluate our methods using metrics widely used in the time series generation literature, such as mean absolute error and maximum mean discrepancy. Our results demonstrate that the generated PCG data closely resembles the original datasets, indicating the effectiveness of our generative models in producing realistic synthetic PCG data. In future work, we plan to incorporate this method into a data augmentation pipeline to synthesize abnormal PCG signals with heart murmurs, in order to address the current scarcity of abnormal data. We hope to improve the robustness and accuracy of diagnostic tools in cardiology, enhancing their effectiveness in detecting heart murmurs.
dc.description.uri: http://arxiv.org/abs/2412.16207
dc.format.extent: 7 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2jno9-vds2
dc.identifier.uri: https://doi.org/10.48550/arXiv.2412.16207
dc.identifier.uri: http://hdl.handle.net/11603/37585
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC Student Collection
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Electrical Engineering and Systems Science - Signal Processing
dc.subject: Computer Science - Computational Engineering, Finance, and Science
dc.subject: UMBC Emerging Software Technologies Lab
dc.subject: Computer Science - Machine Learning
dc.title: Synthetic Time Series Data Generation for Healthcare Applications: A PCG Case Study
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-7342-3982

Files

Original bundle

Name: 2412.16207v1.pdf
Size: 834.87 KB
Format: Adobe Portable Document Format