Analysis of Irregular Event Sequences in Healthcare using Deep Learning, Reinforcement Learning, and Visualization

Author/Creator

Author/Creator ORCID

Date

2020-01-20

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

Distribution Rights granted to UMBC by the author.
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Abstract

Each year, over 880 million doctor visits occur in the United States. Current estimates place yearly healthcare data records at 2,314 exabytes, with projections reaching zettabytes and yottabytes. For a country with 329 million people who spend $4 trillion on healthcare each year, understanding and uncovering the hidden patterns and trends within this big data is paramount to improving healthcare outcomes through preventative care and early diagnosis, creating a more efficient healthcare system through optimization, and making clinicians' and patients' lives easier. With the ever-increasing availability of big data, these problems are more important now than ever before. While many event analysis and time series tools have been developed for analyzing such datasets, most approaches target clean and evenly spaced data (i.e., with a fixed time interval between observations). When faced with noisy or irregular data, it is typical to add a preprocessing step that transforms the data into a regular form. This transformation arguably interferes at a fundamental level with how the data is represented, and may irrevocably bias the results obtained. Therefore, operating on raw data, in its noisy natural form, is necessary to ensure that the insights gathered through analysis are accurate and valid. In this dissertation, novel approaches are presented for analyzing irregular event sequences using a variety of techniques, including deep learning, reinforcement learning, and visualization. We show how common tasks in event analysis can be performed directly on an irregular event dataset without requiring a transformation that alters the natural representation of the process from which the data was captured. We focus our efforts on healthcare specifically, but also evaluate our approaches in other domains to test for generalizability.
The three tasks that we showcase are: (i) predicting the probability of a future event occurring, (ii) summarizing large event datasets, and (iii) modeling the processes that create events.