SQL-like big data environments: Case study in clinical trial analytics

dc.contributor.author: Grover, Akshay
dc.contributor.author: Gholap, Jay
dc.contributor.author: Janeja, Vandana
dc.contributor.author: Yesha, Yelena
dc.contributor.author: Chintalapati, Raghu
dc.contributor.author: Marwaha, Harsh
dc.contributor.author: Modi, Kunal
dc.date.accessioned: 2018-10-31T18:44:36Z
dc.date.available: 2018-10-31T18:44:36Z
dc.date.issued: 2015-12-28
dc.description: IEEE International Conference on Big Data
dc.description.abstract: Big Data deals with enormous volumes of complex and exponentially growing data sets from multiple sources. With rapid growth in technology, we are now able to generate immense amounts of data in almost any field imaginable, including the physical, biological and biomedical sciences. Given the diversity and amount of data in the health care industry, there is an increasing need to evaluate the components of big data frameworks and gauge their adaptability to analytics techniques. However, a key step in adopting big data tools is the portability of relational databases to a big data environment. Since SQL is considered the de facto language for interactive queries, in this paper we evaluate the performance of SQL-like big data solutions for the portability of existing relational databases. Our work focuses on benchmarking multiple SQL-like big data technologies over the Hadoop-based distributed file system (HDFS) for the Study Data Tabulation Model (SDTM) used in clinical trial databases, with the aim of improving the efficiency of research in clinical trials. We use publicly available clinical trial data from the National Institute on Drug Abuse (NIDA), which follows SDTM, as a test bed to measure key parameters such as usability, adaptability, modularity, robustness and efficiency of these solutions. To demonstrate how current clinical trial functionality can be replicated on a big data backend with high SQL-like functionality, we evaluate several types of ad-hoc SQL queries.
dc.description.uri: https://ieeexplore.ieee.org/document/7364068
dc.format.extent: 10 pages
dc.genre: conference papers and proceedings pre-print
dc.identifier: doi:10.13016/M2BZ61C6X
dc.identifier.citation: Akshay Grover, Jay Gholap, Vandana P Janeja, Yelena Yesha, Raghu Chintalapati, Harsh Marwaha, and Kunal Modi, SQL-like big data environments: Case study in clinical trial analytics, DOI: 10.1109/BigData.2015.7364068
dc.identifier.uri: 10.1109/BigData.2015.7364068
dc.identifier.uri: http://hdl.handle.net/11603/11811
dc.language.iso: en_US
dc.publisher: IEEE
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: © 2015 IEEE
dc.subject: Big Data
dc.subject: Benchmarking
dc.subject: Hadoop based distributed file system (HDFS)
dc.subject: SQL-Like
dc.subject: UMBC Ebiquity Research Group
dc.title: SQL-like big data environments: Case study in clinical trial analytics
dc.type: Text

Files

Original bundle

Name: 808.pd.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format
