Big data provenance: Challenges, state of the art and opportunities
dc.contributor.author | Wang, Jianwu | |
dc.contributor.author | Crawl, Daniel | |
dc.contributor.author | Purawat, Shweta | |
dc.contributor.author | Nguyen, Mai | |
dc.contributor.author | Altintas, Ilkay | |
dc.date.accessioned | 2024-02-14T16:21:41Z | |
dc.date.available | 2024-02-14T16:21:41Z | |
dc.date.issued | 2015-12-28 | |
dc.description | 2015 IEEE International Conference on Big Data, 29 October 2015 - 01 November 2015, Santa Clara, CA, USA | |
dc.description.abstract | The ability to track provenance is a key feature of scientific workflows, supporting data lineage and reproducibility. The challenges introduced by the volume, variety and velocity of Big Data also pose related challenges for the provenance and quality of Big Data, defined as veracity. The increasing size and variety of distributed Big Data provenance information bring new technical challenges and opportunities throughout the provenance lifecycle, including recording, querying, sharing and utilization. This paper discusses the challenges and opportunities of Big Data provenance related to the veracity of the datasets themselves and the provenance of the analytical processes that analyze these datasets. It also explains our current efforts towards tracking and utilizing Big Data provenance using workflows as a programming model to analyze Big Data. | |
dc.description.sponsorship | This work is partially supported by NSF DBI 1062565 and 1331615, NIH P41 GM103426 for NBCR and R25GM114821 for BBDTC, and DOE DE-SC0012630 for IPPD. | |
dc.description.uri | https://ieeexplore.ieee.org/document/7364047 | |
dc.format.extent | 8 pages | |
dc.genre | conference papers and proceedings | |
dc.genre | preprints | |
dc.identifier | doi:10.13016/m2q4w9-011v | |
dc.identifier.citation | J. Wang, D. Crawl, S. Purawat, M. Nguyen and I. Altintas, "Big data provenance: Challenges, state of the art and opportunities," 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 2015, pp. 2509-2516, doi: 10.1109/BigData.2015.7364047. | |
dc.identifier.uri | https://doi.org/10.1109/BigData.2015.7364047 | |
dc.identifier.uri | http://hdl.handle.net/11603/31616 | |
dc.language.iso | en_US | |
dc.publisher | IEEE | |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Information Systems Department Collection | |
dc.relation.ispartof | UMBC Faculty Collection | |
dc.relation.ispartof | UMBC Center for Accelerated Real Time Analysis | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department | |
dc.relation.ispartof | UMBC Data Science | |
dc.relation.ispartof | UMBC Joint Center for Earth Systems Technology (JCET) | |
dc.relation.ispartof | UMBC Center for Real-time Distributed Sensing and Autonomy | |
dc.relation.ispartofseries | UMBC Center for Real-time Distributed Sensing and Autonomy | |
dc.rights | © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | |
dc.subject | UMBC Big Data Analytics Lab | |
dc.title | Big data provenance: Challenges, state of the art and opportunities | |
dc.type | Text | |
dcterms.creator | https://orcid.org/0000-0002-9933-1170 |