Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
6-2008
Abstract
While data provenance is a well-studied topic in both database and workflow systems, its support within stream processing systems presents a new set of challenges. Part of the challenge is the high stream event rate and the low processing latency requirements imposed by many streaming applications. For example, emerging streaming applications in healthcare or finance call for data provenance, as illustrated in the Century stream processing infrastructure that we are building for supporting online healthcare analytics. At any time, given an output data element (e.g., a medical alert) generated by Century, the system must be able to retrieve the input and intermediate data elements that led to its generation. In this paper, we describe the requirements behind our initial implementation of Century’s provenance subsystem. We then analyze its strengths and limitations and propose a new provenance architecture to address some of these limitations. The paper also includes a discussion of the open challenges in this area.
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
Provenance and Annotation of Data and Processes: Second International Provenance and Annotation Workshop, IPAW 2008, Salt Lake City, UT, June 17-18, 2008: Revised Selected Papers
Volume
5272
First Page
253
Last Page
265
ISBN
9783540899648
Identifier
10.1007/978-3-540-89965-5_26
Publisher
Springer
City or Country
Berlin
Citation
MISRA, Archan; BLOUNT, Marion; KEMENTSIETSIDIS, Anastasios; SOW, Daby; and WANG, Min. Advances and Challenges for Scalable Provenance in Stream Processing Systems. (2008). Provenance and Annotation of Data and Processes: Second International Provenance and Annotation Workshop, IPAW 2008, Salt Lake City, UT, June 17-18, 2008: Revised Selected Papers. 5272, 253-265.
Available at: https://ink.library.smu.edu.sg/sis_research/678
Copyright Owner and License
Publisher
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/978-3-540-89965-5_26