Title | Demonstration of Smoke: A Deep Breath of Data-Intensive Lineage Applications |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Psallidas, Fotis; Wu, Eugene |
Conference Name | Proceedings of the 2018 International Conference on Management of Data |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4703-7 |
Keywords | composability, Databases, Human Behavior, interactive visualizations, Metrics, Provenance, pubcrawl, Resiliency |
Abstract | Data lineage is a fundamental type of information that describes the relationships between input and output data items in a workflow. As such, a large number of data-intensive applications whose logic operates over these input-output relationships can be expressed declaratively in lineage terms. Unfortunately, many applications resort to hand-tuned implementations, either because lineage systems are not fast enough to meet their requirements or because of a lack of awareness of lineage capabilities. Recently, we introduced a set of implementation design principles and associated techniques to optimize lineage-enabled database engines, and realized them in our prototype database engine, Smoke. In this demonstration, we showcase lineage as the building block across a variety of data-intensive applications, including tooltips and details on demand, crossfilter, and data profiling. In addition, we show how Smoke outperforms alternative lineage systems to meet or improve on existing hand-tuned implementations of these applications. |
URL | http://doi.acm.org/10.1145/3183713.3193537 |
DOI | 10.1145/3183713.3193537 |
Citation Key | psallidas_demonstration_2018 |