SQLShare: Results from a Multi-Year SQL-as-a-Service Experiment

Title: SQLShare: Results from a Multi-Year SQL-as-a-Service Experiment
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Jain, Shrainik; Moritz, Dominik; Halperin, Daniel; Howe, Bill; Lazowska, Ed
Conference Name: Proceedings of the 2016 International Conference on Management of Data
Date Published: June 2016
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-3531-7
Keywords: database management as a cloud service, database management systems, pubcrawl170201, relational databases
Abstract

We analyze the workload from a multi-year deployment of a database-as-a-service platform targeting scientists and data scientists with minimal database experience. Our hypothesis was that relatively minor changes to the way databases are delivered can increase their use in ad hoc analysis environments. The web-based SQLShare system emphasizes easy dataset-at-a-time ingest, relaxed schemas and schema inference, easy view creation and sharing, and full SQL support. We find that these features have helped attract workloads typically associated with scripts and files rather than relational databases: complex analytics, routine processing pipelines, data publishing, and collaborative analysis. Quantitatively, these workloads are characterized by shorter dataset "lifetimes", higher query complexity, and higher data complexity. We report on usage scenarios that suggest SQL is being used in place of scripts for one-off data analysis and ad hoc data sharing. The workload suggests that a new class of relational systems emphasizing short-term, ad hoc analytics over engineered schemas may improve uptake of database technology in data science contexts. Our contributions include a system design for delivering databases into these contexts, a description of a public research query workload dataset released to advance research in analytic data systems, and an initial analysis of the workload that provides evidence of new use cases under-supported in existing systems.

URL: https://dl.acm.org/doi/10.1145/2882903.2882957
DOI: 10.1145/2882903.2882957
Citation Key: jain_sqlshare:_2016