Case Studies from the Real World: The Importance of Measurement and Analysis in Building Better Systems

Title: Case Studies from the Real World: The Importance of Measurement and Analysis in Building Better Systems
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Schroeder, Bianca
Conference Name: Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4080-9
Keywords: field studies, Human Behavior, Large-scale systems, Measurement, Metrics, multiple fault diagnosis, pubcrawl, reliability, Resiliency
Abstract:

At the core of the "Big Data" revolution lie frameworks and systems that allow for the massively parallel processing of large amounts of data. Ironically, while they have been designed for processing large amounts of data, these systems are at the same time major producers of data: to support the administration and management of these huge-scale systems, they are configured to generate detailed log and monitoring data, periodically capturing the system state across all nodes, components and jobs in the system. While such logging information is used routinely by sysadmins for ad hoc troubleshooting and problem diagnosis, we point out that there is tremendous value in analyzing such data from a research point of view. In this talk, we will go over several case studies that demonstrate how collecting and analyzing measurement data from production systems can provide new insights into how systems work and fail, and how these new insights can help in designing better systems.

URL: http://doi.acm.org/10.1145/2851553.2858660
DOI: 10.1145/2851553.2858660
Citation Key: schroeder_case_2016