Case Studies from the Real World: The Importance of Measurement and Analysis in Building Better Systems
Title | Case Studies from the Real World: The Importance of Measurement and Analysis in Building Better Systems |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Schroeder, Bianca |
Conference Name | Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4080-9 |
Keywords | field studies, Human Behavior, Large-scale systems, Measurement, Metrics, multiple fault diagnosis, pubcrawl, reliability, Resiliency |
Abstract | At the core of the "Big Data" revolution lie frameworks and systems that allow for the massively parallel processing of large amounts of data. Ironically, while they have been designed for processing large amounts of data, these systems are at the same time major producers of data: to support the administration and management of these huge-scale systems, they are configured to generate detailed log and monitoring data, periodically capturing the system state across all nodes, components and jobs in the system. While such logging information is used routinely by sysadmins for ad-hoc trouble-shooting and problem diagnosis, we point out that there is tremendous value in analyzing such data from a research point of view. In this talk, we will go over several case studies that demonstrate how measuring and analyzing measurement data from production systems can provide new insights into how systems work and fail, and how these new insights can help in designing better systems. |
URL | http://doi.acm.org/10.1145/2851553.2858660 |
DOI | 10.1145/2851553.2858660 |
Citation Key | schroeder_case_2016 |