Visible to the public Biblio

Filters: Author is Townend, P.  [Clear All Filters]
2020-12-01
Yang, R., Ouyang, X., Chen, Y., Townend, P., Xu, J..  2018.  Intelligent Resource Scheduling at Scale: A Machine Learning Perspective. 2018 IEEE Symposium on Service-Oriented System Engineering (SOSE). :132—141.

Resource scheduling in a computing system addresses the problem of packing tasks with multi-dimensional resource requirements and non-functional constraints. The exhibited heterogeneity of workload and server characteristics in Cloud-scale or Internet-scale systems is adding further complexity and new challenges to the problem. Compared with,,,, existing solutions based on ad-hoc heuristics, Machine Learning (ML) has the potential to improve further the efficiency of resource management in large-scale systems. In this paper we,,,, will describe and discuss how ML could be used to understand automatically both workloads and environments, and to help to cope with scheduling-related challenges such as consolidating co-located workloads, handling resource requests, guaranteeing application's QoSs, and mitigating tailed stragglers. We will introduce a generalized ML-based solution to large-scale resource scheduling and demonstrate its effectiveness through a case study that deals with performance-centric node classification and straggler mitigation. We believe that an MLbased method will help to achieve architectural optimization and efficiency improvement.