Visible to the public Biblio

Filters: Keyword is datacenters  [Clear All Filters]
2022-08-26
Nougnanke, Kokouvi Benoit, Labit, Yann, Bruyere, Marc, Ferlin, Simone, Aïvodji, Ulrich.  2021.  Learning-based Incast Performance Inference in Software-Defined Data Centers. 2021 24th Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN). :118–125.
Incast traffic is a many-to-one communication pattern used in many applications, including distributed storage, web-search with partition/aggregation design pattern, and MapReduce, commonly in data centers. It is generally composed of short-lived flows that may be queued behind large flows' packets in congested switches where performance degradation is observed. Smart buffering at the switch level is sensed to mitigate this issue by automatically and dynamically adapting to traffic conditions changes in the highly dynamic data center environment. But for this dynamic and smart buffer management to become effectively beneficial for all the traffic, and especially for incast the most critical one, incast performance models that provide insights on how various factors affect it are needed. The literature lacks these types of models. The existing ones are analytical models, which are either tightly coupled with a particular protocol version or specific to certain empirical data. Motivated by this observation, we propose a machine-learning-based incast performance inference. With this prediction capability, smart buffering scheme or other QoS optimization algorithms could anticipate and efficiently optimize system parameters adjustment to achieve optimal performance. Since applying machine learning to networks managed in a distributed fashion is hard, the prediction mechanism will be deployed on an SDN control plane. We could then take advantage of SDN's centralized global view, its telemetry capabilities, and its management flexibility.
2020-12-01
Zhang, Y., Deng, L., Chen, M., Wang, P..  2018.  Joint Bidding and Geographical Load Balancing for Datacenters: Is Uncertainty a Blessing or a Curse? IEEE/ACM Transactions on Networking. 26:1049—1062.

We consider the scenario where a cloud service provider (CSP) operates multiple geo-distributed datacenters to provide Internet-scale service. Our objective is to minimize the total electricity and bandwidth cost by jointly optimizing electricity procurement from wholesale markets and geographical load balancing (GLB), i.e., dynamically routing workloads to locations with cheaper electricity. Under the ideal setting where exact values of market prices and workloads are given, this problem reduces to a simple linear programming and is easy to solve. However, under the realistic setting where only distributions of these variables are available, the problem unfolds into a non-convex infinite-dimensional one and is challenging to solve. One of our main contributions is to develop an algorithm that is proven to solve the challenging problem optimally, by exploring the full design space of strategic bidding. Trace-driven evaluations corroborate our theoretical results, demonstrate fast convergence of our algorithm, and show that it can reduce the cost for the CSP by up to 20% as compared with baseline alternatives. This paper highlights the intriguing role of uncertainty in workloads and market prices, measured by their variances. While uncertainty in workloads deteriorates the cost-saving performance of joint electricity procurement and GLB, counter-intuitively, uncertainty in market prices can be exploited to achieve a cost reduction even larger than the setting without price uncertainty.

2019-01-31
Larisch, James, Mickens, James, Kohler, Eddie.  2018.  Alto: Lightweight VMs Using Virtualization-Aware Managed Runtimes. Proceedings of the 15th International Conference on Managed Languages & Runtimes. :8:1–8:7.

Virtualization enables datacenter operators to safely run computations that belong to untrusted tenants. An ideal virtual machine has three properties: a small memory footprint; strong isolation from other VMs and the host OS; and the ability to maintain in-memory state across client requests. Unfortunately, modern virtualization technologies cannot provide all three properties at once. In this paper, we explain why, and propose a new virtualization approach, called Alto, that virtualizes at the layer of a managed runtime interface. Through careful design of (1) the application-facing managed interface and (2) the internal runtime architecture, Alto provides VMs that are small, secure, and stateful. Conveniently, Alto also simplifies VM operations like suspension, migration, and resumption. We provide several details about the proposed design, and discuss the remaining challenges that must be solved to fully realize the Alto vision.

2018-12-10
Versluis, L., Neacsu, M., Iosup, A..  2018.  A Trace-Based Performance Study of Autoscaling Workloads of Workflows in Datacenters. 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). :223–232.

To improve customer experience, datacenter operators offer support for simplifying application and resource management. For example, running workloads of workflows on behalf of customers is desirable, but requires increasingly more sophisticated autoscaling policies, that is, policies that dynamically provision resources for the customer. Although selecting and tuning autoscaling policies is a challenging task for datacenter operators, so far relatively few studies investigate the performance of autoscaling for workloads of workflows. Complementing previous knowledge, in this work we propose the first comprehensive performance study in the field. Using trace-based simulation, we compare state-of-the-art autoscaling policies across multiple application domains, workload arrival patterns (e.g., burstiness), and system utilization levels. We further investigate the interplay between autoscaling and regular allocation policies, and the complexity cost of autoscaling. Our quantitative study focuses not only on traditional performance metrics and on state-of-the-art elasticity metrics, but also on time-and memory-related autoscaling-complexity metrics. Our main results give strong and quantitative evidence about previously unreported operational behavior, for example, that autoscaling policies perform differently across application domains and allocation and provisioning policies should be co-designed.

2018-07-18
Thakre, P. P., Sahare, V. N..  2017.  VM live migration time reduction using NAS based algorithm during VM live migration. 2017 Third International Conference on Sensing, Signal Processing and Security (ICSSS). :242–246.

Live migration is the process used in virtualization environment of datacenters in order to take the benefit of zero downtime during system maintenance. But during migrating live virtual machines along with system files and storage data, network traffic gets increases across network bandwidth and delays in migration time. There is need to reduce the migration time in order to maintain the system performance by analyzing and optimizing the storage overheads which mainly creates due to unnecessary duplicated data transferred during live migration. So there is need of such storage device which will keep the duplicated data residing in both the source as well as target physical host i.e. NAS. The proposed hash map based algorithm maps all I/O operations in order to track the duplicated data by assigning hash value to both NAS and RAM data. Only the unique data then will be sent data to the target host without affecting service level agreement (SLA), without affecting VM migration time, application downtime, SLA violations, VM pre-migration and downtime post migration overheads during pre and post migration of virtual machines.