FatTreeSim: Modeling a Large-scale Fat-Tree Network for HPC Systems and Data Centers Using Parallel and Discrete Event Simulation
Title | FatTreeSim: Modeling a Large-scale Fat-Tree Network for HPC Systems and Data Centers Using Parallel and Discrete Event Simulation |
Publication Type | Conference Paper |
Year of Publication | 2015 |
Authors | Ning Liu, Illinois Institute of Technology, Adnan Haider, Illinois Institute of Technology, Xian-He Sun, Illinois Institute of Technology, Dong Jin, Illinois Institute of Technology |
Conference Name | ACM SIGSIM Conference on Principles of Advanced Discrete Simulation |
Date Published | 06/2015 |
Publisher | ACM |
Conference Location | London, UK |
Keywords | A Hypothesis Testing Framework for Network Security, Blue Gene/Q, Datacenter interconnection network, Fat-tree networks, NSA SoS Lablets Materials, Parallel discrete event simulation, science of security, simulation, Supercomputer interconnection networks, UIUC |
Abstract | Fat-tree topologies have been widely adopted as the communication network in data centers in the past decade. Nowadays, high-performance computing (HPC) system designers are considering using fat-tree as the interconnection network for the next generation supercomputers. For extreme-scale computing systems like data centers and supercomputers, the performance is highly dependent on the interconnection networks. In this paper, we present FatTreeSim, a PDES-based toolkit consisting of a highly scalable fat-tree network model, with the goal of better understanding the design constraints of fat-tree networking architectures in data centers and HPC systems, as well as evaluating the applications running on top of the network. FatTreeSim is designed to model and simulate large-scale fat-tree networks up to millions of nodes with protocol-level fidelity. We have conducted extensive experiments to validate and demonstrate the accuracy, scalability and usability of FatTreeSim. On Argonne Leadership Computing Facility's Blue Gene/Q system, Mira, FatTreeSim is capable of achieving a peak event rate of 305 M/s for a 524,288-node fat-tree model with a total of 567 billion committed events. The strong scaling experiments use up to 32,768 cores and show near-linear scalability. Compared with a small-scale physical system in Emulab, FatTreeSim can accurately model the latency in the same fat-tree network with less than 10% error rate for most cases. Finally, we demonstrate FatTreeSim's usability through a case study in which FatTreeSim serves as the network module of the YARNsim system, and the error rates for all test cases are less than 13.7%. |
Notes | Best Paper Award |
URL | http://publish.illinois.edu/science-of-security-lablet/files/2014/05/FatTreeSim-Modeling-Large-scale... |
Citation Key | node-23231 |
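
To put the abstract's scale figures in context, the following is a minimal sketch of the sizing arithmetic for a fat-tree. It assumes the standard three-level k-ary fat-tree built from k-port switches, which supports k^3/4 hosts; the abstract does not state which fat-tree variant or switch radix FatTreeSim used, so the function names and the topology choice here are illustrative assumptions, not the authors' model. The last helper divides the reported committed events by the reported peak event rate, which gives only a lower bound on run time because the peak rate is not sustained throughout.

```python
from typing import Dict, Optional


def fat_tree_size(k: int) -> Dict[str, int]:
    """Component counts for a three-level k-ary fat-tree (assumed topology)."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("k must be a positive even port count")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 edge switches in each of k pods
        "aggregation_switches": k * half,  # k/2 aggregation switches per pod
        "core_switches": half * half,      # (k/2)^2 core switches
        "hosts": k * half * half,          # k^3 / 4 hosts in total
    }


def ports_for_hosts(hosts: int) -> Optional[int]:
    """Return the even k with k^3/4 == hosts, or None if no such k exists."""
    k = 2
    while k ** 3 // 4 < hosts:
        k += 2
    return k if k ** 3 // 4 == hosts else None


def min_runtime_seconds(committed_events: float, peak_events_per_sec: float) -> float:
    """Lower bound on wall-clock time, assuming the peak rate held throughout."""
    return committed_events / peak_events_per_sec


if __name__ == "__main__":
    k = ports_for_hosts(524_288)           # node count quoted in the abstract
    print("switch radix k:", k)
    print("fat-tree size:", fat_tree_size(k))
    # 567 billion committed events at a peak of 305 million events per second
    print("runtime lower bound (s):", min_runtime_seconds(567e9, 305e6))
```

Under this assumed topology, the 524,288-node model corresponds to k = 128, since 128^3 / 4 = 524,288.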