Embedded Accelerators for Scientific High-Performance Computing: An Energy Study of OpenCL Gaussian Elimination Workloads
Title | Embedded Accelerators for Scientific High-Performance Computing: An Energy Study of OpenCL Gaussian Elimination Workloads |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Johnston, B., Lee, B., Angove, L., Rendell, A. |
Conference Name | 2017 46th International Conference on Parallel Processing Workshops (ICPPW) |
Keywords | accelerators, ARM processors, Benchmark testing, compositionality, CPU, embedded accelerators, Embedded systems, Energy efficiency, Gaussian processes, GPU, graphics processing units, Hardware, high-performance computing, high-performance supercomputers, HPC, low power devices, Matrices, Metrics, microprocessor chips, OpenCL Gaussian elimination workloads, parallel processing, performance evaluation, power aware computing, pubcrawl, Resiliency, Scientific Computing Security, Supercomputers |
Abstract | Energy-efficient High-Performance Computing (HPC) is becoming increasingly important. Recent ventures into this space have introduced an unlikely candidate for exascale scientific computing hardware with a small energy footprint. ARM processors and embedded GPU accelerators, originally developed for energy efficiency in mobile devices where battery life is critical, are being repurposed and deployed in the next generation of supercomputers. Unfortunately, the performance of scientific workloads on many of these devices is largely unknown, yet the bulk of the computation required in high-performance supercomputers is scientific. We present an analysis of one such scientific code, Gaussian Elimination, and evaluate both execution time and energy used on a range of embedded accelerator SoCs. These include three ARM CPUs and two mobile GPUs. Understanding how these low-power devices perform on scientific workloads will be critical in selecting appropriate hardware for these supercomputers: how can we estimate the performance of tens of thousands of these chips if the performance of one is largely unknown? |
URL | http://ieeexplore.ieee.org/document/8026070/ |
DOI | 10.1109/ICPPW.2017.22 |
Citation Key | johnston_embedded_2017 |