Biblio
Filters: Author is Grant, Ryan E.
MiniMod: A Modular Miniapplication Benchmarking Framework for HPC. 2021 IEEE International Conference on Cluster Computing (CLUSTER). :12–22.
2021. The HPC application community has proposed many new application communication structures, middleware interfaces, and communication models to improve HPC application performance. Modifying proxy applications is the standard practice for the evaluation of these novel methodologies. Currently, this requires the creation of a new version of the proxy application for each combination of the approach being tested. In this article, we present a modular proxy-application framework, MiniMod, that enables evaluation of a combination of independently written computation kernels, data transfer logic, communication access, and threading libraries. MiniMod is designed to allow rapid development of individual modules which can be combined at runtime. Through MiniMod, developers only need a single implementation to evaluate application impact under a variety of scenarios. We demonstrate the flexibility of MiniMod’s design by using it to implement versions of a heat diffusion kernel and the miniFE finite element proxy application, along with a variety of communication, granularity, and threading modules. We examine how changing communication libraries, communication granularities, and threading approaches impact these applications on an HPC system. These experiments demonstrate that MiniMod can rapidly improve the ability to assess new middleware techniques for scientific computing applications and next-generation hardware platforms.
MPI Sessions: Leveraging Runtime Infrastructure to Increase Scalability of Applications at Exascale. Proceedings of the 23rd European MPI Users' Group Meeting. :121–129.
2016. MPI includes all processes in MPI_COMM_WORLD; this is untenable for reasons of scale, resiliency, and overhead. This paper offers a new approach, extending MPI with a new concept called Sessions, which makes two key contributions: a tighter integration with the underlying runtime system; and a scalable route to communication groups. This is a fundamental change in how we organise and address MPI processes that removes well-known scalability barriers by no longer requiring the global communicator MPI_COMM_WORLD.