Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation

Submitted by aekwall on Fri, 04/28/2023 - 9:43am

Title	Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation
Publication Type	Conference Paper
Year of Publication	2022
Authors	Moses, William S., Narayanan, Sri Hari Krishna, Paehler, Ludger, Churavy, Valentin, Schanen, Michel, Hückelheim, Jan, Doerfert, Johannes, Hovland, Paul
Conference Name	SC22: International Conference for High Performance Computing, Networking, Storage and Analysis
Date Published	November
Keywords	automatic differentiation, C++, C++ languages, codes, compiler, compiler security, compositionality, distributed, Enzyme, Enzymes, hybrid parallelization, Julia, LLVM, Metrics, MPI, OpenMP, parallel, parallel programming, Program processors, pubcrawl, Raja, Resiliency, Runtime, Scalability, Tasks
Abstract	Derivatives are key to numerous science, engineering, and machine learning applications. While existing tools generate derivatives of programs in a single language, modern parallel applications combine a set of frameworks and languages to leverage available performance and function in an evolving hardware landscape. We propose a scheme for differentiating arbitrary DAG-based parallelism that preserves scalability and efficiency, implemented into the LLVM-based Enzyme automatic differentiation framework. By integrating with a full-fledged compiler backend, Enzyme can differentiate numerous parallel frameworks and directly control code generation. Combined with its ability to differentiate any LLVM-based language, this flexibility permits Enzyme to leverage the compiler tool chain for parallel and differentiation-specific optimizations. We differentiate nine distinct versions of the LULESH and miniBUDE applications, written in different programming languages (C++, Julia) and parallel frameworks (OpenMP, MPI, RAJA, Julia tasks, MPI.jl), demonstrating similar scalability to the original program. On benchmarks with 64 threads or nodes, we find a differentiation overhead of 3.4-6.8x on C++ and 5.4-12.5x on Julia.
DOI	10.1109/SC41404.2022.00065
Citation Key	moses_scalable_2022

Groups:

Science of Security VO