Dissecting On-Node Memory Access Performance: A Semantic Approach
Title | Dissecting On-Node Memory Access Performance: A Semantic Approach |
Publication Type | Conference Paper |
Year of Publication | 2014 |
Authors | Gimenez, A., Gamblin, T., Rountree, B., Bhatele, A., Jusufi, I., Bremer, P.-T., Hamann, B. |
Conference Name | SC14: International Conference for High Performance Computing, Networking, Storage and Analysis |
Date Published | Nov |
Keywords | attribute semantic information, code regions, Context, CPU manufacturers, data motion, data objects, design decisions, distributed memory systems, domain decomposition, fine-grained memory access performance data, Hardware, Kernel, Libraries, memory access optimization, memory behaviour, multi-threading, multithreading, on-node memory access performance, performance ramifications, PMU, power efficiency, Program processors, sampled memory accesses, sampling-based performance measurement units, semantic approach, Semantics, storage management, Topology |
Abstract | Optimizing memory access is critical for performance and power efficiency. CPU manufacturers have developed sampling-based performance measurement units (PMUs) that report precise costs of memory accesses at specific addresses. However, this data is too low-level to be meaningfully interpreted and contains an excessive amount of irrelevant or uninteresting information. We have developed a method to gather fine-grained memory access performance data for specific data objects and regions of code with low overhead and attribute semantic information to the sampled memory accesses. This information provides the context necessary to more effectively interpret the data. We have developed a tool that performs this sampling and attribution and used the tool to discover and diagnose performance problems in real-world applications. Our techniques provide useful insight into the memory behaviour of applications and allow programmers to understand the performance ramifications of key design decisions: domain decomposition, multi-threading, and data motion within distributed memory systems. |
URL | https://ieeexplore.ieee.org/document/7013001 |
DOI | 10.1109/SC.2014.19 |
Citation Key | 7013001 |