Open|SpeedShop Ease of Use Performance Analysis for Heterogeneous Processor Systems
Status: Completed
Start Date: 2015-05-29
End Date: 2017-05-28
Description: We propose building upon the modular, extensible architecture and existing capabilities of Open|SpeedShop to provide seamless, integrated performance analysis for heterogeneous processors. NVIDIA GPU and Intel Many Integrated Core (MIC) processors are increasingly important at high performance computing (HPC) laboratories within NASA for use on NASA's high-end computing (HEC) projects because of their ability to accelerate scientific application performance. To understand what impact these accelerators are having on performance, tools must succinctly present heterogeneous processor performance information. One of the key goals of this work is to develop and implement innovative methods for presenting the performance information extracted from applications running on both traditional CPUs and GPU/MIC processors. Phase II research builds on the progress made in Phase I and will include more robust and complete gathering of performance information for Intel's MIC architecture. In Phase I we built the infrastructure and successfully prototyped a version of Open|SpeedShop that gathered and displayed performance information for applications that ran in the non-offload Intel MIC programming model. In Phase II, our research would focus on how to monitor the performance of applications that use Intel's offload programming model. We would also research performance analysis for applications using OpenACC.
Benefits: NASA researcher Dr. Sharadchandr Gavali profiled the National Combustion Code (NCC) application with Open|SpeedShop. The parallel EEMD (PEEMD) application is being used to analyze Hurricane Sandy (2012) to better understand the multiple-scale processes that may have affected Sandy's movement, intensification, and formation. We have interacted with Samsung Cheung about performance analysis of this NASA application of interest. The USM3D Navier-Stokes flow solver contributed heavily to the NASA Constellation Project (CxP) as a highly productive computational tool for generating the aerodynamic databases for the Ares I and V launch vehicles and the Orion launch abort vehicle (LAV). We have interacted with Jahed Djomehri of NASA about performance analysis of this application. OVERFLOW was developed as a collaborative effort between NASA's Johnson Space Center and NASA Ames Research Center (ARC). The driving force behind this work was the need to evaluate the flow about the Space Shuttle launch vehicle. Originally developed by NASA's Pieter Buning, Dennis Jespersen, and others, the code is an outgrowth of the earlier codes F3D and ARC3D. We have interacted with Dennis Jespersen regarding performance analysis of the OVERFLOW application. Another NASA application of interest is the Goddard Cumulus Ensemble model (GCEM3D). We have exchanged emails with Daniel Kokron at NASA about performance analysis of that application.
High performance computing applications in fields such as Structural Mechanics, Computational Fluid Dynamics, Electronic Design Automation, Quantum Chemistry, Weather and Climate, Defense and Intelligence, and many other computational domains benefit from accelerators such as GPUs and Intel MIC. Companies and laboratories working in these disciplines are all potential customers for a commercial Open|SpeedShop product capable of heterogeneous performance analysis.
Lead Organization: Argo Navis Technologies LLC