Communications of the ACM

ACM Careers

Preparing for The Exascale Era

Intel software engineer Louise Huot (left), NERSC application performance group lead Jack Deslippe (seated, center), and Argonne computational scientists William Huhn (standing) and Nichols Romero (seated, right) at an ALCF-hosted Aurora workshop in February 2020.

Credit: Argonne National Laboratory

From mapping the human brain to accelerating the discovery of new materials, the power of exascale computing promises to advance the frontiers of some of the world's most ambitious scientific endeavors.

But applying the immense processing power of the U.S. Department of Energy's upcoming exascale systems to such problems is no trivial task. Researchers from across the high-performance computing (HPC) community are working to develop software tools, codes, and methods that can fully exploit the innovative accelerator-based supercomputers scheduled to arrive at DOE national laboratories starting in 2021.

As the future home to the Aurora system being developed by Intel and Hewlett Packard Enterprise (HPE), DOE's Argonne National Laboratory has been ramping up efforts to ready the supercomputer and its future users for science in the exascale era. Argonne researchers are engaged in a broad range of preparatory activities, including exascale code development, technology evaluations, user training, and collaborations with vendors, fellow national laboratories, and DOE's Exascale Computing Project (ECP).

"Science on day one is our goal when it comes to standing up new supercomputers," says Michael Papka, director of the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science User Facility. "It's always challenging to prepare for a bleeding-edge machine that does not yet exist, but our team is leveraging all available tools and resources to make sure the research community can use Aurora effectively as soon as it's deployed for science."

The process of planning and preparing for new leadership-class supercomputers takes years of collaboration and coordination. "Our team has years of experience fielding these extreme-scale systems. It requires deep collaboration with vendors on the development of hardware, software, and storage technologies, as well as facility enhancements to ensure we have the infrastructure in place to power and cool these massive supercomputers," says Susan Coghlan, ALCF Project Director for Aurora.

The ALCF team has been partnering with the ECP on several efforts. These include developing a common continuous integration strategy to create an environment that enables regular software testing across DOE's exascale facilities; using the Spack package manager as a tool for build automation and final deployment of software; exploring and potentially enhancing the installation, upgrade, and monitoring capabilities of HPE's Shasta software stack; and working to enable container support and tools on Aurora and other exascale systems.

Working in concert with the ECP, Argonne researchers are also contributing to the advancement of programming models (OpenMP, SYCL, Kokkos, RAJA), language standards (C++), and compilers (Clang/LLVM) that are critical to developing efficient and portable exascale applications. Furthermore, the ALCF continues to work closely with Intel and HPE on the testing and development of various components to ensure that the scientific computing community can leverage them effectively.

"By analyzing the performance of key benchmarks and applications on early hardware, we are developing a broad understanding of the system's architecture and capabilities," says Kalyan Kumaran, ALCF Director of Technology. "This effort helps us to identify best practices for optimizing codes and, ultimately, create a roadmap for future users to adapt and tune their software for the new system."

From Argonne National Laboratory