Scaling graph models for large computational catalysis datasets to thousands of GPUs on Perlmutter

ORAL · Invited

Abstract

The workhorse of computational catalysis, density functional theory, remains the rate-limiting step in complex, high-fidelity investigations of experimentally relevant catalyst behavior. Machine learning approaches to accelerate these simulations have seen considerable progress in the last few years. Large computational catalysis datasets, such as the Open Catalyst 2020 (OC20) dataset, have drastically improved the generalizability of machine learning surrogate models. Most state-of-the-art models on the OC20 dataset are variants of graph neural networks trained on tens to hundreds of GPUs. As part of the NERSC Exascale Science Applications Program (NESAP) for the Perlmutter supercomputer, we are working to scale these models to thousands of GPUs to characterize how accuracy scales with model size, and will present scaling results collected during the project. I will also discuss challenges in scaling these models and research directions that would be valuable for using these large models in day-to-day science efforts.
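The abstract does not specify the training setup; as a minimal sketch, multi-GPU training of this kind is typically done with data parallelism, e.g., PyTorch DistributedDataParallel, where each GPU holds a model replica and gradients are all-reduced every step. The `build_gnn` and `train_loader` names below are placeholders, not part of the OC20 codebase.

```python
# Minimal data-parallel training sketch (an assumption, not the authors' code):
# one process per GPU, gradients synchronized with NCCL all-reduce.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # Rank and world size come from the launcher
    # (e.g., srun on Perlmutter, or torchrun elsewhere).
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = build_gnn().cuda(local_rank)          # build_gnn(): hypothetical model factory
    model = DDP(model, device_ids=[local_rank])   # wraps replica for gradient all-reduce
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for batch in train_loader:                    # train_loader: hypothetical DataLoader
        batch = batch.cuda(local_rank, non_blocking=True)
        loss = model(batch)                       # forward pass returns the training loss
        optimizer.zero_grad()
        loss.backward()                           # gradients averaged across all GPUs here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

With this pattern, scaling from hundreds to thousands of GPUs mostly changes the launch configuration rather than the training loop, though in practice communication overhead and effective batch size become the dominant concerns at that scale.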

Presenters

  • Zachary Ulissi

    Carnegie Mellon University

Authors

  • Zachary Ulissi

    Carnegie Mellon University

  • Brandon Wood

    Lawrence Berkeley National Laboratory

  • Steven Farrell

Lawrence Berkeley National Laboratory

  • Joshua Romero

    NVIDIA

  • Thorsten Kurth

    NVIDIA