Graph Neural Network-based Track Finding as a Service with ACTS
ORAL
Abstract
Recent progress in track finding for the High-Luminosity Large Hadron Collider (HL-LHC) has demonstrated the effectiveness of Graph Neural Network (GNN)-based algorithms. While these algorithms deliver high efficiency and reasonable resolution, their computational cost on CPUs hinders real-time processing and requires accelerators such as GPUs. However, the large size of the graphs involved poses a challenge for facilities lacking high-end GPUs.
To address this, we propose deploying the GNN-based track-finding algorithm as a service in the cloud or at high-performance computing centers such as NERSC's Perlmutter system, which provides over 7,000 NVIDIA A100 GPUs. We have implemented a tracking-as-a-service prototype within A Common Tracking Software (ACTS), a toolkit for charged-particle track reconstruction.
This approach is algorithm-agnostic: new algorithms can be incorporated as backends that interact with the client interface in ACTS. In this contribution, we showcase the versatility of the as-a-service approach by implementing the GNN-based track-finding workflow with the NVIDIA Triton Inference Server within ACTS. We assess track-finding throughput and GPU utilization, and we explore the scalability of the inference server across the NERSC Perlmutter supercomputer and cloud resources.
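To illustrate the client side of such an architecture, the sketch below shows how an ACTS-side client could hand one event's spacepoint features to a remote Triton server over gRPC and receive per-hit track labels in return. This is a minimal sketch using the standard Triton C++ client library, not the authors' implementation; the server address, the model name "exatrkx", and the tensor names "FEATURES" and "TRACK_LABELS" are illustrative assumptions.

    // Minimal sketch (not the authors' code): send one event's hit features
    // to a Triton server and read back per-hit track labels. Model and
    // tensor names below are assumptions for illustration only.
    #include <cstdint>
    #include <iostream>
    #include <memory>
    #include <vector>

    #include "grpc_client.h"  // Triton C++ client library

    namespace tc = triton::client;

    int main() {
      // Connect to the Triton inference server over gRPC.
      std::unique_ptr<tc::InferenceServerGrpcClient> client;
      tc::Error err =
          tc::InferenceServerGrpcClient::Create(&client, "localhost:8001");
      if (!err.IsOk()) {
        std::cerr << "failed to create client: " << err << std::endl;
        return 1;
      }

      // One event: nHits spacepoints with 3 features each (e.g. r, phi, z),
      // in practice filled from ACTS spacepoints.
      const int64_t nHits = 100000;
      std::vector<float> features(nHits * 3, 0.f);

      // Describe the input tensor and attach the raw feature buffer.
      tc::InferInput* rawInput = nullptr;
      err = tc::InferInput::Create(&rawInput, "FEATURES", {nHits, 3}, "FP32");
      if (!err.IsOk()) { std::cerr << err << std::endl; return 1; }
      std::shared_ptr<tc::InferInput> input(rawInput);
      input->AppendRaw(reinterpret_cast<const uint8_t*>(features.data()),
                       features.size() * sizeof(float));

      // Run inference; the full GNN pipeline (graph construction, edge
      // classification, track building) executes server-side.
      tc::InferOptions options("exatrkx");
      tc::InferResult* rawResult = nullptr;
      err = client->Infer(&rawResult, options, {input.get()});
      if (!err.IsOk()) {
        std::cerr << "inference failed: " << err << std::endl;
        return 1;
      }
      std::unique_ptr<tc::InferResult> result(rawResult);

      // Retrieve the per-hit track labels returned by the server.
      const uint8_t* buf = nullptr;
      size_t byteSize = 0;
      result->RawData("TRACK_LABELS", &buf, &byteSize);
      std::cout << "received " << byteSize << " bytes of track labels"
                << std::endl;
      return 0;
    }

Because the client only serializes features and reads back labels, the same interface can front any track-finding backend, which is what makes the approach algorithm-agnostic.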
Presenters
- Haoran Zhao (University of Washington)

Authors
- Yuan-Tang Chou (University of Washington)
- Haoran Zhao (University of Washington)
- Xiangyang Ju (Lawrence Berkeley National Laboratory)
- Shih-Chieh Hsu (University of Washington)
- Paolo Calafiura (Lawrence Berkeley National Laboratory)
- Philip C Harris (Massachusetts Institute of Technology)
- Patrick McCormack (Massachusetts Institute of Technology)
- Yao Yao (Purdue University)
- Yongbin Feng (Fermi National Accelerator Laboratory)
- Elham E Khoda (University of Washington)
- Kevin J Pedro (Fermi National Accelerator Laboratory)
- Dylan S Rankin (University of Pennsylvania)
- Andrew Naylor (Lawrence Berkeley National Laboratory)