Tracking as-a-service Tutorial at US-ATLAS Conference

By: Miles Cochran-Branson
July 26, 2024

Following the three-day US-ATLAS conference held at the University of Washington (UW) Seattle campus, students met for a brief tutorial on tools available to US-ATLAS members. After introductions to computing resources, Yuan-Tang Chou, a postdoc at UW and member of the A3D3 team, gave a presentation on GPU resources and their utilization, with a focus on new applications to particle-physics workflows.

Chou gives a brief presentation on applications of the NVIDIA Triton server for deploying models as-a-service.

The presentation focused on deploying models as-a-service (aaS) on accelerators such as GPUs using the NVIDIA Triton Inference Server. Chou discussed the merits of heterogeneous computing, where the most straightforward way to deploy algorithms is to connect the CPU and GPU directly on a single node. He noted that for many physics tasks such as graph neural network (GNN)-based tracking, flavor tagging, and detector simulation, heterogeneous computing in this direct-connection form could be “inefficient and very expensive to scale.” Hence, offloading expensive tasks to a GPU server could streamline the deployment of large models important for ATLAS physics.

Sample architecture of as-a-service model deployment from CPU-only client nodes or CPU / GPU client nodes to a GPU server.

After motivating why deploying models as-a-service could be beneficial in physics analysis, Chou gave a brief demo on deploying GNN-based tracking aaS. This was followed by a hands-on tutorial on deploying the resnet50 image recognition deep neural network as-a-service on computing resources at CERN. The tutorial material focused on building the proper model repository structure and configuration for image classification on a GPU server. Students set up a work environment, deployed a backend on the server, and sent an image to the server to be classified.
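As a rough illustration of the model repository structure mentioned above, the following sketch creates the directory layout that the Triton Inference Server expects: one directory per model, a `config.pbtxt` file, and a numbered version subdirectory that holds the model file. The backend (`onnxruntime`), the tensor names (`input`, `output`), the tensor shapes, and the batch size are assumptions chosen for a typical ONNX export of ResNet50; they are not taken from the tutorial material and would need to match the actual model.

```python
from pathlib import Path
import tempfile

# Hypothetical configuration: the backend, tensor names, shapes, and batch size
# are assumptions for an ONNX export of ResNet50, not the tutorial's own values.
CONFIG_TEMPLATE = """\
name: "MODEL_NAME"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
    count: 1
  }
]
"""


def build_model_repository(root, model_name="resnet50", version=1):
    """Create <root>/<model_name>/config.pbtxt and an empty version directory."""
    model_dir = Path(root) / model_name
    version_dir = model_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    config_path = model_dir / "config.pbtxt"
    config_path.write_text(CONFIG_TEMPLATE.replace("MODEL_NAME", model_name))
    return model_dir


if __name__ == "__main__":
    # Build the layout in a temporary directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = build_model_repository(Path(tmp) / "model_repository")
        for path in sorted(model_dir.rglob("*")):
            print(path.relative_to(tmp))
```

In a real deployment, the model file itself (for example a hypothetical `model.onnx`) would be copied into the version directory, and the server would be started with `tritonserver --model-repository=<path>` pointing at the repository root.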

Students work on deploying a backend on a server and sending images to the server to be classified. 

By the end of the tutorial, students had successfully deployed a backend and received image classifications back from the image recognition model. Interested students were connected with experts currently developing aaS tools for ATLAS algorithms.
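For readers curious what sending an image to the server can look like on the client side, here is a minimal sketch that builds a request body for Triton's HTTP inference endpoint, which follows the KServe v2 protocol. It uses only the Python standard library and a placeholder all-zero tensor rather than a real preprocessed image. The host, the tensor names `input` and `output`, and the input shape are assumptions for illustration, and the tutorial itself may use a dedicated client library instead.

```python
import json
import urllib.request


def build_infer_request(pixels, shape, input_name="input",
                        output_name="output", top_k=3):
    """Build a KServe v2 JSON request body for a single FP32 input tensor.

    The tensor names are hypothetical and must match the model configuration.
    """
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": "FP32",
                "data": pixels,  # flattened, row-major tensor values
            }
        ],
        "outputs": [
            # Triton's classification extension: request the top_k classes.
            {"name": output_name, "parameters": {"classification": top_k}}
        ],
    }


def infer_url(host, model_name):
    """Return the HTTP inference endpoint for a model."""
    return f"http://{host}/v2/models/{model_name}/infer"


def send_request(host, model_name, payload):
    """POST the payload to a running server and return the parsed JSON reply."""
    request = urllib.request.Request(
        infer_url(host, model_name),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Placeholder input: a batch of one 3x224x224 image filled with zeros.
shape = (1, 3, 224, 224)
pixels = [0.0] * (1 * 3 * 224 * 224)
payload = build_infer_request(pixels, shape)
print(infer_url("localhost:8000", "resnet50"))
```

Calling `send_request("localhost:8000", "resnet50", payload)` would only succeed against a running server with a matching model loaded; port 8000 is Triton's default HTTP port.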

Tutorial Resources: https://hrzhao76.github.io/AthenaTriton/triton-lxplusGPU.html
Developed by Yuan-Tang Chou, Miles Cochran-Branson (UW), Xiangyang Ju (LBNL), and Haoran Zhao (UW)

Written by Miles Cochran-Branson, PhD student at University of Washington