Dec 8 – 10, 2019
Monona Terrace Convention Center
America/Chicago timezone

Accelerated machine learning inference as a service for particle physics computing

Dec 8, 2019, 5:35 PM
25m
Hall of Ideas H (Monona Terrace Convention Center), Madison, Wisconsin

Speaker

Nhan Tran (Fermilab)

Description

Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as GPUs and Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that accelerating machine learning inference as a web service is a heterogeneous computing solution for particle physics experiments that requires minimal modification to the current computing model. As an example, we use the ResNet50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC. We use heterogeneous hardware on-premises and in the cloud, including GPUs on Google Cloud and AWS and FPGAs on Microsoft Azure, to accelerate inference by more than an order of magnitude over the CPUs in our current experimental infrastructure. Deployed as an edge or cloud service within the particle physics computing model, coprocessor accelerators can sustain a higher duty cycle and are potentially much more cost-effective.
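The duty-cycle argument in the abstract can be illustrated with a back-of-the-envelope estimate. The sketch below is illustrative only; the function and all numbers are hypothetical and not taken from the talk. The idea: a coprocessor dedicated to one CPU job sits mostly idle between inference calls, while one shared as a service by many clients stays busy.

```python
def gpu_duty_cycle(n_clients, request_rate_hz, inference_time_s):
    """Fraction of time a shared accelerator is busy when n_clients
    each send request_rate_hz inference requests per second, with each
    request occupying the accelerator for inference_time_s seconds.
    (Hypothetical model: ignores batching and queuing effects.)"""
    return min(1.0, n_clients * request_rate_hz * inference_time_s)

# One CPU job with a dedicated accelerator: 10 requests/s at 2 ms each
# keeps the device busy only 2% of the time.
single = gpu_duty_cycle(1, 10, 0.002)

# Forty jobs sharing the same accelerator as a service: 80% duty cycle,
# so one device replaces many mostly-idle ones.
shared = gpu_duty_cycle(40, 10, 0.002)
```

Under these assumed numbers, serving many clients from one accelerator raises its utilization by the same factor as the number of clients, which is the basis of the cost-effectiveness claim.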

Primary author

Nhan Tran (Fermilab)
