Jennifer Ngadiuba

Fermi National Accelerator Laboratory

[intermediate] Ultra Low-latency and Low-area Machine Learning Inference at the Edge

Summary

With the rise of edge computing, real-time inference of deep neural networks (DNNs) on custom hardware has become increasingly relevant. Smartphone companies are incorporating artificial intelligence (AI) chips into their designs for on-device inference to improve user experience and tighten data security, and the autonomous vehicle industry is turning to application-specific integrated circuits (ASICs) to keep latency low. While the typical acceptable latency for real-time inference in such applications is O(1) ms, other applications require sub-microsecond inference. For instance, high-frequency trading machine learning (ML) algorithms run on field-programmable gate arrays (FPGAs), reconfigurable hardware devices, to make decisions within nanoseconds. At the extreme end of the inference spectrum, combining the low-latency demands of high-frequency trading with the limited-area constraints of smartphone applications, is the processing of data from proton-proton collisions at the Large Hadron Collider (LHC) at CERN. Here, latencies of O(1) microsecond are required and resources are strictly limited. In this lecture I will discuss how ML on FPGAs can improve the event selection process in particle detectors at the LHC, discuss and demonstrate how to reduce the memory footprint of ML models using state-of-the-art techniques such as model pruning and quantization, and demonstrate how to design and deploy a fast deep neural network on an FPGA using the hls4ml library. The classes will feature both a theoretical session and a hands-on practical session.
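
As a rough illustration of the two compression techniques named above, the following minimal sketch prunes a small list of weights by magnitude and then rounds the survivors to a signed fixed-point grid. It is not taken from the lecture material or from hls4ml: the function names, the example weights, and the choice of round-to-nearest with saturation are all assumptions made for illustration. The bit-width convention (total bits, integer bits including the sign bit) is chosen to resemble the fixed-point types commonly used in FPGA high-level synthesis.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Hypothetical helper: a one-shot version of magnitude pruning, without
    the retraining that would normally follow in practice.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_fixed_point(x, total_bits, int_bits):
    """Round `x` to a signed fixed-point value and saturate on overflow.

    Assumption: `int_bits` counts the sign bit, so the number of
    fractional bits is `total_bits - int_bits`.
    """
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    lowest = -(2 ** (total_bits - 1))      # most negative integer code
    highest = 2 ** (total_bits - 1) - 1    # most positive integer code
    code = round(x * scale)                # nearest representable step
    code = max(lowest, min(highest, code)) # saturate instead of wrapping
    return code / scale


# Made-up example weights for one small layer.
weights = [0.8, -0.05, 0.31, 0.02, -1.7, 0.11]

# Remove the 50% of weights with the smallest magnitude.
pruned = prune_by_magnitude(weights, sparsity=0.5)

# Store the remaining weights with 6 bits in total, 2 of them integer bits.
quantized = [quantize_fixed_point(w, total_bits=6, int_bits=2) for w in pruned]

print(pruned)
print(quantized)
```

Pruned weights need no multiplier at all, and narrower fixed-point values need smaller multipliers and less storage, which is why both techniques reduce the resources a network occupies. Quantization-aware training, listed in the syllabus, applies this kind of rounding during training rather than afterwards.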

Syllabus

Introduction to real-time AI at the edge for particle physics
Introduction to Field-Programmable Gate Arrays
Neural network pruning and quantization-aware training
Translation of NN inference to FPGA firmware and its synthesis with the hls4ml tool

References

[1] Duarte, J., Han, S., Harris, P., Jindariani, S., Kreinar, E., Kreis, B., Ngadiuba, J., Pierini, M., Rivera, R., Tran, N., et al. (2018). “Fast inference of deep neural networks in FPGAs for particle physics,” JINST, vol. 13, no. 07, P07027. doi:10.1088/1748-0221/13/07/P07027. arXiv:1804.06913

[2] Coelho, C. N., Kuusela, A., Li, S., Zhuang, H., Ngadiuba, J., Aarrestad, T. K., Loncar, V., Pierini, M., Pol, A. A., & Summers, S. (2021). “Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors,” Nature Machine Intelligence, vol. 3. doi:10.1038/s42256-021-00356-5. arXiv:2006.10159

[3] Aarrestad, T., Loncar, V., Ghielmetti, N., Pierini, M., Summers, S., Ngadiuba, J., Petersson, C., Linander, H., Iiyama, Y., Di Guglielmo, G., et al. (2021). “Fast convolutional neural networks on FPGAs with hls4ml,” Mach. Learn. Sci. Tech., vol. 2, no. 4, p. 045015. doi:10.1088/2632-2153/ac0ea1. arXiv:2101.05108

[4] Di Guglielmo, G., Duarte, J. M., Harris, P. C., Hoang, D., Jindariani, S., Kreinar, E., Liu, M., Loncar, V., Ngadiuba, J., Pedro, K., Pierini, M., Rankin, D. S., Sagear, S., Summers, S., Tran, N., & Wu, Z. (2021). “Compressing deep neural networks on FPGAs to binary and ternary precision with hls4ml,” Mach. Learn. Sci. Tech., vol. 2, p. 015001. doi:10.1088/2632-2153/aba042. arXiv:2003.06308

Pre-requisites

Basic knowledge of deep learning and data science methods and tools.

Short bio

Dr. Jennifer Ngadiuba is an Associate Scientist with a Wilson Fellowship at the Fermi National Accelerator Laboratory, the leading facility for particle physics research in the United States. She specializes in the application of machine learning to particle physics, working towards more intelligent detector systems, data reduction, and data analysis strategies that efficiently extract the most fundamental physics information from the multitude of data collected at the Large Hadron Collider (LHC), the world’s highest-energy particle collider, located at the CERN laboratory (Switzerland-France). She co-founded the Fast Machine Learning organization, a research collective of physicists, engineers, and computer scientists interested in deploying machine learning algorithms for unique and challenging scientific applications. Dr. Ngadiuba is also applying modern anomaly detection techniques to advance the quest for elusive new physics at the LHC.
