Optimizing Quantized Convolutional Neural Networks on FPGAs

ORAL

Abstract

The goal of this project is to improve compatibility between hls4ml and FINN, two frameworks used to deploy machine learning on FPGAs. One approach is to create a common intermediate representation (IR), based on ONNX, for reduced-precision, or quantized, neural networks (QNNs) targeting FPGAs.[2] A shared IR would merge the practical side of writing out ONNX models in hls4ml with FINN's machinery for implementing them in hardware. A deep neural network could then be implemented as a mix of hls4ml and FINN components, which come together as a block design in the Vivado IP Integrator. Having the two frameworks interoperate in this way offers a bridge for anyone switching from hls4ml to FINN or vice versa, and gives each framework greater flexibility, since both can draw on a shared library of trained QNNs. New users would also find it easier to translate their models into FPGA firmware and to move between the hls4ml and FINN libraries.
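As a rough illustration of the intended workflow, the sketch below exports a small quantized model to the shared ONNX-based IR so that either framework could pick it up downstream. It assumes the Brevitas quantization library; the layer sizes, file name, and the converter entry points named in the trailing comments are illustrative assumptions, not code from the project itself.

    # Minimal sketch, assuming the Brevitas quantization-aware training
    # library (brevitas); all names and parameters are illustrative.
    import torch
    import brevitas.nn as qnn
    from brevitas.export import export_qonnx

    # A toy quantized convolutional model with 4-bit weights/activations.
    model = torch.nn.Sequential(
        qnn.QuantConv2d(3, 8, kernel_size=3, weight_bit_width=4),
        qnn.QuantReLU(bit_width=4),
    )

    # Export once to the ONNX-based IR that both frameworks would share.
    export_qonnx(model, args=torch.randn(1, 3, 32, 32), export_path="qnn.onnx")

    # Either backend could then consume the same file, e.g. (hypothetical
    # hand-off, shown only to indicate the intended flow):
    #   hls4ml: hls4ml.converters.convert_from_onnx_model(...)
    #   FINN:   ModelWrapper("qnn.onnx") via the qonnx package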

Presenters

  • Tai Nguyen

    University of California, San Diego

Authors

  • Tai Nguyen

    University of California, San Diego

  • Javier M Duarte

    University of California, San Diego