A Fast Method to Fine-Tune Neural Networks for the Least Energy Consumption on FPGAs
Date
2021
Citation of Original Publication
Hosseini, Morteza et al.; A Fast Method to Fine-Tune Neural Networks for the Least Energy Consumption on FPGAs; HAET workshop of ICLR 2021; https://eehpc.csee.umbc.edu/publications/pdf/2021/A_fast_method.pdf
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Abstract
Because of their simple hardware requirements, low-bitwidth neural networks (NNs) have gained significant attention in recent years and have been extensively employed in electronic devices that demand efficiency and performance. Research has shown that scaled-up low-bitwidth NNs can achieve accuracy on par with their full-precision counterparts. As a result, there is a trade-off between quantization (q) and scaling (s) of NNs to maintain accuracy. In this paper, we propose QS-NAS, a systematic approach to explore the best quantization and scaling factors for an NN architecture that satisfy a targeted accuracy level and result in the least energy consumption per inference when deployed to hardware (an FPGA in this work). Compared to work in the literature using the same VGG-like NN with different q and s over the same datasets, our selected optimal NNs, deployed to a low-cost, tiny Xilinx FPGA on the ZedBoard, achieved accuracy higher than or on par with that of the related work, while dissipating the least power and delivering the most inferences per Joule.
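
The abstract describes selecting the quantization/scaling pair that meets an accuracy target with the least energy per inference. Below is a minimal, hypothetical Python sketch of that kind of selection over a grid of (q, s) candidates; the function names (evaluate_accuracy, estimate_energy) and the toy models in the usage example are assumptions for illustration, not the paper's implementation.

from itertools import product

def select_config(q_options, s_options, evaluate_accuracy, estimate_energy,
                  accuracy_target):
    """Return the (q, s, energy) triple meeting accuracy_target with minimum energy."""
    best = None
    for q, s in product(q_options, s_options):
        acc = evaluate_accuracy(q, s)      # e.g., fine-tune and validate the NN at (q, s)
        if acc < accuracy_target:
            continue                        # discard configurations below the accuracy target
        energy = estimate_energy(q, s)      # e.g., an energy-per-inference model for the FPGA
        if best is None or energy < best[2]:
            best = (q, s, energy)
    return best

# Example usage with toy stand-in accuracy and energy models:
if __name__ == "__main__":
    cfg = select_config(
        q_options=[1, 2, 4, 8],                                # candidate bitwidths
        s_options=[0.5, 1.0, 2.0],                             # candidate width-scaling factors
        evaluate_accuracy=lambda q, s: 0.80 + 0.02 * q * s,    # toy accuracy model
        estimate_energy=lambda q, s: q * s,                    # toy energy model
        accuracy_target=0.90,
    )
    print(cfg)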