CSCMAC - Cyclic Sparsely Connected Neural Manycore Accelerator

dc.contributor.advisor: Mohsenin, Tinoosh
dc.contributor.author: Paneliya, Hirenkumar Sumanbhai
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Engineering, Computer
dc.date.accessioned: 2021-09-01T13:55:50Z
dc.date.available: 2021-09-01T13:55:50Z
dc.date.issued: 2020-01-01
dc.description.abstract: In deep neural networks (DNNs), model size and computational complexity are two important factors that impact memory footprint and performance, respectively; both can be minimized by compressing the DNN with methods such as weight pruning or structural compression. Recent work on DNN weight pruning has shown significant reductions in model size, but at the expense of irregularity in the DNN architecture, which necessitates additional indexing memory to address the non-zero weights. Structural compression, on the other hand, requires minimal or no indexing, is on par with pruning methods in terms of DNN accuracy, and can be used as an overlay for traditional DNN layers. The recent Cyclic Sparsely Connected (CSC) layers structurally compress and sparsify DNNs, reducing the memory footprint of dense layers from O(N²) to O(N log N). In this thesis, we propose an energy-efficient, domain-specific manycore accelerator named CSCMAC - Cyclic Sparsely Connected Neural Network Manycore Accelerator, which effectively maps and executes DNNs compressed with CSC architectures. We implement a kernel-specific instruction for CSC-layer inference on a manycore platform, take advantage of the layers' cyclic architecture, and show that their implementation in software, even for a parallel-computing processor, is affordable. To further exploit their implementation simplicity, we propose customized instructions for the manycore that fuse frequently used sequences of machine code and, by means of Amdahl's law, evaluate the optimization gained by the customization. Our experimental results using LeNet-300-100 on MNIST (an image classification application) and a Multi-Layer Perceptron (MLP) on a physical activity monitoring dataset indicate that by replacing Fully-Connected (FC) layers with CSC layers, we can achieve 46x and 6x compression, respectively, within a 2% accuracy-loss margin. With only 2 mW of power overhead, a novel CSC instruction is added to the CSCMAC ISA, replacing frequently used instruction sequences that would have taken 11 clock cycles with a single clock cycle. A 64-cluster architecture of the CSCMAC is fully placed and routed in 65 nm TSMC CMOS technology. The layout of each cluster occupies an area of 0.73 mm² and consumes 230.2 mW at a 980 MHz clock frequency. Our proposed CSCMAC achieves 57% higher throughput and 56% lower energy compared to its predecessor manycore (PENC). The CSCMAC also achieves 90x higher throughput and consumes 69x lower energy compared to a CPU implementation on the NVIDIA Jetson TX2 platform. (An illustrative Amdahl's-law calculation and a sketch of CSC-style connectivity appear after the metadata fields below.)
dc.format: application/pdf
dc.genre: theses
dc.identifier: doi:10.13016/m21bb5-0wv9
dc.identifier.other: 12174
dc.identifier.uri: http://hdl.handle.net/11603/22904
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.source: Original File Name: Paneliya_umbc_0434M_12174.pdf
dc.subject: ASIC
dc.subject: CSC
dc.subject: CSCMAC
dc.subject: Manycore
dc.subject: Pruning
dc.subject: VLSI
dc.title: CSCMAC - Cyclic Sparsely Connected Neural Manycore Accelerator
dc.type: Text
dcterms.accessRights: Access limited to the UMBC community. The item may be obtainable via Interlibrary Loan through a local library, pending the author/copyright holder's permission.
dcterms.accessRights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
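For context on the Amdahl's-law evaluation mentioned in the abstract: the fused CSC instruction yields a local speedup of s = 11 (an 11-cycle instruction sequence collapsed to one cycle), and the overall gain depends on the fraction f of execution time spent in that sequence. The worked value of f below is a hypothetical placeholder, not a figure from the thesis.

    S_{\mathrm{overall}} = \frac{1}{(1 - f) + \frac{f}{s}}, \qquad s = 11.

    % Hypothetical example: if f = 0.5 of runtime lies in the fused
    % sequence, S_overall = 1 / (0.5 + 0.5/11) ≈ 1.83x.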
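Likewise, a minimal NumPy sketch of CSC-style connectivity follows. It illustrates only the general idea behind the O(N²) to O(N log N) reduction: a cascade of sparse stages with cyclic, butterfly-style strides replaces one dense layer. The fan-in of 2, the stride schedule, and the ReLU are assumptions for illustration, not the thesis's exact CSC construction or the CSCMAC kernel.

    import numpy as np

    def csc_stage(x, w, stride):
        # One cyclic sparse stage: output i is a ReLU over fan_in inputs
        # taken at cyclic offsets i, i + stride, ..., i + (fan_in-1)*stride (mod N).
        N = x.shape[0]
        fan_in = w.shape[1]
        idx = (np.arange(N)[:, None] + stride * np.arange(fan_in)[None, :]) % N
        return np.maximum(0.0, (w * x[idx]).sum(axis=1))

    def csc_block(x, stages):
        # A cascade of about log_F(N) sparse stages replaces one N x N dense
        # layer. Each stage stores N * fan_in weights, so the cascade needs
        # O(N log N) weights instead of the dense layer's O(N^2).
        for w, stride in stages:
            x = csc_stage(x, w, stride)
        return x

    # Hypothetical usage: N = 8, fan_in = 2 -> 3 stages with strides 1, 2, 4,
    # which give every output a path from every input.
    N, F = 8, 2
    rng = np.random.default_rng(0)
    stages = [(0.1 * rng.standard_normal((N, F)), F**k) for k in range(3)]
    y = csc_block(rng.standard_normal(N), stages)
    print(y.shape)  # (8,)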

Files

Original bundle

Name: Paneliya_umbc_0434M_12174.pdf
Size: 29.65 MB
Format: Adobe Portable Document Format