Compression of deep neural networks

Author/Creator ORCID

Date

2021-01-01

Department

Computer Science and Electrical Engineering

Program

Engineering, Electrical

Citation of Original Publication

Rights

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.

Abstract

Compression of deep neural networks aims at finding a sparse network that performs as well as a dense network but with significantly fewer parameters. This compression is called pruning. As a result of pruning, energy consumption is reduced, hardware requirements are relaxed, and responses to queries become faster. The pruning problem yields a constrained, stochastic, nonconvex, and non-differentiable optimization problem of very large size. All these barriers can be bypassed by solving an approximate problem. To do so, we present the "Amenable Sparse Network Investigator" (ASNI) algorithm, which utilizes a novel pruning strategy based on a sigmoid function that induces the sparsity level globally over the course of a single round of training. The ASNI algorithm accomplishes both tasks, whereas current state-of-the-art strategies can accomplish only one of them. This algorithm has two subalgorithms: 1) ASNI-I and 2) ASNI-II. The first subalgorithm learns an accurate sparse off-the-shelf network in only a single round of training. ASNI-II learns a sparse network and a quantized, compressed initialization from which the sparse network is trainable.
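
As a rough illustration of the kind of sigmoid-shaped global sparsity schedule described in the abstract (not the authors' exact ASNI procedure), the sketch below ramps a target sparsity level with a sigmoid over training progress and applies global magnitude pruning across all layers. The parameter names final_sparsity, steepness, and midpoint are illustrative assumptions.

    # Minimal sketch, assuming a sigmoid schedule for the global sparsity level
    # combined with global magnitude pruning; parameter names are hypothetical.
    import numpy as np

    def sigmoid_sparsity(epoch, total_epochs, final_sparsity=0.9,
                         steepness=10.0, midpoint=0.5):
        """Target fraction of zeroed parameters at a given epoch."""
        t = epoch / total_epochs                      # training progress in [0, 1]
        return final_sparsity / (1.0 + np.exp(-steepness * (t - midpoint)))

    def global_magnitude_mask(weights, sparsity):
        """Zero out the smallest-magnitude weights across all layers at once."""
        flat = np.concatenate([w.ravel() for w in weights])
        k = int(sparsity * flat.size)
        if k == 0:
            return [np.ones_like(w) for w in weights]
        threshold = np.partition(np.abs(flat), k - 1)[k - 1]
        return [(np.abs(w) > threshold).astype(w.dtype) for w in weights]

    # Toy example: prune a two-layer parameter set over one round of 10 "epochs".
    rng = np.random.default_rng(0)
    weights = [rng.standard_normal((4, 4)), rng.standard_normal((4, 2))]
    for epoch in range(10):
        s = sigmoid_sparsity(epoch, total_epochs=10)
        masks = global_magnitude_mask(weights, s)
        weights = [w * m for w, m in zip(weights, masks)]
        # ... one epoch of training on the masked (sparse) network would go here

Because the sigmoid rises slowly at first and saturates near the end, most parameters are removed in the middle of the round, leaving the final epochs to fine-tune an already sparse network.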