Matrix-Based Representations and Gradient-Free Algorithms for Neural Network Training

dc.contributor.author: Rozario, Turibius
dc.contributor.author: Oveissi, Parham
dc.contributor.author: Goel, Ankit
dc.date.accessioned: 2024-09-24T08:59:33Z
dc.date.available: 2024-09-24T08:59:33Z
dc.description: 23rd IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, Florida, Dec. 18-20, 2024
dc.description.abstract: This paper presents a compact, matrix-based representation of neural networks. Although neural networks are often understood pictorially as interconnected neurons, they are fundamentally mathematical nonlinear functions constructed by composing several vector-valued functions. Using basic results from linear algebra, we represent neural networks as an alternating sequence of linear maps and scalar nonlinear functions, known as activation functions. The training of neural networks involves minimizing a cost function, which typically requires the computation of a gradient. By applying basic multivariable calculus, we show that the cost gradient is also a function composed of a sequence of linear maps and nonlinear functions. In addition to the analytical gradient computation, we explore two gradient-free training methods. We compare these three training methods in terms of convergence rate and prediction accuracy, demonstrating the potential advantages of gradient-free approaches.
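The abstract's two central ideas — a network as an alternating sequence of linear maps and scalar activations, and training without gradients — can be sketched in a few lines of NumPy. This is an illustrative sketch only: the layer sizes, the tanh activation, and the simple accept-if-better random-search update below are assumptions for demonstration, not the specific algorithms evaluated in the paper.

```python
import numpy as np

def forward(x, weights, biases, act=np.tanh):
    """Evaluate the network as an alternating sequence of
    linear maps (W @ a + b) and a scalar activation function."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = act(W @ a + b)      # linear map, then elementwise nonlinearity
    return weights[-1] @ a + biases[-1]  # linear output layer

def mse(params, X, Y):
    """Cost function to be minimized during training."""
    weights, biases = params
    preds = np.stack([forward(x, weights, biases) for x in X])
    return float(np.mean((preds - Y) ** 2))

def random_search_step(params, X, Y, sigma=0.05, rng=None):
    """One gradient-free update: perturb all parameters with Gaussian
    noise and keep the perturbation only if it lowers the cost."""
    rng = np.random.default_rng() if rng is None else rng
    weights, biases = params
    cand_w = [W + sigma * rng.standard_normal(W.shape) for W in weights]
    cand_b = [b + sigma * rng.standard_normal(b.shape) for b in biases]
    if mse((cand_w, cand_b), X, Y) < mse(params, X, Y):
        return cand_w, cand_b
    return params

# Tiny demo: fit y = sin(pi * x) with a 1-8-1 network.
rng = np.random.default_rng(0)
sizes = [1, 8, 1]
weights = [0.5 * rng.standard_normal((m, n)) for n, m in zip(sizes, sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
X = np.linspace(-1.0, 1.0, 16).reshape(-1, 1)
Y = np.sin(np.pi * X)

params = (weights, biases)
before = mse(params, X, Y)
for _ in range(300):
    params = random_search_step(params, X, Y, rng=rng)
after = mse(params, X, Y)
```

Because each step only accepts a candidate that strictly lowers the cost, the cost is non-increasing, which is the basic convergence property one compares against gradient-based training.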
dc.format.extent: 8 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2kfsh-kugr
dc.identifier.uri: http://hdl.handle.net/11603/36341
dc.language.iso: en_US
dc.publisher: IEEE
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Mechanical Engineering Department
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Meyerhoff Scholars Program
dc.rights: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
dc.title: Matrix-Based Representations and Gradient-Free Algorithms for Neural Network Training
dc.type: Text
dcterms.creator: https://orcid.org/0000-0001-9326-0319
dcterms.creator: https://orcid.org/0000-0002-4146-6275

Files

Original bundle

Name: 2024_09_ICMLA_GradientFree_NN_training_Parham_ICMLA.pdf
Size: 573.07 KB
Format: Adobe Portable Document Format