Matrix-Based Representations and Gradient-Free Algorithms for Neural Network Training

dc.contributor.author: Rozario, Turibius
dc.contributor.author: Oveissi, Parham
dc.contributor.author: Goel, Ankit
dc.date.accessioned: 2024-09-24T08:59:33Z
dc.date.available: 2024-09-24T08:59:33Z
dc.description: 23rd IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, Florida, Dec. 18-20, 2024
dc.description.abstract: This paper presents a compact, matrix-based representation of neural networks. Although neural networks are often understood pictorially as interconnected neurons, they are fundamentally mathematical nonlinear functions constructed by composing several vector-valued functions. Using basic results from linear algebra, we represent neural networks as an alternating sequence of linear maps and scalar nonlinear functions, known as activation functions. The training of neural networks involves minimizing a cost function, which typically requires the computation of a gradient. By applying basic multivariable calculus, we show that the cost gradient is also a function composed of a sequence of linear maps and nonlinear functions. In addition to the analytical gradient computation, we explore two gradient-free training methods. We compare these three training methods in terms of convergence rate and prediction accuracy, demonstrating the potential advantages of gradient-free approaches.
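The abstract's two central ideas — a network as an alternating sequence of linear maps and scalar activations, and training without gradients — can be sketched in a few lines of NumPy. This is an illustrative sketch only: the layer sizes, the tanh activation, and the simple accept-if-better random-search update below are assumptions for demonstration, not the specific algorithms evaluated in the paper.

```python
import numpy as np

def forward(x, weights, biases, act=np.tanh):
    """Evaluate the network as an alternating sequence of
    linear maps (W @ a + b) and a scalar activation function."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = act(W @ a + b)      # linear map, then elementwise nonlinearity
    return weights[-1] @ a + biases[-1]  # linear output layer

def mse(params, X, Y):
    """Cost function to be minimized during training."""
    weights, biases = params
    preds = np.stack([forward(x, weights, biases) for x in X])
    return float(np.mean((preds - Y) ** 2))

def random_search_step(params, X, Y, sigma=0.05, rng=None):
    """One gradient-free update: perturb all parameters with Gaussian
    noise and keep the perturbation only if it lowers the cost."""
    rng = np.random.default_rng() if rng is None else rng
    weights, biases = params
    cand_w = [W + sigma * rng.standard_normal(W.shape) for W in weights]
    cand_b = [b + sigma * rng.standard_normal(b.shape) for b in biases]
    if mse((cand_w, cand_b), X, Y) < mse(params, X, Y):
        return cand_w, cand_b
    return params

# Tiny demo: fit y = sin(pi * x) with a 1-8-1 network.
rng = np.random.default_rng(0)
sizes = [1, 8, 1]
weights = [0.5 * rng.standard_normal((m, n)) for n, m in zip(sizes, sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
X = np.linspace(-1.0, 1.0, 16).reshape(-1, 1)
Y = np.sin(np.pi * X)

params = (weights, biases)
before = mse(params, X, Y)
for _ in range(300):
    params = random_search_step(params, X, Y, rng=rng)
after = mse(params, X, Y)
```

Because each step only accepts a candidate that strictly lowers the cost, the cost is non-increasing, which is the basic convergence property one compares against gradient-based training.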
dc.format.extent: 8 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2kfsh-kugr
dc.identifier.uri: http://hdl.handle.net/11603/36341
dc.language.iso: en_US
dc.publisher: IEEE
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Mechanical Engineering Department
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Meyerhoff Scholars Program
dc.rights: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
dc.title: Matrix-Based Representations and Gradient-Free Algorithms for Neural Network Training
dc.type: Text
dcterms.creator: https://orcid.org/0000-0001-9326-0319
dcterms.creator: https://orcid.org/0000-0002-4146-6275

Files

Original bundle

Name: 2024_09_ICMLA_GradientFree_NN_training_Parham_ICMLA.pdf
Size: 573.07 KB
Format: Adobe Portable Document Format