Training Back Propagation Neural Networks in MapReduce on High-Dimensional Big Datasets With Global Evolution

Chen, Wanghu; Li, Jing; Li, Xintian; Zhang, Lizhi; Wang, Jianwu

dc.contributor.author	Chen, Wanghu
dc.contributor.author	Li, Jing
dc.contributor.author	Li, Xintian
dc.contributor.author	Zhang, Lizhi
dc.contributor.author	Wang, Jianwu
dc.date.accessioned	2024-02-12T22:02:34Z
dc.date.available	2024-02-12T22:02:34Z
dc.date.issued	2019-11-04
dc.description.abstract	Owing to its scalability and high fault tolerance, even in a distributed environment built from personal computers, MapReduce has been used to parallelise the training of Back Propagation Neural Networks (BPNNs) on high-dimensional big datasets. This paper proposes a novel approach to the distributed data-parallel training of BPNNs in MapReduce, based on the evolution of local BPNNs produced by distributed Map tasks on different data splits. The approach provides a reasonable way to obtain globally convergent BPNN candidates from local BPNNs that have converged only on their own data splits. It not only reduces the number of iterations needed to reach a globally convergent BPNN, but also helps the training avoid becoming trapped in a local optimum on high-dimensional big datasets. To improve training efficiency further, local BPNNs from the same computing node are merged by averaging their weight matrices before they act as individuals of the population for the global evolution. The approach also leverages Random Project based sampling techniques to evaluate the fitness of each individual, which lowers the computation cost of the evolution stage. Experiments show that the proposed approach substantially improves training efficiency compared with stand-alone or traditional MapReduce BPNN training, and improves model accuracy on larger datasets. A comparison with 23 other popular classification approaches also shows that the proposed approach has a clear advantage in accuracy.
dc.description.sponsorship	This work was supported by the National Natural Science Foundation of China under Grant 61967013 and Grant 61462076.
dc.description.uri	https://ieeexplore.ieee.org/document/8890639
dc.format.extent	13 pages
dc.genre	journal articles
dc.identifier.citation	W. Chen, J. Li, X. Li, L. Zhang and J. Wang, "Training Back Propagation Neural Networks in MapReduce on High-Dimensional Big Datasets With Global Evolution," in IEEE Access, vol. 7, pp. 159855-159867, 2019, doi: 10.1109/ACCESS.2019.2951189.
dc.identifier.uri	https://doi.org/10.1109/ACCESS.2019.2951189
dc.identifier.uri	http://hdl.handle.net/11603/31606
dc.language.iso	en
dc.publisher	IEEE
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Faculty Collection
dc.relation.ispartof	UMBC Center for Accelerated Real Time Analysis
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof	UMBC Data Science
dc.relation.ispartof	UMBC Joint Center for Earth Systems Technology (JCET)
dc.relation.ispartof	UMBC Center for Real-time Distributed Sensing and Autonomy
dc.relation.ispartof	UMBC Information Systems Department Collection
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights	Attribution 4.0 International (CC BY 4.0 DEED)	en
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	UMBC Big Data Analytics Lab
dc.title	Training Back Propagation Neural Networks in MapReduce on High-Dimensional Big Datasets With Global Evolution
dc.type	Text
dcterms.creator	https://orcid.org/0000-0002-9933-1170
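
Illustrative note (not part of the deposited record): the abstract states that local BPNNs from the same computing node are merged by averaging their weight matrices before entering the global evolution. A minimal sketch of that single step follows, assuming every local network has the same layer shapes and is stored as a list of per-layer weight matrices. The function names and the nested-list representation are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]   # one layer's weights, row-major
Network = List[Matrix]       # hypothetical representation: one matrix per layer


def average_weight_matrices(matrices: List[Matrix]) -> Matrix:
    """Return the element-wise mean of same-shaped weight matrices."""
    if not matrices:
        raise ValueError("need at least one weight matrix")
    rows, cols = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all weight matrices must share one shape")
    n = len(matrices)
    return [
        [sum(m[i][j] for m in matrices) / n for j in range(cols)]
        for i in range(rows)
    ]


def merge_local_networks(networks: List[Network]) -> Network:
    """Merge local networks from one node by averaging them layer by layer."""
    if not networks:
        raise ValueError("need at least one local network")
    depth = len(networks[0])
    if any(len(net) != depth for net in networks):
        raise ValueError("all local networks must have the same number of layers")
    return [
        average_weight_matrices([net[k] for net in networks])
        for k in range(depth)
    ]


# Two toy local networks with a 2x2 hidden layer and a 2x1 output layer.
net_a = [[[1.0, 2.0], [3.0, 4.0]], [[0.0], [2.0]]]
net_b = [[[3.0, 4.0], [5.0, 6.0]], [[2.0], [4.0]]]
merged = merge_local_networks([net_a, net_b])
```

The merged network would then be one individual in the population; how the paper selects, evolves, and scores individuals is not described in this record and is not sketched here.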

Files

Original bundle

Name: Training_Back_Propagation_Neural_Networks_in_MapReduce_on_High-Dimensional_Big_Datasets_With_Global_Evolution.pdf
Size: 5.28 MB
Format: Adobe Portable Document Format

Collections

UMBC Information Systems Department
UMBC Center for Accelerated Real Time Analysis
UMBC Center for Real-time Distributed Sensing and Autonomy
UMBC Computer Science and Electrical Engineering Department
UMBC Data Science
UMBC Faculty Collection