Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models

dc.contributor.author: Tyagi, Nancy
dc.contributor.author: Shiri, Aidin
dc.contributor.author: Sarkar, Surjodeep
dc.contributor.author: Umrawal, Abhishek Kumar
dc.contributor.author: Gaur, Manas
dc.date.accessioned: 2023-09-22T18:19:43Z
dc.date.available: 2023-09-22T18:19:43Z
dc.date.issued: 2023-08-23
dc.description: 10th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2023); Arlington, Virginia; April 22, 2023
dc.description.abstract: Foundational Language Models (FLMs) have advanced natural language processing (NLP) research. Current researchers are developing larger FLMs (e.g., XLNet, T5) to enable contextualized language representation, classification, and generation. While developing larger FLMs has been of significant advantage, it is also a liability concerning hallucination and predictive uncertainty. Fundamentally, larger FLMs are built on the same foundations as smaller FLMs (e.g., BERT); hence, one must recognize the potential of smaller FLMs, which can be realized through an ensemble. In the current research, we perform a reality check on FLMs and their ensembles on benchmark and real-world datasets. We hypothesize that ensembling FLMs can influence the individualistic attention of each FLM and unravel the strength of coordination and cooperation among different FLMs. We utilize BERT and define three ensemble techniques: {Shallow, Semi, and Deep}, wherein the Deep-Ensemble introduces a knowledge-guided reinforcement learning approach. We discovered that the proposed Deep-Ensemble BERT outperforms its larger variant, i.e., BERT-large, by a significant margin on datasets that demonstrate the usefulness of NLP in sensitive fields, such as mental health.
dc.description.uri: https://arxiv.org/abs/2308.12272
dc.format.extent: 6 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2uyka-aqjp
dc.identifier.uri: https://doi.org/10.48550/arXiv.2308.12272
dc.identifier.uri: http://hdl.handle.net/11603/29843
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title: Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models
dc.type: Text
dcterms.creator: https://orcid.org/0009-0006-9813-9351
dcterms.creator: https://orcid.org/0000-0001-5402-0988
dcterms.creator: https://orcid.org/0000-0002-0147-2777
dcterms.creator: https://orcid.org/0000-0003-4460-7499
dcterms.creator: https://orcid.org/0000-0002-5411-2230
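
The abstract mentions three ensemble techniques (Shallow, Semi, and Deep). As a rough illustration only — the paper's actual implementation is not reproduced here — a shallow ensemble can be sketched as averaging the class-probability outputs of several fine-tuned models and predicting the highest-probability class. The function name and all model outputs below are hypothetical placeholders.

```python
# Illustrative sketch of a shallow ensemble: average the per-class
# probabilities produced by several models, then take the argmax.
# This is an assumption about "shallow" ensembling, not the paper's code.

def shallow_ensemble(prob_lists):
    """Average per-class probabilities across models; return (label, averages)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Hypothetical softmax outputs from three BERT-style classifiers on one input:
outputs = [
    [0.6, 0.4],  # model A
    [0.3, 0.7],  # model B
    [0.2, 0.8],  # model C
]
label, avg_probs = shallow_ensemble(outputs)
print(label)  # the averaged probabilities decide the final label
```

Averaging smooths out any single model's overconfident prediction, which is one way an ensemble can reduce the predictive uncertainty the abstract highlights.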

Files

Original bundle

Name: 2308.12272.pdf
Size: 402.68 KB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon at submission