Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models

dc.contributor.author: Tyagi, Nancy
dc.contributor.author: Shiri, Aidin
dc.contributor.author: Sarkar, Surjodeep
dc.contributor.author: Umrawal, Abhishek Kumar
dc.contributor.author: Gaur, Manas
dc.date.accessioned: 2023-09-22T18:19:43Z
dc.date.available: 2023-09-22T18:19:43Z
dc.date.issued: 2023-08-23
dc.description: 10th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2023); Arlington, Virginia; April 22, 2023
dc.description.abstract: Foundational Language Models (FLMs) have advanced natural language processing (NLP) research. Current researchers are developing larger FLMs (e.g., XLNet, T5) to enable contextualized language representation, classification, and generation. While developing larger FLMs has been of significant advantage, it is also a liability concerning hallucination and predictive uncertainty. Fundamentally, larger FLMs are built on the same foundations as smaller FLMs (e.g., BERT); hence, one must recognize the potential of smaller FLMs, which can be realized through an ensemble. In the current research, we perform a reality check on FLMs and their ensembles on benchmark and real-world datasets. We hypothesize that ensembling FLMs can influence the individualistic attention of each FLM and unravel the strength of coordination and cooperation among different FLMs. We utilize BERT and define three ensemble techniques: Shallow, Semi, and Deep, wherein the Deep-Ensemble introduces a knowledge-guided reinforcement learning approach. We find that the proposed Deep-Ensemble BERT outperforms its larger variant, BERT-large, by a large factor on datasets that demonstrate the usefulness of NLP in sensitive fields such as mental health.
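As a rough illustration of the Shallow-Ensemble idea described in the abstract, the sketch below averages the class-probability outputs of several independently loaded BERT classifiers. The checkpoint names, two-label setup, and averaging rule are illustrative assumptions, not the authors' exact method (their Deep-Ensemble additionally uses knowledge-guided reinforcement learning, which is not shown here).

# A minimal sketch of a shallow ensemble: average the predicted class
# probabilities of K independently loaded BERT classifiers.
# NOTE: checkpoint names and num_labels are hypothetical placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAMES = ["bert-base-uncased", "bert-base-cased"]  # hypothetical members

members = [
    (AutoTokenizer.from_pretrained(name),
     AutoModelForSequenceClassification.from_pretrained(name, num_labels=2))
    for name in MODEL_NAMES
]

@torch.no_grad()
def ensemble_predict(text: str) -> torch.Tensor:
    """Return the mean class-probability distribution over all members."""
    probs = []
    for tokenizer, model in members:
        model.eval()
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        probs.append(torch.softmax(model(**inputs).logits, dim=-1))
    return torch.stack(probs).mean(dim=0)  # shape: (1, num_labels)

print(ensemble_predict("I have been feeling anxious lately."))

Averaging probabilities rather than hard labels keeps the combination smooth and tends to reduce the predictive uncertainty of any single member, which is the liability the abstract raises for large FLMs.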
dc.description.uri: https://arxiv.org/abs/2308.12272
dc.format.extent: 6 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2uyka-aqjp
dc.identifier.uri: https://doi.org/10.48550/arXiv.2308.12272
dc.identifier.uri: http://hdl.handle.net/11603/29843
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title: Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models
dc.type: Text
dcterms.creator: https://orcid.org/0009-0006-9813-9351
dcterms.creator: https://orcid.org/0000-0001-5402-0988
dcterms.creator: https://orcid.org/0000-0002-0147-2777
dcterms.creator: https://orcid.org/0000-0003-4460-7499
dcterms.creator: https://orcid.org/0000-0002-5411-2230

Files

Original bundle

Name: 2308.12272.pdf
Size: 402.68 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission