Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models

dc.contributor.author: Tyagi, Nancy
dc.contributor.author: Shiri, Aidin
dc.contributor.author: Sarkar, Surjodeep
dc.contributor.author: Umrawal, Abhishek Kumar
dc.contributor.author: Gaur, Manas
dc.date.accessioned: 2023-09-22T18:19:43Z
dc.date.available: 2023-09-22T18:19:43Z
dc.date.issued: 2023-08-23
dc.description: 10th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2023); Arlington, Virginia; April 22, 2023
dc.description.abstract: Foundational Language Models (FLMs) have advanced natural language processing (NLP) research. Current researchers are developing larger FLMs (e.g., XLNet, T5) to enable contextualized language representation, classification, and generation. While developing larger FLMs has been of significant advantage, it is also a liability concerning hallucination and predictive uncertainty. Fundamentally, larger FLMs are built on the same foundations as smaller FLMs (e.g., BERT); hence, one must recognize the potential of smaller FLMs, which can be realized through an ensemble. In the current research, we perform a reality check on FLMs and their ensembles on benchmark and real-world datasets. We hypothesize that ensembling FLMs can influence the individualistic attention of each FLM and unravel the strength of coordination and cooperation among different FLMs. We utilize BERT and define three ensemble techniques: {Shallow, Semi, and Deep}, wherein the Deep-Ensemble introduces a knowledge-guided reinforcement learning approach. We discovered that the proposed Deep-Ensemble BERT outperforms its larger variant, i.e., BERT-large, by a significant margin on datasets that demonstrate the usefulness of NLP in sensitive fields, such as mental health.
dc.description.uri: https://arxiv.org/abs/2308.12272
dc.format.extent: 6 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2uyka-aqjp
dc.identifier.uri: https://doi.org/10.48550/arXiv.2308.12272
dc.identifier.uri: http://hdl.handle.net/11603/29843
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title: Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models
dc.type: Text
dcterms.creator: https://orcid.org/0009-0006-9813-9351
dcterms.creator: https://orcid.org/0000-0001-5402-0988
dcterms.creator: https://orcid.org/0000-0002-0147-2777
dcterms.creator: https://orcid.org/0000-0003-4460-7499
dcterms.creator: https://orcid.org/0000-0002-5411-2230
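
The abstract mentions three ensemble techniques (Shallow, Semi, and Deep). As a rough illustration only — the paper's actual implementation is not reproduced here — a shallow ensemble can be sketched as averaging the class-probability outputs of several fine-tuned models and predicting the highest-probability class. The function name and all model outputs below are hypothetical placeholders.

```python
# Illustrative sketch of a shallow ensemble: average the per-class
# probabilities produced by several models, then take the argmax.
# This is an assumption about "shallow" ensembling, not the paper's code.

def shallow_ensemble(prob_lists):
    """Average per-class probabilities across models; return (label, averages)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Hypothetical softmax outputs from three BERT-style classifiers on one input:
outputs = [
    [0.6, 0.4],  # model A
    [0.3, 0.7],  # model B
    [0.2, 0.8],  # model C
]
label, avg_probs = shallow_ensemble(outputs)
print(label)  # the averaged probabilities decide the final label
```

Averaging smooths out any single model's overconfident prediction, which is one way an ensemble can reduce the predictive uncertainty the abstract highlights.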

Files

Original bundle

Name: 2308.12272.pdf
Size: 402.68 KB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon at submission