Unboxing Occupational Bias: Debiasing LLMs with U.S. Labor Data

dc.contributor.author: Gorti, Atmika
dc.contributor.author: Gaur, Manas
dc.contributor.author: Chadha, Aman
dc.date.accessioned: 2024-09-24T08:59:44Z
dc.date.available: 2024-09-24T08:59:44Z
dc.date.issued: 2024-11-08
dc.description: The Association for the Advancement of Artificial Intelligence’s 2024 Fall Symposium Series, November 7-9, 2024, Westin Arlington Gateway, Arlington, Virginia
dc.description.abstract: Large Language Models (LLMs) are prone to inheriting and amplifying societal biases embedded within their training data, potentially reinforcing harmful stereotypes related to gender, occupation, and other sensitive categories. This issue becomes particularly problematic as biased LLMs can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities across various domains, such as recruitment, online content moderation, or even the criminal justice system. Although prior research has focused on detecting bias in LLMs using specialized datasets designed to highlight intrinsic biases, there has been a notable lack of investigation into how these findings correlate with authoritative datasets, such as those from the U.S. National Bureau of Labor Statistics (NBLS). To address this gap, we conduct empirical research that evaluates LLMs in a "bias-out-of-the-box" setting, analyzing how the generated outputs compare with the distributions found in NBLS data. Furthermore, we propose a straightforward yet effective debiasing mechanism that directly incorporates NBLS instances to mitigate bias within LLMs. Our study spans seven different LLMs, including instructable, base, and mixture-of-expert models, and reveals significant levels of bias that are often overlooked by existing bias detection techniques. Importantly, our debiasing method, which does not rely on external datasets, demonstrates a substantial reduction in bias scores, highlighting the efficacy of our approach in creating fairer and more reliable LLMs.
dc.description.uri: https://ojs.aaai.org/index.php/AAAI-SS/article/view/31770
dc.format.extent: 8 pages
dc.genre: conference papers and proceedings
dc.genre: preprints
dc.identifier: doi:10.13016/m2jh8g-iodn
dc.identifier.citation: Gorti, Atmika, Aman Chadha, and Manas Gaur. “Unboxing Occupational Bias: Debiasing LLMs with U.S. Labor Data.” Proceedings of the AAAI Symposium Series 4, no. 1 (November 8, 2024): 48–55. https://doi.org/10.1609/aaaiss.v4i1.31770.
dc.identifier.uri: https://doi.org/10.1609/aaaiss.v4i1.31770
dc.identifier.uri: http://hdl.handle.net/11603/36358
dc.language.iso: en_US
dc.publisher: AAAI
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: Computer Science - Computation and Language
dc.subject: UMBC Ebiquity Research Group
dc.title: Unboxing Occupational Bias: Debiasing LLMs with U.S. Labor Data
dc.title.alternative: Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-5411-2230
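
To make the abstract's approach concrete, the Python sketch below illustrates how one might (a) score occupational gender bias by comparing a model's pronoun choices against ground-truth labor-force shares and (b) prepend labor-data instances as grounding context for debiasing. This is a minimal sketch, not the authors' implementation: `query_model`, the prompt template, and all occupation shares are invented placeholders, and reading the paper's "directly incorporates NBLS instances" mechanism as prompt-level grounding is an assumption.

```python
from collections import Counter

# Placeholder female labor-force shares per occupation. These numbers are
# invented for illustration; a real evaluation would load actual NBLS data.
GROUND_TRUTH_FEMALE_SHARE = {
    "nurse": 0.85,
    "software engineer": 0.25,
    "teacher": 0.74,
}

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; wire up a real model here."""
    raise NotImplementedError

def model_female_share(occupation: str, n_samples: int = 50) -> float:
    """Estimate how often the model continues an occupation prompt with
    feminine pronouns, out of all gendered continuations."""
    prompt = f"The {occupation} finished the shift, and"
    counts = Counter()
    for _ in range(n_samples):
        text = f" {query_model(prompt).lower()} "
        if " she " in text or " her " in text:
            counts["female"] += 1
        elif " he " in text or " his " in text:
            counts["male"] += 1
    gendered = counts["female"] + counts["male"]
    return counts["female"] / gendered if gendered else 0.0

def occupational_bias_score(occupation: str) -> float:
    """Absolute gap between the model's pronoun share and the reference
    labor-force share; 0.0 means the model matches the ground truth."""
    return abs(model_female_share(occupation)
               - GROUND_TRUTH_FEMALE_SHARE[occupation])

def grounding_prefix() -> str:
    """Sketch of debiasing by incorporating labor-data instances: state
    the ground-truth shares as context before the task prompt. Whether
    this matches the paper's exact mechanism is an assumption."""
    facts = [f"{occ}: {share:.0%} of U.S. workers are women."
             for occ, share in GROUND_TRUTH_FEMALE_SHARE.items()]
    return "\n".join(facts) + "\n"
```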

Files

Original bundle

Name: 31770-Article Text-35839-1-2-20241107.pdf
Size: 492.62 KB
Format: Adobe Portable Document Format