GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models

dc.contributor.author: Zhang, Tao
dc.contributor.author: Zeng, Ziqian
dc.contributor.author: Xiao, Yuxiang
dc.contributor.author: Zhuang, Huiping
dc.contributor.author: Chen, Cen
dc.contributor.author: Foulds, James
dc.contributor.author: Pan, Shimei
dc.date.accessioned: 2025-02-13T17:56:14Z
dc.date.available: 2025-02-13T17:56:14Z
dc.date.issued: 2024-12-16
dc.description.abstract: Large Language Models (LLMs) are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to exhibit desired behaviors, is recognized as an effective approach to mitigating gender bias. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicly available. The commonly used, publicly available alignment dataset HH-RLHF still exhibits gender bias to some extent, and no publicly available alignment dataset is specifically designed to address gender bias. Hence, we developed a new dataset named GenderAlign, aimed at mitigating a comprehensive set of gender biases in LLMs. This dataset comprises 8k single-turn dialogues, each paired with a "chosen" and a "rejected" response. Compared to the "rejected" responses, the "chosen" responses demonstrate lower levels of gender bias and higher quality. Furthermore, we categorized the gender biases in the "rejected" responses of GenderAlign into four principal categories. Experimental results show the effectiveness of GenderAlign in reducing gender bias in LLMs.
dc.description.uri: http://arxiv.org/abs/2406.13925
dc.format.extent: 17 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2lani-bx9l
dc.identifier.uri: https://doi.org/10.48550/arXiv.2406.13925
dc.identifier.uri: http://hdl.handle.net/11603/37704
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC College of Engineering and Information Technology Dean's Office
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: Attribution 4.0 International (CC BY 4.0)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Computer Science - Computation and Language
dc.subject: Computer Science - Artificial Intelligence
dc.title: GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models
dc.type: Text
dcterms.creator: https://orcid.org/0000-0003-0935-4182
dcterms.creator: https://orcid.org/0000-0002-5989-8543
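
The abstract describes GenderAlign as 8k single-turn dialogues, each pairing a "chosen" (less biased, higher quality) response with a "rejected" one. As a rough illustration, below is a minimal sketch of how one such preference pair might be serialized for an alignment pipeline such as reward modeling or DPO; the field names, example texts, and category label are hypothetical assumptions, not the dataset's published schema.

```python
import json

# One hypothetical GenderAlign-style preference pair. The field names
# ("prompt", "chosen", "rejected", "bias_category") and the example texts
# are illustrative assumptions, not the dataset's published schema.
record = {
    "prompt": "Should we hire a man or a woman for this nursing position?",
    "chosen": (
        "Hiring decisions should rest on qualifications, experience, and "
        "skills rather than gender; both men and women can excel in nursing."
    ),
    "rejected": (
        "A woman, since women are naturally more nurturing and better "
        "suited to nursing."
    ),
    # The paper sorts the biases in "rejected" responses into four
    # categories; this label is a stand-in, not one of the paper's names.
    "bias_category": "occupational stereotype",
}

# Alignment pipelines commonly read such pairs from JSON Lines,
# one dialogue per line.
with open("genderalign_sample.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```

Formats like this are what preference-based trainers consume: for each prompt, the training objective pushes the model to score the "chosen" response above the "rejected" one.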

Files

Original bundle

Name: 2406.13925v3.pdf
Size: 1.46 MB
Format: Adobe Portable Document Format