Measuring Moral Inconsistencies in Large Language Models

dc.contributor.author: Bonagiri, Vamshi Krishna
dc.contributor.author: Vennam, Sreeram
dc.contributor.author: Gaur, Manas
dc.contributor.author: Kumaraguru, Ponnurangam
dc.date.accessioned: 2024-02-19T20:24:44Z
dc.date.available: 2024-02-19T20:24:44Z
dc.date.issued: 2024-01-26
dc.description.abstract: A Large Language Model (LLM) is considered consistent if semantically equivalent prompts produce semantically equivalent responses. Despite recent advancements showcasing the impressive capabilities of LLMs in conversational systems, we show that even state-of-the-art LLMs are highly inconsistent in their generations, questioning their reliability. Prior research has tried to measure this with task-specific accuracies. However, this approach is unsuitable for moral scenarios, such as the trolley problem, with no "correct" answer. To address this issue, we propose a novel information-theoretic measure called Semantic Graph Entropy (SGE) to measure the consistency of an LLM in moral scenarios. We leverage "Rules of Thumb" (RoTs) to explain a model's decision-making strategies and further enhance our metric. Compared to existing consistency metrics, SGE correlates better with human judgments across five LLMs. In the future, we aim to investigate the root causes of LLM inconsistencies and propose improvements.
dc.description.uri: https://arxiv.org/abs/2402.01719
dc.format.extent: 4 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2eshr-50j6
dc.identifier.uri: https://doi.org/10.48550/arXiv.2402.01719
dc.identifier.uri: http://hdl.handle.net/11603/31670
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: CC BY 4.0 DEED Attribution 4.0 International en
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Measuring Moral Inconsistencies in Large Language Models
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-5411-2230

Files

Original bundle

Name: 2402.01719.pdf
Size: 277.42 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission