You've Changed: Detecting Modification of Black-Box Large Language Models

dc.contributor.author: Dima, Alden
dc.contributor.author: Foulds, James
dc.contributor.author: Pan, Shimei
dc.contributor.author: Feldman, Philip
dc.date.accessioned: 2025-06-05T14:03:22Z
dc.date.available: 2025-06-05T14:03:22Z
dc.date.issued: 2025-04-14
dc.description.abstract: Large Language Models (LLMs) are often provided as a service via an API, making it challenging for developers to detect changes in their behavior. We present an approach to monitor LLMs for changes by comparing the distributions of linguistic and psycholinguistic features of generated text. Our method uses a statistical test to determine whether the distributions of features from two samples of text are equivalent, allowing developers to identify when an LLM has changed. We demonstrate the effectiveness of our approach using five OpenAI completion models and Meta's Llama 3 70B chat model. Our results show that simple text features coupled with a statistical test can distinguish between language models. We also explore the use of our approach to detect prompt injection attacks. Our work enables frequent LLM change monitoring and avoids computationally expensive benchmark evaluations.
dc.description.uri: http://arxiv.org/abs/2504.12335
dc.format.extent: 26 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2cmhq-9hjf
dc.identifier.uri: https://doi.org/10.48550/arXiv.2504.12335
dc.identifier.uri: http://hdl.handle.net/11603/38697
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC College of Engineering and Information Technology Dean's Office
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: Computer Science - Artificial Intelligence; Computer Science - Computation and Language
dc.title: You've Changed: Detecting Modification of Black-Box Large Language Models
dc.type: Text
dcterms.creator: https://orcid.org/0000-0003-0935-4182
dcterms.creator: https://orcid.org/0000-0002-5989-8543
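
The abstract describes comparing the distributions of text features from two samples with a statistical test. The record does not say which features or which test the authors use, so the following is only a minimal sketch under assumptions: the feature (mean word length per generated text), the Kolmogorov-Smirnov statistic with a permutation p-value, and the names `mean_word_length`, `ks_statistic`, `permutation_p_value`, and `model_changed` are all hypothetical choices made for illustration.

```python
import random


def mean_word_length(text: str) -> float:
    """Hypothetical stand-in feature: average word length of one generated text."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def ks_statistic(a, b) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def permutation_p_value(a, b, n_permutations: int = 2000, seed: int = 0) -> float:
    """Estimate how often a random relabeling yields a gap at least as large."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def model_changed(baseline_texts, current_texts, alpha: float = 0.05) -> bool:
    """Flag a change when the feature distributions differ at level alpha (assumed threshold)."""
    baseline = [mean_word_length(t) for t in baseline_texts]
    current = [mean_word_length(t) for t in current_texts]
    return permutation_p_value(baseline, current) < alpha
```

In use, `baseline_texts` would hold responses collected from the API at a reference time and `current_texts` responses to the same prompts collected later; a real monitor would likely track several features and correct for multiple comparisons, which this sketch omits.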

Files

Original bundle

Name: 250412335v1.pdf
Size: 690.16 KB
Format: Adobe Portable Document Format