Side Effects of Erasing Concepts from Diffusion Models

dc.contributor.author: Saha, Shaswati
dc.contributor.author: Saha, Sourajit
dc.contributor.author: Gaur, Manas
dc.contributor.author: Gokhale, Tejas
dc.date.accessioned: 2025-09-18T14:22:25Z
dc.date.issued: 2025-08-24
dc.description: Findings of the Association for Computational Linguistics EMNLP 2025, Vienna, Austria, July 27–August 1st, 2025
dc.description.abstract: Concerns about text-to-image (T2I) generative models infringing on privacy, copyright, and safety have led to the development of Concept Erasure Techniques (CETs). The goal of an effective CET is to prohibit the generation of undesired "target" concepts specified by the user, while preserving the ability to synthesize high-quality images of the remaining concepts. In this work, we demonstrate that CETs can be easily circumvented and present several side effects of concept erasure. For a comprehensive measurement of the robustness of CETs, we present Side Effect Evaluation (SEE), an evaluation benchmark that consists of hierarchical and compositional prompts that describe objects and their attributes. This dataset and our automated evaluation pipeline quantify side effects of CETs across three aspects: impact on neighboring concepts, evasion of targets, and attribute leakage. Our experiments reveal that CETs can be circumvented by using superclass-subclass hierarchy and semantically similar prompts, such as compositional variants of the target. We show that CETs suffer from attribute leakage and counterintuitive phenomena of attention concentration or dispersal. We release our dataset, code, and evaluation tools to aid future work on robust concept erasure.
dc.description.uri: http://arxiv.org/abs/2508.15124
dc.format.extent: 20 pages
dc.genre: conference papers and proceedings
dc.genre: preprints
dc.identifier: doi:10.13016/m2abpd-mwln
dc.identifier.uri: https://doi.org/10.48550/arXiv.2508.15124
dc.identifier.uri: http://hdl.handle.net/11603/40239
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: UMBC Ebiquity Research Group
dc.subject: Computer Science - Computer Vision and Pattern Recognition
dc.subject: Computer Science - Machine Learning
dc.title: Side Effects of Erasing Concepts from Diffusion Models
dc.type: Text
dcterms.creator: https://orcid.org/0000-0003-1357-7813
dcterms.creator: https://orcid.org/0000-0002-5411-2230
dcterms.creator: https://orcid.org/0000-0002-5593-2804

Files

Original bundle

Name: 2508.15124v2.pdf
Size: 6.67 MB
Format: Adobe Portable Document Format