A Closer Look at Robustness of Vision Transformers to Backdoor Attacks

dc.contributor.author: Subramanya, Akshayvarun
dc.contributor.author: Koohpayegani, Soroush Abbasi
dc.contributor.author: Saha, Aniruddha
dc.contributor.author: Tejankar, Ajinkya
dc.contributor.author: Pirsiavash, Hamed
dc.date.accessioned: 2024-01-12T13:10:51Z
dc.date.available: 2024-01-12T13:10:51Z
dc.date.issued: 2024
dc.description: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
dc.description.abstract: Transformer architectures are based on a self-attention mechanism that processes images as a sequence of patches. Because their design differs considerably from CNNs, it is important to take a closer look at their vulnerability to backdoor attacks and at how different transformer architectures affect robustness. Backdoor attacks occur when an attacker poisons a small part of the training images with a specific trigger, or backdoor, which will be activated later. The model performs well on clean test images, but the attacker can manipulate its decision by presenting the trigger on an image at test time. In this paper, we compare state-of-the-art architectures through the lens of backdoor attacks, specifically how attention mechanisms affect robustness. We observe that the well-known vision transformer architecture (ViT) is the least robust architecture, while ResMLP, which belongs to a class called Feed Forward Networks (FFN), is the most robust to backdoor attacks among state-of-the-art architectures. We also find an intriguing difference between transformers and CNNs: interpretation algorithms effectively highlight the trigger on test images for transformers but not for CNNs. Based on this observation, we find that a test-time image-blocking defense reduces the attack success rate by a large margin for transformers. We also show that such blocking mechanisms can be incorporated into the training process to improve robustness even further. We believe our experimental findings will encourage the community to understand the building-block components in developing novel architectures robust to backdoor attacks. Code is available at: https://github.com/UCDvision/backdoor_transformer.git
dc.description.sponsorship: This work is partially supported by NSF grants 1845216 and 1920079.
dc.description.uri: https://openaccess.thecvf.com/content/WACV2024/html/Subramanya_A_Closer_Look_at_Robustness_of_Vision_Transformers_to_Backdoor_WACV_2024_paper.html
dc.format.extent: 10 pages
dc.genre: conference papers and proceedings
dc.genre: preprints
dc.identifier.uri: http://hdl.handle.net/11603/31271
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title: A Closer Look at Robustness of Vision Transformers to Backdoor Attacks
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-5394-7172
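The abstract describes a test-time defense that blocks the image region an interpretation algorithm highlights, since for transformers that region tends to contain the backdoor trigger. Below is a minimal NumPy sketch of only the blocking step, assuming a saliency map has already been computed by some interpretation method; the function name and the zero-fill blocking choice are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def block_top_patch(image, saliency, patch=16):
    """Zero out the patch whose mean saliency is highest.

    image:    (H, W, C) array, the test image.
    saliency: (H, W) array from some interpretation algorithm
              (hypothetical input; the paper's actual method may differ).
    patch:    side length of the square block to remove.
    Returns the blocked image and the (row, col) corner of the block.
    """
    H, W = saliency.shape
    best_score, best_corner = -np.inf, (0, 0)
    # Scan non-overlapping patches and keep the most salient one.
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            score = saliency[i:i + patch, j:j + patch].mean()
            if score > best_score:
                best_score, best_corner = score, (i, j)
    blocked = image.copy()
    i, j = best_corner
    blocked[i:i + patch, j:j + patch] = 0  # remove the suspected trigger region
    return blocked, best_corner
```

The intuition is that if the trigger drives the model's decision, the interpretation heatmap concentrates on it, so erasing the top patch neutralizes the attack while leaving most of the clean image content intact.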

Files

Original bundle (2 files)
- Name: Subramanya_A_Closer_Look_at_Robustness_of_Vision_Transformers_to_Backdoor_WACV_2024_paper.pdf; Size: 3.34 MB; Format: Adobe Portable Document Format
- Name: Subramanya_A_Closer_Look_WACV_2024_supplemental.pdf; Size: 4.44 MB; Format: Adobe Portable Document Format

License bundle (1 file)
- Name: license.txt; Size: 2.56 KB; Description: Item-specific license agreed upon to submission