Are vision transformers more robust than CNNs for Backdoor attacks?
dc.contributor.author | Subramanya, Akshayvarun | |
dc.contributor.author | Saha, Aniruddha | |
dc.contributor.author | Koohpayegani, Soroush Abbasi | |
dc.contributor.author | Tejankar, Ajinkya | |
dc.contributor.author | Pirsiavash, Hamed | |
dc.date.accessioned | 2023-11-10T14:20:45Z | |
dc.date.available | 2023-11-10T14:20:45Z | |
dc.date.issued | 2023-02-13 | |
dc.description | ICLR 2023, Eleventh International Conference on Learning Representations; Kigali, Rwanda; May 1–5, 2023 | en_US |
dc.description.abstract | Transformer architectures are based on a self-attention mechanism that processes images as a sequence of patches. As their design differs substantially from CNNs, it is interesting to study whether transformers are vulnerable to backdoor attacks and how different transformer architectures affect attack success rates. Backdoor attacks happen when an attacker poisons a small part of the training images with a specific trigger, or backdoor, that is activated later. The model performs well on clean test images, but the attacker can manipulate the decision of the model by showing the trigger on an image at test time. In this paper, we perform a comparative study of state-of-the-art architectures through the lens of backdoor robustness, specifically how attention mechanisms affect robustness. We show that the popular vision transformer architecture (ViT) is the least robust architecture, and that ResMLP, which belongs to a class called Feed Forward Networks (FFN), is the most robust to backdoor attacks among state-of-the-art architectures. We also find an intriguing difference between transformers and CNNs: interpretation algorithms effectively highlight the trigger on test images for transformers but not for CNNs. Based on this observation, we find that a test-time image blocking defense reduces the attack success rate by a large margin for transformers. We also show that such blocking mechanisms can be incorporated during the training process to improve robustness even further. We believe our experimental findings will encourage the community to understand the building block components in developing novel architectures robust to backdoor attacks. | en_US |
dc.description.uri | https://openreview.net/forum?id=7P_yIFi6zaA | en_US |
dc.format.extent | 12 pages | en_US |
dc.genre | conference papers and proceedings | en_US |
dc.genre | preprints | en_US |
dc.identifier | doi:10.13016/m2mnnl-dtgu | |
dc.identifier.uri | http://hdl.handle.net/11603/30672 | |
dc.language.iso | en_US | en_US |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.relation.ispartof | UMBC Faculty Collection | |
dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | en_US |
dc.title | Are vision transformers more robust than CNNs for Backdoor attacks? | en_US |
dc.type | Text | en_US |
Files

Original bundle (2 items)
- Name: 2075_are_vision_transformers_more_r.pdf
  Size: 3.83 MB
  Format: Adobe Portable Document Format
- Name: 2075_are_vision_transformers_more_r-Supplementary Material.zip
  Size: 4.25 MB
  Format: Unknown data format
  Description: Supplementary material

License bundle (1 item)
- Name: license.txt
  Size: 2.56 KB
  Description: Item-specific license agreed upon to submission