Backdoor Attacks on Self-Supervised Learning

dc.contributor.author: Saha, Aniruddha
dc.contributor.author: Tejankar, Ajinkya
dc.contributor.author: Koohpayegani, Soroush Abbasi
dc.contributor.author: Pirsiavash, Hamed
dc.date.accessioned: 2021-10-13T18:23:49Z
dc.date.available: 2021-10-13T18:23:49Z
dc.date.issued: 2021-05-21
dc.description: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
dc.description.abstract: Large-scale unlabeled data has allowed recent progress in self-supervised learning methods that learn rich visual representations. State-of-the-art self-supervised methods for learning representations from images (MoCo and BYOL) use an inductive bias that different augmentations (e.g. random crops) of an image should produce similar embeddings. We show that such methods are vulnerable to backdoor attacks where an attacker poisons a part of the unlabeled data by adding a small trigger (known to the attacker) to the images. The model performance is good on clean test images but the attacker can manipulate the decision of the model by showing the trigger at test time. Backdoor attacks have been studied extensively in supervised learning and to the best of our knowledge, we are the first to study them for self-supervised learning. Backdoor attacks are more practical in self-supervised learning since the unlabeled data is large and as a result, an inspection of the data to avoid the presence of poisoned data is prohibitive. We show that in our targeted attack, the attacker can produce many false positives for the target category by using the trigger at test time. We also propose a knowledge distillation based defense algorithm that succeeds in neutralizing the attack. Our code is available here: this https URL.
dc.description.sponsorship: This material is based upon work partially supported by the United States Air Force under Contract No. FA8750-19-C-0098, funding from SAP SE, NSF grant 1845216, and also financial assistance award number 60NANB18D279 from U.S. Department of Commerce, National Institute of Standards and Technology. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the United States Air Force, DARPA, or other funding agencies.
dc.description.uri: https://www.computer.org/csdl/proceedings-article/cvpr/2022/694600n3327/1H0L12ox344
dc.format.extent: 11 pages
dc.genre: conference papers and proceedings
dc.genre: preprints
dc.identifier: doi:10.13016/m2djnb-qbvh
dc.identifier.citation: A. Saha, A. Tejankar, S. Koohpayegani and H. Pirsiavash, "Backdoor Attacks on Self-Supervised Learning," in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 13327-13336, doi: 10.1109/CVPR52688.2022.01298.
dc.identifier.uri: http://hdl.handle.net/11603/23089
dc.identifier.uri: https://doi.ieeecomputersociety.org/10.1109/CVPR52688.2022.01298
dc.language.iso: en_US
dc.publisher: IEEE
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.title: Backdoor Attacks on Self-Supervised Learning
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-5394-7172
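
For context on the attack summarized in the abstract above, the following is a minimal illustrative sketch (not the authors' released code) of patch-trigger poisoning of unlabeled data: a small attacker-chosen trigger is pasted at a random location onto a small fraction of the training images, which are then released as "clean" unlabeled data. The file paths, patch size, and poison rate below are assumptions made purely for illustration.

# Minimal sketch of patch-trigger poisoning of unlabeled images.
# All paths, the patch size, and the poison rate are illustrative assumptions.
import random
from pathlib import Path
from PIL import Image

TRIGGER_PATH = "trigger.png"        # hypothetical small trigger image known to the attacker
UNLABELED_DIR = Path("unlabeled")   # hypothetical directory of clean unlabeled images
POISONED_DIR = Path("poisoned")     # output directory for the poisoned dataset
PATCH_SIZE = 50                     # trigger side length in pixels (assumed)
POISON_RATE = 0.01                  # fraction of images to poison (assumed)

def paste_trigger(img: Image.Image, trigger: Image.Image) -> Image.Image:
    """Paste the trigger patch at a random location inside the image."""
    img = img.convert("RGB").copy()
    w, h = img.size
    x = random.randint(0, max(0, w - PATCH_SIZE))
    y = random.randint(0, max(0, h - PATCH_SIZE))
    img.paste(trigger, (x, y))
    return img

def main() -> None:
    trigger = Image.open(TRIGGER_PATH).convert("RGB").resize((PATCH_SIZE, PATCH_SIZE))
    POISONED_DIR.mkdir(exist_ok=True)
    for path in UNLABELED_DIR.glob("*.jpg"):
        img = Image.open(path)
        # Poison only a small fraction of the data; the rest is copied unchanged.
        if random.random() < POISON_RATE:
            out = paste_trigger(img, trigger)
        else:
            out = img.convert("RGB")
        out.save(POISONED_DIR / path.name)

if __name__ == "__main__":
    main()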

Files

Original bundle

Name: 2105.10123.pdf
Size: 18.27 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Description: Item-specific license agreed to upon submission