Fooling Network Interpretation in Image Classification

Subramanya, Akshayvarun; Pillai, Vipin; Pirsiavash, Hamed

dc.contributor.author	Subramanya, Akshayvarun
dc.contributor.author	Pillai, Vipin
dc.contributor.author	Pirsiavash, Hamed
dc.date.accessioned	2020-03-11T18:12:18Z
dc.date.available	2020-03-11T18:12:18Z
dc.date.issued	2019-09-24
dc.description.abstract	Deep neural networks have been shown to be fooled rather easily using adversarial attack algorithms. Practical methods such as adversarial patches have been shown to be extremely effective in causing misclassification. However, these patches are highlighted by standard network interpretation algorithms, thus revealing the identity of the adversary. We show that it is possible to create adversarial patches that not only fool the prediction, but also change what we interpret regarding the cause of the prediction. Moreover, we introduce our attack as a controlled setting to measure the accuracy of interpretation algorithms. We show this using extensive experiments on Grad-CAM interpretation, and the attack transfers to occluding patch interpretation as well. We believe our algorithms can facilitate developing more robust network interpretation tools that truly explain the network's underlying decision-making process.	en_US
dc.description.sponsorship	This work was performed under the following financial assistance award: 60NANB18D279 from U.S. Department of Commerce, National Institute of Standards and Technology, funding from SAP SE, and also NSF grant 1845216.	en_US
dc.description.uri	https://arxiv.org/abs/1812.02843	en_US
dc.format.extent	18 pages	en_US
dc.genre	journal articles preprints	en_US
dc.identifier	doi:10.13016/m22dqb-1n5i
dc.identifier.citation	Subramanya, Akshayvarun; Pillai, Vipin; Pirsiavash, Hamed; Fooling Network Interpretation in Image Classification; Computer Vision and Pattern Recognition (2019); https://arxiv.org/abs/1812.02843	en_US
dc.identifier.uri	http://hdl.handle.net/11603/17552
dc.language.iso	en_US	en_US
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof	UMBC Student Collection
dc.relation.ispartof	UMBC Faculty Collection
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject	deep neural networks	en_US
dc.subject	algorithms	en_US
dc.subject	adversarial patches	en_US
dc.subject	misclassification	en_US
dc.title	Fooling Network Interpretation in Image Classification	en_US
dc.type	Text	en_US

Collections

UMBC Computer Science and Electrical Engineering Department
UMBC Faculty Collection
UMBC Student Collection