Cube-Evo: A Query-Efficient Black-Box Attack on Video Classification System

Date

2023-04-13

Citation of Original Publication

Y. Zhan et al., "Cube-Evo: A Query-Efficient Black-Box Attack on Video Classification System," in IEEE Transactions on Reliability, doi: 10.1109/TR.2023.3261986.

Rights

© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract

Ongoing research on black-box adversarial attacks helps improve the reliability of deep neural network (DNN)-based video systems. Recent works mainly carry out black-box adversarial attacks on video systems through query-based parameter dimension reduction. However, the additional temporal dimension of video data leads to massive query consumption and a low attack success rate. In this article, we set out to design an effective adversarial attack on popular video classification systems. We build on the observation that DNN-based systems are sensitive to adversarial perturbations with high frequency and reconstructed shape. Specifically, we propose a systematic attack pipeline, Cube-Evo, which aims to reduce the dimension of the search space and to obtain an effective adversarial perturbation by updating the optimal parameter group. We evaluate the proposed attack pipeline on two popular datasets: UCF101 and JESTER. Our attack pipeline reduces query consumption and achieves a high success rate on various DNN-based video classification systems. Compared with the state-of-the-art method Geo-Trap-Att, our pipeline reduces query consumption by 1.6× on average in untargeted attacks and by 2.9× in targeted attacks. In addition, Cube-Evo improves the attack success rate by 13% on average, achieving new state-of-the-art results across diverse video classification systems.