RBFormer: Improve Adversarial Robustness of Transformers by Robust Bias

Hao Cheng (The Hong Kong University of Science and Technology (Guangzhou)), Jinhao Duan (Drexel University), Hui Li (Samsung Research and Development Institute China Xi'an), Jiahang Cao (The Hong Kong University of Science and Technology (Guangzhou)), Ping Wang (Xi'an Jiaotong University), Lyutianyang Zhang (University of Washington), Jize Zhang (HKUST), Kaidi Xu (Drexel University), Renjing Xu (The Hong Kong University of Science and Technology (Guangzhou))
The 34th British Machine Vision Conference


Transformer-based structures such as the Vision Transformer (ViT) and the Vision Multilayer Perceptron (VMLP) have recently attracted a surge of interest. Compared with earlier convolution-based structures, Transformer-based structures achieve comparable or superior performance thanks to their distinctive attention-based input token mixer strategy. Adversarial examples are known to severely degrade the performance of well-established convolution-based structures, and the same inherent vulnerability has been demonstrated in Transformer-based structures. In this paper, we focus on the intrinsic robustness of the structure itself rather than on introducing novel defense measures against adversarial attacks. To mitigate this vulnerability, we adopt a rational structure design approach: we enhance the adversarial robustness of the structure by increasing the proportion of high-frequency structural robust biases. The result is a novel structure, the Robust Bias Transformer-based Structure (RBFormer), which is more robust than several existing baseline structures. In extensive experiments, RBFormer outperforms the original structures by a significant margin, improving by $+16.12\%$ and $+5.04\%$ across different evaluation criteria on CIFAR-10 and ImageNet-1k, respectively.
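For readers unfamiliar with the term, the following is a minimal sketch of how an adversarial example can be constructed, using the well-known one-step fast gradient sign method (FGSM) on a toy logistic classifier. It is not the paper's model, attack setup, or evaluation protocol; the weights `w`, bias `b`, input `x`, and budget `eps` are hypothetical values chosen purely for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a linear logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """One-step FGSM: shift every input coordinate by eps in the direction
    that increases the cross-entropy loss for the true label y (0 or 1)."""
    p = predict(w, b, x)
    # For logistic regression, dLoss/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumptions, not taken from the paper).
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.3, 0.1, 0.2]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.25)
p_clean = predict(w, b, x)    # probability of the true class on the clean input
p_adv = predict(w, b, x_adv)  # probability of the true class after the attack

print("clean prediction is class 1:", p_clean > 0.5)
print("adversarial prediction is class 1:", p_adv > 0.5)
```

In this toy setting, a perturbation of at most 0.25 per coordinate is enough to push the predicted probability of the true class below 0.5, which is the kind of failure that adversarial robustness work aims to prevent.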



author    = {Hao Cheng and Jinhao Duan and Hui Li and Jiahang Cao and Ping Wang and Lyutianyang Zhang and Jize Zhang and Kaidi Xu and Renjing Xu},
title     = {RBFormer: Improve Adversarial Robustness of Transformers by Robust Bias},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {https://papers.bmvc2023.org/0296.pdf}

Copyright © 2023 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
