SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers

Guoqiang Jin (SenseTime Research), Fan Yang (Institute of Automation, Chinese Academy of Sciences), Mingshan Sun (SenseTime Research), Ruyi Zhao (Tongji University), Yakun Liu (SenseTime Research), Wei Li (SenseTime Research), Tianpeng Bao (SenseTime Research), Liwei Wu (SenseTime Research), Xingyu ZENG (SenseTime Group Limited), Rui Zhao (SenseTime Group Limited)
The 34th British Machine Vision Conference


Self-supervised pre-training and transformer-based architectures have significantly improved object detection performance. However, most current self-supervised object detection methods are built on convolution-based architectures. We believe the sequence characteristics of transformers should be taken into account when designing a transformer-based self-supervised method for object detection. To this end, we propose SeqCo-DETR, a novel Sequence Consistency-based self-supervised method for object DEtection with TRansformers. SeqCo-DETR defines a simple yet effective pretext task: it minimizes the discrepancy between the transformer output sequences produced from different views of the same image, and uses bipartite matching to find the most relevant sequence pairs, that is, the pairs that predict the same object. Furthermore, we provide a complementary mask strategy that is combined with the sequence consistency strategy to extract more representative contextual information about the object. Our method achieves state-of-the-art results on MS COCO (45.8 AP) and PASCAL VOC (64.1 AP), demonstrating the effectiveness of our approach.
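
The pretext task described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matching cost or the loss, so the cosine-distance cost, the function names (`l2_normalize`, `sequence_consistency_loss`), and the toy inputs below are all assumptions chosen for illustration. The sketch pairs the output sequences of two views one-to-one with bipartite (Hungarian) matching and averages the discrepancy over the matched pairs.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def sequence_consistency_loss(seq_a, seq_b):
    """Match rows of seq_a to rows of seq_b one-to-one, then average
    (1 - cosine similarity) over the matched pairs.

    seq_a, seq_b: arrays of shape (num_queries, dim). They stand in for the
    transformer output sequences of two views of one image (hypothetical
    inputs; the cosine cost is an assumption, not taken from the paper).
    """
    a = l2_normalize(np.asarray(seq_a, dtype=np.float64))
    b = l2_normalize(np.asarray(seq_b, dtype=np.float64))
    cost = 1.0 - a @ b.T                      # pairwise matching cost
    rows, cols = linear_sum_assignment(cost)  # minimum-cost bipartite matching
    loss = cost[rows, cols].mean()            # discrepancy over matched pairs
    return loss, cols


# Toy data: view_b is a shuffled, slightly noisy copy of view_a.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(5, 16))
perm = rng.permutation(5)
view_b = view_a[perm] + 0.01 * rng.normal(size=(5, 16))

loss, match = sequence_consistency_loss(view_a, view_b)
```

In this toy setup the matching should undo the shuffle (`perm[match]` recovers the original order) and the loss should be close to zero, since each matched pair differs only by small noise. In an actual training loop the loss would be computed on differentiable tensors, with the matching itself treated as a non-differentiable assignment step.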



author    = {Guoqiang Jin and Fan Yang and Mingshan Sun and Ruyi Zhao and Yakun Liu and Wei Li and Tianpeng Bao and Liwei Wu and Xingyu ZENG and Rui Zhao},
title     = {SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {}

Copyright © 2023 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
