E2SAM: A Pipeline for Efficiently Extending SAM's Capability on Cross-Modality Data via Knowledge Inheritance


Su sundingkai (Beijing University of Posts and Telecommunications), Mengqiu Xu (Beijing University of Posts and Telecommunications), Kaixin Chen (Beijing University of Posts and Telecommunications), Ming Wu (Beijing University of Posts and Telecommunications),* Chuang Zhang (Beijing University of Posts and Telecommunications)
The 34th British Machine Vision Conference

Abstract

The Segment Anything Model (SAM) achieves strong results on many segmentation datasets thanks to its segmentation capability with visual-grouping perception. However, because SAM accepts only three-channel input, it is difficult to apply directly to cross-modality data. This paper therefore proposes E2SAM, a pipeline with a knowledge inheritance stage and a downstream fine-tuning stage that efficiently inherits SAM's capabilities and extends them to task-specific applications. To enable feature alignment from varying single-modality data to cross-modality data, the first stage adds an auxiliary branch with a channel selector and a merge module. Notably, the pipeline does not require a large amount of additional training data. The strengths of the proposed method are further discussed through experiments on generalization performance and resistance to size changes. Experimental results and visualizations on the SFDD-H8 and SHIFT datasets demonstrate the effectiveness of the proposed method compared with alternatives such as random initialization and SAM-based fine-tuning. The code is available at https://github.com/BUPT-PRIS-727/BMVC2023_E2SAM.
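
To make the three-channel limitation concrete, the following is a minimal sketch of one way a multi-channel input could be reduced to three channels before reaching a three-channel encoder. It is an illustration under stated assumptions, not the authors' channel selector: the function name `select_channels`, the softmax-weighted mixing, and the 16-channel input size are all hypothetical, and the actual design is described in the paper and the linked repository.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def select_channels(image, logits):
    """Hypothetical channel selector (an assumption, not the paper's module).

    image:  array of shape (C, H, W), a multi-channel input.
    logits: array of shape (3, C), one row of selection scores per output channel.
    Returns an array of shape (3, H, W) that a three-channel encoder could accept.
    """
    weights = softmax(logits, axis=1)  # each row sums to 1 over the C input channels
    # Weighted sum over the input-channel axis for each of the 3 output channels.
    return np.tensordot(weights, image, axes=([1], [0]))

rng = np.random.default_rng(0)
image = rng.random((16, 8, 8))   # hypothetical 16-channel input
logits = np.zeros((3, 16))       # untrained scores give uniform weights
out = select_channels(image, logits)
```

With all-zero scores, every output channel is simply the mean over the input channels; in a trained setting the scores would be learned so that each output channel emphasizes the most useful inputs.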

Citation

@inproceedings{sundingkai_2023_BMVC,
author    = {Su sundingkai and Mengqiu Xu and Kaixin Chen and Ming Wu and Chuang Zhang},
title     = {E2SAM: A Pipeline for Efficiently Extending SAM's Capability on Cross-Modality Data via Knowledge Inheritance},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {https://papers.bmvc2023.org/0666.pdf}
}


Copyright © 2023 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
