Spherical Vision Transformer for 360° Video Saliency Prediction


Mert Cokelek (Koç University)*, Nevrez Imamoglu (AIST), Cagri Ozcinar (Samsung), Erkut Erdem (Hacettepe University), Aykut Erdem (Koç University)
The 34th British Machine Vision Conference

Abstract

The growing interest in omnidirectional videos (ODVs) that capture the full field-of-view (FOV) has elevated the importance of 360° saliency prediction in computer vision. However, predicting where humans look in 360° scenes presents unique challenges, including spherical distortion, high resolution, and limited labelled data. We propose a novel vision-transformer-based model for omnidirectional videos named SalViT360 that leverages tangent image representations. We introduce a spherical geometry-aware spatio-temporal self-attention mechanism that is capable of effective omnidirectional video understanding. Furthermore, we present a consistency-based unsupervised regularisation term for projection-based 360° dense-prediction models to reduce artefacts in the predictions that occur after inverse projection. Our approach is the first to employ tangent images for omnidirectional saliency prediction, and our experimental results on three ODV saliency datasets demonstrate its effectiveness compared to the state-of-the-art. Code and models are available at: https://github.com/MertCokelek/SalViT360.
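
As background for the tangent image representation mentioned above: tangent images are commonly produced by gnomonic projection, which maps a region of the sphere onto a plane touching it at a chosen point, so that each patch is nearly distortion-free. The NumPy sketch below illustrates that general projection only. It is not the SalViT360 implementation, and the function names, the 60° field of view, the patch size, and the nearest-neighbour sampling are all illustrative assumptions.

```python
import numpy as np


def gnomonic_forward(lon, lat, lon0, lat0):
    """Project spherical (lon, lat) in radians onto the plane tangent at (lon0, lat0)."""
    cos_c = np.sin(lat0) * np.sin(lat) + np.cos(lat0) * np.cos(lat) * np.cos(lon - lon0)
    x = np.cos(lat) * np.sin(lon - lon0) / cos_c
    y = (np.cos(lat0) * np.sin(lat) - np.sin(lat0) * np.cos(lat) * np.cos(lon - lon0)) / cos_c
    return x, y


def gnomonic_inverse(x, y, lon0, lat0):
    """Map tangent-plane coordinates (x, y) back to spherical (lon, lat) in radians."""
    rho = np.hypot(x, y)
    c = np.arctan(rho)
    sin_c, cos_c = np.sin(c), np.cos(c)
    # Avoid dividing by zero at the tangent point itself (rho == 0).
    safe_rho = np.where(rho == 0, 1.0, rho)
    sin_lat = cos_c * np.sin(lat0) + y * sin_c * np.cos(lat0) / safe_rho
    lat = np.arcsin(np.clip(sin_lat, -1.0, 1.0))
    lon = lon0 + np.arctan2(x * sin_c, rho * np.cos(lat0) * cos_c - y * np.sin(lat0) * sin_c)
    return lon, lat


def sample_tangent_patch(erp, lon0, lat0, fov=np.deg2rad(60), size=64):
    """Nearest-neighbour sample of a size x size tangent patch from an equirectangular frame.

    Hypothetical helper for illustration; fov and size are assumed values.
    """
    h, w = erp.shape[:2]
    half = np.tan(fov / 2)                      # half-extent of the patch on the tangent plane
    xs = np.linspace(-half, half, size)
    ys = np.linspace(half, -half, size)         # top row corresponds to the largest y
    x, y = np.meshgrid(xs, ys)
    lon, lat = gnomonic_inverse(x, y, lon0, lat0)
    # Equirectangular layout: lon in [-pi, pi) spans columns, lat from +pi/2 to -pi/2 spans rows.
    lon = (lon + np.pi) % (2 * np.pi) - np.pi
    col = ((lon + np.pi) / (2 * np.pi) * w).astype(int) % w
    row = np.clip(((np.pi / 2 - lat) / np.pi * h).astype(int), 0, h - 1)
    return erp[row, col]
```

Calling `sample_tangent_patch(frame, lon0, lat0)` for a set of tangent points spread over the sphere yields a collection of locally undistorted patches. A projection-based model would then process these patches and map its per-patch predictions back to the equirectangular frame by inverse projection.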

Citation

@inproceedings{Cokelek_2023_BMVC,
author    = {Mert Cokelek and Nevrez Imamoglu and Cagri Ozcinar and Erkut Erdem and Aykut Erdem},
title     = {Spherical Vision Transformer for 360° Video Saliency Prediction},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {https://papers.bmvc2023.org/0317.pdf}
}


Copyright © 2023 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
