AMA: Adaptive Memory Augmentation for Enhancing Image Captioning


Shuang Cheng (Institute of Computing Technology, Chinese Academy of Sciences), Jian Ye (Institute of Computing Technology, Chinese Academy of Sciences)
The 34th British Machine Vision Conference

Abstract

Memory-Augmented Image Captioning (MA-IC) has demonstrated significant performance improvements over standard neural image captioning systems. It combines a well-trained captioning model with additional explicit knowledge from a memory bank to enhance captioning accuracy. However, the k-nearest neighbor algorithm used in MA-IC retrieves the same number of nearest neighbors for each target token, which may lead to prediction errors when the retrieved neighbors contain noise. In this paper, we propose an adaptive memory feedback mechanism that determines the value of k for each target token. We achieve this by introducing a lightweight network that can be trained efficiently using only a small number of training samples. Incorporating this adaptive memory-augmented method into various captioning baselines consistently improves the performance of the resulting captioners on the evaluation benchmark. Notably, extensive experiments show that our approach adapts efficiently to larger training datasets by simply transferring the memory bank with a straightforward network.
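
The idea of choosing k per target token can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the candidate values of k, the distance-based softmax, and the mixing rule are all assumptions made for illustration, and the weights over candidate k values are passed in by hand where the paper would use its lightweight network. Here k = 0 is taken to mean "trust the captioning model alone".

```python
import numpy as np


def knn_distribution(query, keys, values, vocab_size, k, temperature=1.0):
    """Turn the k nearest memory entries into a distribution over the vocabulary.

    keys:   (N, d) array of stored decoder states (hypothetical memory bank)
    values: (N,) array of the target token id stored with each key
    """
    dists = np.sum((keys - query) ** 2, axis=1)      # squared L2 distance to every key
    nearest = np.argsort(dists)[:k]                  # indices of the k closest entries
    logits = -dists[nearest] / temperature
    weights = np.exp(logits - logits.max())          # numerically stable softmax
    weights /= weights.sum()
    probs = np.zeros(vocab_size)
    np.add.at(probs, values[nearest], weights)       # neighbors sharing a token accumulate
    return probs


def adaptive_mix(p_model, query, keys, values, k_weights,
                 candidate_ks=(0, 1, 2, 4), temperature=1.0):
    """Mix the model distribution with kNN distributions for several k.

    k_weights is assumed to come from a small per-token predictor and to sum
    to 1; the entry paired with k = 0 weights the captioning model itself.
    """
    mixed = np.zeros_like(p_model, dtype=float)
    for weight, k in zip(k_weights, candidate_ks):
        if k == 0:
            mixed += weight * p_model
        else:
            mixed += weight * knn_distribution(
                query, keys, values, p_model.shape[0], k, temperature)
    return mixed


# Toy memory bank: four 2-d keys, each storing a token id from a 3-word vocabulary.
keys = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
values = np.array([1, 1, 2, 0])
p_model = np.array([0.2, 0.3, 0.5])
query = np.array([0.0, 0.0])

# Half the weight on the model, half on the single nearest neighbor.
mixed = adaptive_mix(p_model, query, keys, values, k_weights=[0.5, 0.5, 0.0, 0.0])
```

With a fixed k, every token would use the same row of `k_weights`; letting the weights vary per token is what allows noisy neighbors to be down-weighted or ignored in this sketch.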

Citation

@inproceedings{Cheng_2023_BMVC,
author    = {Shuang Cheng and Jian Ye},
title     = {AMA: Adaptive Memory Augmentation for Enhancing Image Captioning},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {https://papers.bmvc2023.org/0460.pdf}
}


Copyright © 2023 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
