Building A Mobile Text Recognizer via Truncated SVD-based Knowledge Distillation-Guided NAS


Weifeng Lin (South China University of Technology), Canyu Xie (South China University of Technology), Dezhi Peng (South China University of Technology), Jiapeng Wang (South China University of Technology), Lianwen Jin* (South China University of Technology), Wei Ding (Alibaba Group), Cong Yao (Alibaba DAMO Academy), Mengchao He (DAMO Academy, Alibaba Group)
The 34th British Machine Vision Conference

Abstract

Text recognition poses significant challenges in computer vision, with unresolved issues such as the trade-off between recognition accuracy, storage, and computational complexity in real-world applications. To tackle this challenge, we propose a mobile text recognizer that integrates Truncated Singular Value Decomposition (TSVD)-based Knowledge Distillation (KD) into the Neural Architecture Search (NAS) process. We also improved the NAS search space by introducing a novel Mobile Char Block (MCB) and a channel-aware search. A series of experiments demonstrated that our search strategy identifies a lightweight model that achieves accuracy comparable to existing methods with significantly lower computation cost and smaller storage space. We evaluated our model on four benchmark datasets: IAM, ICDAR2013, and SCUT-HCCDoc for handwriting recognition, and JS-Printed, a large-scale in-house bilingual dataset of printed documents. On the widely used IAM dataset, our student model even outperformed the teacher model while being 10.5x faster and 8.2x smaller on x86 and ARM devices.
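
As background for readers unfamiliar with the decomposition named above, the following is a minimal sketch of truncated SVD on its own, written with NumPy. It is not the authors' distillation method: the abstract does not say how TSVD is applied to teacher and student features, so the matrix shape, the rank `k`, and the helper names `truncated_svd` and `low_rank_approx` are all assumptions made purely for illustration.

```python
import numpy as np


def truncated_svd(matrix, k):
    """Return the top-k singular triplets (U_k, s_k, Vt_k) of a 2-D array."""
    # full_matrices=False gives the compact SVD; singular values come sorted
    # in descending order, so slicing keeps the k largest components.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def low_rank_approx(matrix, k):
    """Rebuild the best rank-k approximation (in Frobenius norm) of matrix."""
    u_k, s_k, vt_k = truncated_svd(matrix, k)
    # Scaling the columns of U_k by s_k is equivalent to U_k @ diag(s_k).
    return (u_k * s_k) @ vt_k


# Hypothetical stand-in for a feature map flattened to 2-D (assumption:
# 64 positions x 32 channels; the paper's real shapes are not given here).
rng = np.random.default_rng(0)
features = rng.standard_normal((64, 32))

approx = low_rank_approx(features, k=8)
rel_error = np.linalg.norm(features - approx) / np.linalg.norm(features)
```

Keeping only the leading singular components yields a compact summary of a matrix, which is one plausible reason such a decomposition is attractive when comparing a large teacher with a small student; the paper itself should be consulted for the actual formulation.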

Citation

@inproceedings{Lin_2023_BMVC,
author    = {Weifeng Lin and Canyu Xie and Dezhi Peng and Jiapeng Wang and Lianwen Jin and Wei Ding and Cong Yao and Mengchao He},
title     = {Building A Mobile Text Recognizer via Truncated SVD-based Knowledge Distillation-Guided NAS},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {https://papers.bmvc2023.org/0375.pdf}
}


Copyright © 2023 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
