RUPQ: Improving low-bit quantization by equalizing relative updates of quantization parameters


Valentin Buchnev (Huawei Technologies Co., Ltd.), Jiao He (Huawei Technologies Co., Ltd.), Fengyu Sun (Huawei Technologies Co., Ltd.), Ivan Koryakovskiy (Huawei Technologies Co., Ltd.)
The 34th British Machine Vision Conference

Abstract

Neural network quantization is a model compression technique that uses low-bit representations of floating-point weights and activations. Although quantization can significantly reduce power consumption and inference time, it often degrades target metrics such as accuracy or peak signal-to-noise ratio. To improve these metrics, particularly at low bit widths, we propose a new Relative Update-Preserving Quantizer (RUPQ), which stabilizes the relative updates of all parameters during training. Our experimental results show that the proposed quantizer consistently outperforms LSQ+ and significantly narrows the quality gap between low-bit SRResNet, EDSR, and YOLO-v3 models and their full-precision counterparts. The method is easy to implement and use in practice.
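
For intuition, below is a minimal sketch of a learnable uniform quantizer in the spirit of LSQ/LSQ+, the baseline the paper compares against, where the step-size gradient is rescaled so that the step's relative update stays commensurate with the weight updates. This is an illustration of the general idea only: the class name, the constant initialization of the step, and the LSQ gradient-scale heuristic shown here are assumptions, not the paper's exact RUPQ formulation.

import torch
import torch.nn as nn


class LearnableUniformQuantizer(nn.Module):
    """Symmetric uniform quantizer with a learnable step size (LSQ-style sketch)."""

    def __init__(self, num_bits: int, num_elements: int):
        super().__init__()
        self.qmin = -(2 ** (num_bits - 1))      # e.g. -8 for 4 bits
        self.qmax = 2 ** (num_bits - 1) - 1     # e.g. +7 for 4 bits
        # Learnable quantization step; initialized to 1.0 for brevity here,
        # practical implementations initialize it from weight statistics.
        self.step = nn.Parameter(torch.tensor(1.0))
        # LSQ gradient-scale heuristic: keeps step updates on the same order
        # as weight updates; RUPQ derives its own equalization rule instead.
        self.grad_scale = 1.0 / (num_elements * self.qmax) ** 0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rescale the step's gradient without changing its forward value.
        step = self.grad_scale * self.step + (
            (1.0 - self.grad_scale) * self.step
        ).detach()
        # Uniform quantization with a straight-through estimator for round().
        q = torch.clamp(x / step, self.qmin, self.qmax)
        q = q + (q.round() - q).detach()
        return q * step


# Hypothetical usage: quantize a layer's weight tensor to 4 bits.
weight = torch.randn(64, 64)
quantizer = LearnableUniformQuantizer(num_bits=4, num_elements=weight.numel())
w_q = quantizer(weight)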


Citation

@inproceedings{Buchnev_2023_BMVC,
author    = {Valentin Buchnev and Jiao He and Fengyu Sun and Ivan Koryakovskiy},
title     = {RUPQ: Improving low-bit quantization by equalizing relative updates of quantization parameters},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {https://papers.bmvc2023.org/0307.pdf}
}


