DeepliteRT: Computer Vision at the Edge

Saad Ashfaq* (Deeplite Inc.), Alexander Hoffman (McGill University), Saptarshi Mitra (Deeplite Inc.), Sudhakar Sah (Deeplite Inc.), MohammadHossein AskariHemmat (Polytechnique Montreal), Ehsan Saboori (Deeplite Inc.)
The 34th British Machine Vision Conference


The proliferation of edge devices has unlocked unprecedented opportunities for deploying deep learning models in computer vision applications. However, these complex models require considerable power, memory, and compute resources that are typically not available on edge platforms. Ultra low-bit quantization presents an attractive solution to this problem by scaling down the model weights and activations from 32 bits to fewer than 8 bits. We implement highly optimized ultra low-bit convolution operators for ARM-based targets that outperform existing methods by up to 4.34x. Our operators are implemented within Deeplite Runtime (DeepliteRT), an end-to-end solution for the compilation, tuning, and inference of ultra low-bit models on ARM devices. Compiler passes in DeepliteRT automatically convert a fake-quantized model in full precision to a compact ultra low-bit representation, easing the deployment of quantized models on commodity hardware. We analyze the performance of DeepliteRT on classification and detection models against optimized 32-bit floating-point, 8-bit integer, and 2-bit baselines, achieving speedups of up to 2.20x, 2.33x, and 2.17x, respectively.
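To make the idea of a compact ultra low-bit representation concrete, the sketch below quantizes float32 weights to 2-bit codes and packs four codes per byte using NumPy. It is an illustration only and not DeepliteRT's implementation: the uniform affine quantizer, the packing layout, and the function names (`quantize_2bit`, `pack_2bit`, `unpack_2bit`, `dequantize`) are all assumptions made for this example.

```python
import numpy as np

def quantize_2bit(weights):
    """Uniform affine quantization of float weights to integer codes 0..3.

    Hypothetical helper for illustration; not the quantizer used in the paper.
    """
    w_min, w_max = float(weights.min()), float(weights.max())
    # 2 bits give 4 levels, hence 3 steps; fall back to 1.0 for constant tensors.
    scale = (w_max - w_min) / 3.0 or 1.0
    codes = np.clip(np.round((weights - w_min) / scale), 0, 3).astype(np.uint8)
    return codes, scale, w_min

def pack_2bit(codes):
    """Pack four 2-bit codes into each byte (length must be a multiple of 4)."""
    c = codes.reshape(-1, 4)
    packed = c[:, 0] | (c[:, 1] << 2) | (c[:, 2] << 4) | (c[:, 3] << 6)
    return packed.astype(np.uint8)

def unpack_2bit(packed):
    """Recover the 2-bit codes from the packed bytes, in the original order."""
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    return ((packed[:, None] >> shifts) & 3).astype(np.uint8).reshape(-1)

def dequantize(codes, scale, w_min):
    """Map integer codes back to approximate float32 values."""
    return (codes.astype(np.float32) * scale + w_min).astype(np.float32)

if __name__ == "__main__":
    weights = np.linspace(-1.0, 1.0, 16).astype(np.float32)
    codes, scale, w_min = quantize_2bit(weights)
    packed = pack_2bit(codes)
    # float32 storage versus packed 2-bit storage, in bytes
    print(weights.nbytes, packed.nbytes)
```

Under these assumptions, the packed tensor occupies one sixteenth of the float32 storage (2 bits instead of 32 per value), at the cost of a per-value rounding error of at most half a quantization step.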



author    = {Saad Ashfaq and Alexander Hoffman and Saptarshi Mitra and Sudhakar Sah and MohammadHossein AskariHemmat and Ehsan Saboori},
title     = {DeepliteRT: Computer Vision at the Edge},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year      = {2023},
url       = {}

