Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights

This paper presents incremental network quantization (INQ), a novel method for efficiently converting any pre-trained full-precision convolutional neural network (CNN) model into a low-precision version whose weights are constrained to be either powers of two or zero. Unlike existing methods, which suffer from noticeable accuracy loss, INQ has the potential to resolve this issue, benefiting from two innovations. First, we introduce three interdependent operations: weight partition, group-wise quantization, and re-training. A well-proven measure is employed to divide the weights in each layer of a pre-trained CNN model into two disjoint groups. The weights in the first group form a low-precision base and are therefore quantized by a variable-length encoding method. The weights in the other group compensate for the accuracy loss from the quantization and are therefore the ones to be re-trained. Second, these three operations are repeated on the latest re-trained group in an iterative manner until all the weights are converted into low-precision ones, acting as an incremental network quantization and accuracy enhancement procedure. Extensive experiments on the ImageNet classification task using almost all known deep CNN architectures, including AlexNet, VGG-16, GoogleNet and ResNets, testify to the efficacy of the proposed method. Specifically, at 5-bit quantization, our models achieve higher accuracy than the 32-bit floating-point references. Taking ResNet-18 as an example, we further show that our quantized models with 4-bit, 3-bit and 2-bit ternary weights have improved or very similar accuracy compared with the 32-bit floating-point baseline. In addition, impressive results from combining network pruning with INQ are also reported. The code is available at https://github.com/Zhouaojun/IncrementalNetworkQuantization.
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights
by Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen
https://arxiv.org/pdf/1702.03044v2.pdf