TY - JOUR
T1 - Learning Content-Weighted Deep Compression
AU - Li, Mu
AU - Zuo, Wangmeng
AU - Gu, Shuhang
AU - You, Jia
AU - Zhang, Dapeng
PY - 2021/10
Y1 - 2021/10
N2 - Learning-based lossy image compression usually involves the joint optimization of rate-distortion performance, and requires to cope with the spatial variation of image content and contextual dependence among learned codes. Traditional entropy models can spatially adapt the local bit rate based on the image content, but usually are limited in exploiting context in code space. On the other hand, most deep context models are computationally very expensive and cannot efficiently perform decoding over the symbols in parallel. In this paper, we present a content-weighted encoder-decoder model, where the channel-wise multi-valued quantization is deployed for the discretization of the encoder features, and an importance map subnet is introduced to generate the importance masks for spatially varying code pruning. Consequently, the summation of importance masks can serve as an upper bound of the length of bitstream. Furthermore, the quantized representations of the learned code and importance map are still spatially dependent, which can be losslessly compressed using arithmetic coding. To compress the codes effectively and efficiently, we propose an upper-triangular masked convolutional network (triuMCN) for large context modeling. Experiments show that the proposed method can produce visually much better results, and performs favorably against deep and traditional lossy image compression approaches.
AB - Learning-based lossy image compression usually involves the joint optimization of rate-distortion performance, and requires to cope with the spatial variation of image content and contextual dependence among learned codes. Traditional entropy models can spatially adapt the local bit rate based on the image content, but usually are limited in exploiting context in code space. On the other hand, most deep context models are computationally very expensive and cannot efficiently perform decoding over the symbols in parallel. In this paper, we present a content-weighted encoder-decoder model, where the channel-wise multi-valued quantization is deployed for the discretization of the encoder features, and an importance map subnet is introduced to generate the importance masks for spatially varying code pruning. Consequently, the summation of importance masks can serve as an upper bound of the length of bitstream. Furthermore, the quantized representations of the learned code and importance map are still spatially dependent, which can be losslessly compressed using arithmetic coding. To compress the codes effectively and efficiently, we propose an upper-triangular masked convolutional network (triuMCN) for large context modeling. Experiments show that the proposed method can produce visually much better results, and performs favorably against deep and traditional lossy image compression approaches.
U2 - 10.1109/TPAMI.2020.2983926
DO - 10.1109/TPAMI.2020.2983926
M3 - Journal article
SN - 0162-8828
VL - 43
SP - 3446
EP - 3461
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 10
ER -