TY - JOUR
T1 - Learning a Single Tucker Decomposition Network for Lossy Image Compression with Multiple Bits-Per-Pixel Rates
AU - Cai, Jianrui
AU - Cao, Zisheng
AU - Zhang, Lei
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/1
Y1 - 2020/1
N2 - Lossy image compression (LIC), which aims to utilize inexact approximations to represent an image more compactly, is a classical problem in image processing. Recently, deep convolutional neural networks (CNNs) have achieved promising results in LIC by learning an encoder-quantizer-decoder network from a large amount of data. However, existing CNN-based LIC methods generally train a network for a specific bits-per-pixel (bpp) rate. Such a 'one-network-per-bpp' problem limits the generality and flexibility of CNNs in practical LIC applications. In this paper, we propose to learn a single CNN which can perform LIC at multiple bpp rates. A simple yet effective Tucker Decomposition Network (TDNet) is developed, where a novel Tucker decomposition layer (TDL) decomposes a latent image representation into a set of projection matrices and a core tensor. By changing the rank of the core tensor and its quantization, we can easily adjust the bpp rate of the latent image representation within a single CNN. Furthermore, an iterative non-uniform quantization scheme is presented to optimize the quantizer, and a coarse-to-fine training strategy is introduced to reconstruct the decompressed images. Extensive experiments demonstrate the state-of-the-art compression performance of TDNet in terms of both PSNR and MS-SSIM indices.
AB - Lossy image compression (LIC), which aims to utilize inexact approximations to represent an image more compactly, is a classical problem in image processing. Recently, deep convolutional neural networks (CNNs) have achieved promising results in LIC by learning an encoder-quantizer-decoder network from a large amount of data. However, existing CNN-based LIC methods generally train a network for a specific bits-per-pixel (bpp) rate. Such a 'one-network-per-bpp' problem limits the generality and flexibility of CNNs in practical LIC applications. In this paper, we propose to learn a single CNN which can perform LIC at multiple bpp rates. A simple yet effective Tucker Decomposition Network (TDNet) is developed, where a novel Tucker decomposition layer (TDL) decomposes a latent image representation into a set of projection matrices and a core tensor. By changing the rank of the core tensor and its quantization, we can easily adjust the bpp rate of the latent image representation within a single CNN. Furthermore, an iterative non-uniform quantization scheme is presented to optimize the quantizer, and a coarse-to-fine training strategy is introduced to reconstruct the decompressed images. Extensive experiments demonstrate the state-of-the-art compression performance of TDNet in terms of both PSNR and MS-SSIM indices.
KW - convolutional neural networks
KW - lossy image compression
KW - Tucker decomposition
UR - http://www.scopus.com/inward/record.url?scp=85079619123&partnerID=8YFLogxK
U2 - 10.1109/TIP.2020.2963956
DO - 10.1109/TIP.2020.2963956
M3 - Journal article
SN - 1057-7149
VL - 29
SP - 3612
EP - 3625
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 8954947
ER -