Learning a Single Tucker Decomposition Network for Lossy Image Compression with Multiple Bits-Per-Pixel Rates

Jianrui Cai, Zisheng Cao, Lei Zhang (Corresponding Author)

Research output: Journal article publication, Journal article, Academic research, peer-review

29 Citations (Scopus)

Abstract

Lossy image compression (LIC), which aims to utilize inexact approximations to represent an image more compactly, is a classical problem in image processing. Recently, deep convolutional neural networks (CNNs) have achieved interesting results in LIC by learning an encoder-quantizer-decoder network from a large amount of data. However, existing CNN-based LIC methods generally train one network for each specific bits-per-pixel (bpp) rate. This 'one-network-per-bpp' problem limits the generality and flexibility of CNNs in practical LIC applications. In this paper, we propose to learn a single CNN which can perform LIC at multiple bpp rates. A simple yet effective Tucker Decomposition Network (TDNet) is developed, which includes a novel Tucker decomposition layer (TDL) that decomposes a latent image representation into a set of projection matrices and a core tensor. By changing the rank of the core tensor and its quantization, we can easily adjust the bpp rate of the latent image representation within a single CNN. Furthermore, an iterative non-uniform quantization scheme is presented to optimize the quantizer, and a coarse-to-fine training strategy is introduced to reconstruct the decompressed images. Extensive experiments demonstrate the state-of-the-art compression performance of TDNet in terms of both PSNR and MS-SSIM indices.
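
To illustrate the idea of splitting a tensor into projection matrices and a core tensor, and of trading accuracy for size by lowering the core rank, here is a minimal NumPy sketch of a truncated higher-order SVD (one standard way to compute a Tucker decomposition). It is not the paper's TDL, which is a learned network layer; the function names, the random stand-in for a latent representation, and the chosen ranks are all assumptions made for illustration, and quantization and training are not shown.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def truncated_hosvd(tensor, ranks):
    """Return (core, factors) with one orthonormal factor matrix per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of the unfolding span the mode subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, factor in enumerate(factors):
        # Project each mode onto its subspace to obtain the core tensor.
        core = mode_product(core, factor.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Expand the core back to the original shape using the factor matrices."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out

# Hypothetical stand-in for a latent representation (channels x height x width).
rng = np.random.default_rng(0)
latent = rng.standard_normal((8, 6, 4))

for ranks in [(8, 6, 4), (4, 3, 2)]:
    core, factors = truncated_hosvd(latent, ranks)
    approx = reconstruct(core, factors)
    stored = core.size + sum(f.size for f in factors)
    error = np.linalg.norm(latent - approx) / np.linalg.norm(latent)
    print(ranks, "stored values:", stored, "relative error:", error)
```

Lower ranks give a smaller core tensor to store at the cost of a larger reconstruction error, which is the same knob the abstract describes for adjusting the bpp rate, although here it is applied to a fixed random tensor rather than to learned features.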
Original language: English
Article number: 8954947
Pages (from-to): 3612-3625
Number of pages: 14
Journal: IEEE Transactions on Image Processing
Volume: 29
Publication status: Published - Jan 2020

Keywords

  • Convolutional neural networks
  • Lossy image compression
  • Tucker decomposition
