Efficient video super-resolution via hierarchical temporal residual networks

Zhi Song Liu, Wan Chi Siu, Yui Lam Chan

Research output: Journal article publication › Journal article › Academic research › peer-review

3 Citations (Scopus)


Video super-resolution (SR) is more challenging than image super-resolution because of its demanding computation time. To enlarge a low-resolution video, the temporal relationship among frames must be fully exploited. We can model video SR as a multi-frame SR problem and use deep learning methods to estimate the spatial and temporal information. This paper proposes a lighter residual network based on multi-stage back projection for multi-frame SR. We improve the back-projection-based residual block by adding weights for adaptive feature tuning, and add global and local connections to explore deeper feature representations. We jointly learn spatial-temporal feature maps by using the proposed Spatial Convolution Packing scheme as an attention mechanism to extract more information from both the spatial and temporal domains. Unlike other methods, our proposed network takes multiple low-resolution frames as input and produces multiple super-resolved frames simultaneously. We then further improve video SR quality with self-ensemble enhancement to handle videos with different motions and distortions. Extensive experiments show that our proposed approaches give a large improvement over other state-of-the-art video SR methods. Compared with recent CNN-based video SR works, our approaches save up to 60% of computation time and achieve a 0.6 dB PSNR improvement.
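As a reading aid for the reported 0.6 dB PSNR gain, the following is a minimal sketch of the standard PSNR definition, not code from the paper. It assumes 8-bit frames stored as nested lists of pixel values, and the helper names (`mse`, `psnr`, `mse_ratio_for_gain`) are hypothetical, chosen only for illustration.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized 2-D frames (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            total += (ref_px - est_px) ** 2
            count += 1
    return total / count


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit pixel values."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by `gain_db` decibels."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # Toy 2x2 frames, purely illustrative values.
    ground_truth = [[100, 110], [120, 130]]
    restored = [[101, 109], [122, 130]]
    print(psnr(ground_truth, restored))
    print(mse_ratio_for_gain(0.6))
```

Because PSNR is logarithmic in the mean squared error, a 0.6 dB gain at a fixed peak value corresponds to an MSE roughly 13% lower than the baseline.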

Original language: English
Article number: 9490661
Pages (from-to): 106049-106064
Number of pages: 16
Journal: IEEE Access
Publication status: Published - Jul 2021


Keywords

  • Deep learning
  • Hierarchical structure
  • Residual network
  • Super-resolution
  • Video

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering

