Space-to-speed architecture supporting acceleration on VHR image processing

Shenlu Jiang, Yuliya Tarabalka, Wei Yao, Zhonghua Hong, Guofu Feng

Research output: Journal article publication, Journal article, Academic research, peer-review

3 Citations (Scopus)


One of the major focuses in the remote sensing community is the rapid processing of deep neural networks (DNNs) on very high resolution (VHR) aerial images. Few studies have investigated the acceleration of training and prediction by optimizing the architecture of the DNN system rather than designing a lightweight DNN. Parallel processing using multiple graphics processing units (GPUs) increases VHR image processing performance, but it drives extremely large and frequent data transfers (input/output, I/O) from random access memory (RAM) to GPU memory. The resulting system bus congestion causes the system to hang, leading to long latency in training and prediction. In this paper, we evaluate the causes of long latency and propose a space-to-speed (S2S) DNN system to overcome these challenges. A three-level memory system that aims to reduce data transfer during system operation is presented. Distribution optimization with parallel processing was used to accelerate training. Training optimizations for VHR images (such as hot-zone searching and image/ground truth queues for data saving) were used to train on the VHR images efficiently. Inference optimization was performed to speed up prediction in the release mode. To verify the efficiency of the proposed system, we used aerial image labeling from the Institut National de Recherche en Informatique et en Automatique (INRIA) and benchmarks from the Massachusetts Institute of Technology Aerial Imagery for Roof Segmentation (MITAIRS) to test the system performance and accuracy. Without loss of accuracy, the S2S system running on eight GPUs in a normal setting reduced the prediction time on the testing dataset for both the INRIA dataset (from 534 to 72 s) and the MITAIRS dataset (from 818 to 120 s). With prediction in half-float (using float-16 data), 8-GPU parallel processing reduced the prediction time to 38 s on the INRIA dataset and 83 s on the MITAIRS dataset. In a pressure test on 18,000 images with a size of 5000 × 5000, our proposed system reduced the processing time from 18.2 h to 1.8 h with prediction in full-float (using float-32 data) and to 43 min with prediction in half-float, increasing the speed by factors of 9.78 and 25.3, respectively, compared with system runs without optimization.

Original language: English
Pages (from-to): 30-44
Number of pages: 15
Journal: ISPRS Journal of Photogrammetry and Remote Sensing
Publication status: Published - Apr 2023


Keywords

  • Building segmentation
  • Deep neural networks (DNNs)
  • Space-to-speed architecture
  • Very high-resolution (VHR) aerial images

ASJC Scopus subject areas

  • Atomic and Molecular Physics, and Optics
  • Engineering (miscellaneous)
  • Computer Science Applications
  • Computers in Earth Sciences

