Loop transforming for reducing data alignment on multi-core SIMD processors

Yi Wang, Linfeng Pan, Zili Shao, Yong Guan, Minyi Guo

Research output: Journal article publicationJournal articleAcademic researchpeer-review


Multimedia SIMD extensions are commonly employed today to speed up media processing. When performing vectorization for SIMD architectures, one of the major issues is to handle the problem of memory alignment. Prior study focused on either vectorizing loops with all memory references being properly aligned, or introducing extra operations to deal with the misaligned memory references. On the other hand, multi-core SIMD architectures require coarse-grain parallelism. Therefore, it is an important problem to study how to parallelize and vectorize loop nests with the awareness of data misalignments. This paper presents a loop transformation scheme that maximizes the parallelism of outermost loops, while the misaligned memory references in innermost loops are reduced. The basic idea of our technique is to align each level of loops in the nest, considering the constraint of dependence relations. To reduce the data misalignments, we establish a mathematical model with a concept of offset-collection and propose an effective heuristic algorithm. For coarser-grain parallelism, we propose some rules to analyze the outermost loop. When transformations are applied, the inner loops are involved to maximize the parallelism. To avoid introducing more data misalignments, the involved innermost loop is handled from other levels of loops. Experimental results show that 7 % to 37 % (on average 18.4 %) misaligned memory references can be reduced. The simulations on CELL show that 1.1x speedup can be reached by reducing the misaligned data, while 6.14x speedup can be achieved by enhancing the parallelism for multi-core.
Original languageEnglish
Pages (from-to)137-150
Number of pages14
JournalJournal of Signal Processing Systems
Issue number2
Publication statusPublished - 1 Feb 2014


  • Data Alignment
  • Loop Transformation
  • Multi-Processors
  • SIMD Architecture

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Theoretical Computer Science
  • Signal Processing
  • Information Systems
  • Modelling and Simulation
  • Hardware and Architecture

Cite this