TY - GEN
T1 - SIMD-efficient loop unrolling design for embedded multimedia applications
AU - Dai, Yunyang
AU - Li, Qing
AU - Zhang, Qi
AU - Kuo, C.-C. Jay
PY - 2004/12/1
Y1 - 2004/12/1
N2 - Due to the rising complexity of modern embedded media applications (EMAs), compilers must be able to exploit superword-level parallelism (SLP). This work analyzes the memory access patterns found in EMAs and presents a scheme for calculating the loop unrolling factor that fully exploits these patterns to generate efficient Single Instruction Multiple Data (SIMD) instructions. The loop nest is also examined for its actual memory access patterns, which can be used to improve the efficiency of the compiler. In manual experiments on the TriMedia TM-1300 processor with an H.264 encoding application, we observe an average performance improvement of 12 times.
AB - Due to the rising complexity of modern embedded media applications (EMAs), compilers must be able to exploit superword-level parallelism (SLP). This work analyzes the memory access patterns found in EMAs and presents a scheme for calculating the loop unrolling factor that fully exploits these patterns to generate efficient Single Instruction Multiple Data (SIMD) instructions. The loop nest is also examined for its actual memory access patterns, which can be used to improve the efficiency of the compiler. In manual experiments on the TriMedia TM-1300 processor with an H.264 encoding application, we observe an average performance improvement of 12 times.
UR - http://www.scopus.com/inward/record.url?scp=11244299431&partnerID=8YFLogxK
M3 - Conference article published in proceeding or book
AN - SCOPUS:11244299431
SN - 0780386035
SN - 9780780386037
T3 - 2004 IEEE International Conference on Multimedia and Expo (ICME)
SP - 1851
EP - 1854
BT - 2004 IEEE International Conference on Multimedia and Expo (ICME)
T2 - 2004 IEEE International Conference on Multimedia and Expo (ICME)
Y2 - 27 June 2004 through 30 June 2004
ER -