TY - GEN
T1 - An efficient method for optimizing PETSc on the Sunway TaihuLight system
AU - Kang, Letian
AU - Wang, Zhi Jie
AU - Quan, Zhe
AU - Wu, Weigang
AU - Guo, Song
AU - Li, Kenli
AU - Li, Keqin
PY - 2018/12/4
Y1 - 2018/12/4
N2 - High-performance computing platforms can bring great benefits in processing various ubiquitous computing tasks. The Sunway TaihuLight supercomputer is a novel high-performance computing platform, which is ranked No. 1 on the TOP500 list. In this paper, we focus on how to optimize the Portable and Extensible Toolkit for Scientific Computation (PETSc) running on supercomputers. The main motivations for this study are twofold: (i) PETSc is widely and frequently used in many scientific research fields such as biology, fusion, artificial intelligence, and geosciences; and (ii) the current nuclear PETSc does not fully utilize the potential of the Sunway TaihuLight system, especially its powerful processor, i.e., the SW26010 processor. To achieve high efficiency of PETSc, the central idea of our optimizations is to improve the performance of time-consuming and frequently used computation components (e.g., matrix and vector modules). To this end, we propose (i) accelerating kernel codes with computing processing elements (CPEs), for which a new compression format and targeted optimizations for vector and matrix operations are devised; and (ii) using more efficient memory access schemes. We have implemented our proposals and evaluated their effectiveness and efficiency through a real-world application, Structural Finite Element Analysis (SFEA). We obtain a 16 to 32 times speedup for a single SW26010 processor. As an extra finding, the results also show high scalability on over 8,000 computing nodes, i.e., 532,500 cores.
AB - High-performance computing platforms can bring great benefits in processing various ubiquitous computing tasks. The Sunway TaihuLight supercomputer is a novel high-performance computing platform, which is ranked No. 1 on the TOP500 list. In this paper, we focus on how to optimize the Portable and Extensible Toolkit for Scientific Computation (PETSc) running on supercomputers. The main motivations for this study are twofold: (i) PETSc is widely and frequently used in many scientific research fields such as biology, fusion, artificial intelligence, and geosciences; and (ii) the current nuclear PETSc does not fully utilize the potential of the Sunway TaihuLight system, especially its powerful processor, i.e., the SW26010 processor. To achieve high efficiency of PETSc, the central idea of our optimizations is to improve the performance of time-consuming and frequently used computation components (e.g., matrix and vector modules). To this end, we propose (i) accelerating kernel codes with computing processing elements (CPEs), for which a new compression format and targeted optimizations for vector and matrix operations are devised; and (ii) using more efficient memory access schemes. We have implemented our proposals and evaluated their effectiveness and efficiency through a real-world application, Structural Finite Element Analysis (SFEA). We obtain a 16 to 32 times speedup for a single SW26010 processor. As an extra finding, the results also show high scalability on over 8,000 computing nodes, i.e., 532,500 cores.
KW - High Performance Computing
KW - PETSc
KW - SW26010 processor
KW - TaihuLight supercomputer
UR - http://www.scopus.com/inward/record.url?scp=85060297922&partnerID=8YFLogxK
U2 - 10.1109/SmartWorld.2018.00115
DO - 10.1109/SmartWorld.2018.00115
M3 - Conference article published in proceeding or book
AN - SCOPUS:85060297922
T3 - Proceedings - 2018 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018
SP - 538
EP - 545
BT - Proceedings - 2018 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018
A2 - Loulergue, Frederic
A2 - Wang, Guojun
A2 - Bhuiyan, Md Zakirul Alam
A2 - Ma, Xiaoxing
A2 - Li, Peng
A2 - Roveri, Manuel
A2 - Han, Qi
A2 - Chen, Lei
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE SmartWorld, 15th IEEE International Conference on Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018
Y2 - 7 October 2018 through 11 October 2018
ER -