An efficient sorting algorithm with CUDA

Shifu Chen, Jing Qin, Yongming Xie, Junping Zhao, Pheng Ann Heng

Research output: Journal article publication › Journal article › Academic research › peer-review

10 Citations (Scopus)


This paper proposes an efficient GPU-based sorting algorithm together with a merging method for graphics devices. The sorting algorithm is optimized for modern GPU architectures and can sort elements represented as integers, floats, or structures, while the new merging method merges two ordered lists efficiently on the GPU without using slow atomic functions or uncoalesced memory reads. Adaptive strategies are used for unordered or nearly sorted lists, and for large or small lists. The current implementation targets NVIDIA CUDA with multi-GPU support and is being migrated to the newly introduced Open Computing Language (OpenCL). Extensive experiments demonstrate that our algorithm performs better than previous GPU-based sorting algorithms and can support real-time applications.
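The abstract does not describe how the merging method works, so the following is only a generic illustration of how two ordered lists can be merged without atomic operations. It is not the authors' algorithm. It uses a well-known rank-based merge: each element finds its final output index by binary search in the other list, so every write goes to a distinct slot and no synchronization is needed. The names `rank_lower`, `rank_upper`, and `merge_by_rank` are hypothetical, and the sketch is written as sequential C++ in which each loop iteration stands in for the work one GPU thread would do. It does not address memory coalescing.

```cpp
#include <cstddef>
#include <vector>

// Count of elements in sorted `v` that are strictly less than `key`.
static std::size_t rank_lower(const std::vector<int>& v, int key) {
    std::size_t lo = 0, hi = v.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (v[mid] < key) lo = mid + 1; else hi = mid;
    }
    return lo;
}

// Count of elements in sorted `v` that are less than or equal to `key`.
static std::size_t rank_upper(const std::vector<int>& v, int key) {
    std::size_t lo = 0, hi = v.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (v[mid] <= key) lo = mid + 1; else hi = mid;
    }
    return lo;
}

// Hypothetical rank-based merge of two sorted lists.
// Every iteration below is independent of the others, which is what
// would let one GPU thread handle one element without atomics.
std::vector<int> merge_by_rank(const std::vector<int>& a,
                               const std::vector<int>& b) {
    std::vector<int> out(a.size() + b.size());
    // a[i] lands after the i earlier elements of a and after every
    // element of b that is strictly smaller.
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i + rank_lower(b, a[i])] = a[i];
    // b[j] lands after the j earlier elements of b and after every
    // element of a that is smaller or equal, so ties place a first
    // and no two elements share an output slot.
    for (std::size_t j = 0; j < b.size(); ++j)
        out[j + rank_upper(a, b[j])] = b[j];
    return out;
}
```

Using a strict comparison for one list and a non-strict comparison for the other is the design choice that keeps output indices unique when the two lists contain equal values.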
Original language: English
Pages (from-to): 915-921
Number of pages: 7
Journal: Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A/Chung-kuo Kung Ch'eng Hsuch K'an
Issue number: 7
Publication status: Published - 1 Jan 2009
Externally published: Yes


Keywords

  • CUDA
  • Parallel merging
  • Parallel sorting

ASJC Scopus subject areas

  • Engineering(all)

