Abstract
This paper proposes an efficient GPU-based sorting algorithm together with a merging method for graphics devices. The sorting algorithm is optimized for modern GPU architectures and can sort elements represented by integers, floats, and structures, while the new merging method merges two ordered lists efficiently on the GPU without using slow atomic functions or uncoalesced memory reads. Adaptive strategies are used for disordered or nearly sorted lists and for large or small lists. The current implementation uses NVIDIA CUDA with multi-GPU support and is being migrated to the newly introduced Open Computing Language (OpenCL). Extensive experiments demonstrate that the algorithm outperforms previous GPU-based sorting algorithms and can support real-time applications.
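
The abstract does not spell out the merging method, so the following is only an illustrative sketch of one well-known way to merge two ordered lists without atomic operations, not the paper's algorithm. It assumes a rank-based scheme: each element finds its output slot by binary search in the other list, so every (simulated) thread writes to a distinct position and no synchronization is needed. The function name `rank_merge` is hypothetical, and the sketch says nothing about memory coalescing, which depends on the actual GPU data layout.

```python
from bisect import bisect_left, bisect_right


def rank_merge(a, b):
    """Merge two sorted lists by computing every output slot independently.

    Illustrative assumption: each loop iteration stands in for one GPU thread.
    """
    out = [None] * (len(a) + len(b))

    for i, x in enumerate(a):
        # Elements of b strictly smaller than x must precede it.
        out[i + bisect_left(b, x)] = x

    for j, y in enumerate(b):
        # Elements of a smaller than or equal to y precede it,
        # so on ties the element from a is placed first.
        out[j + bisect_right(a, y)] = y

    return out


if __name__ == "__main__":
    print(rank_merge([1, 3, 3, 5], [2, 3, 6]))
```

Using `bisect_left` for one list and `bisect_right` for the other breaks ties consistently, which is what guarantees that no two elements are assigned the same output slot.
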
Original language | English |
---|---|
Pages (from-to) | 915-921 |
Number of pages | 7 |
Journal | Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A/Chung-kuo Kung Ch'eng Hsuch K'an |
Volume | 32 |
Issue number | 7 |
Publication status | Published - 1 Jan 2009 |
Externally published | Yes |
Keywords
- CUDA
- Parallel merging
- Parallel sorting
ASJC Scopus subject areas
- General Engineering