Competitiveness of a Non-Linear Block-Space GPU Thread Map for Simplex Domains

Resource Type: Periodical
Authors: Navarro, C.A.; Vernier, M.; Bustos, B.; Hitschfeld, N.
Source: IEEE Transactions on Parallel and Distributed Systems (IEEE Trans. Parallel Distrib. Syst.), 29(12):2728-2741, Dec. 2018
Subject: Computing and Processing; Communication, Networking and Broadcast Technologies; Graphics processing units; Instruction sets; Symmetric matrices; Computer architecture; Optimization; Programming; GPU thread mapping; block-space; simplex domains; GPU optimization
Language
ISSN: 1045-9219; 1558-2183; 2161-9883

Abstract

This work presents and studies the efficiency problem of mapping GPU threads onto simplex domains. A non-linear map $\lambda(\omega)$ is formulated based on a block-space enumeration principle that reduces the number of thread-blocks by a factor of approximately $2\times$ for 2-simplex domains and $6\times$ for 3-simplex domains, compared to the standard approach. Performance results show that $\lambda(\omega)$ is competitive, and is even the fastest map when run on recent GPU architectures such as the Tesla V100, where it reaches a speedup of up to $1.5\times$ in 2-simplex tests. In 3-simplex tests, it reaches a speedup of up to $2.3\times$ for small workloads and up to $1.25\times$ for larger ones. The results make $\lambda(\omega)$ a useful GPU optimization technique, with applications to parallel problems that define all-pairs, all-triplets or nearest-neighbor interactions in a 2-simplex or 3-simplex domain.
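The block-space enumeration idea can be illustrated with a small host-side sketch. It is not taken from the paper: it assumes the common triangular-number inversion as a stand-in for a 2-simplex map, and the names `block_map_2simplex` and `block_counts` are hypothetical. The paper's actual $\lambda(\omega)$ and its GPU implementation may differ. The sketch also counts blocks for a bounding-box launch versus a simplex-only launch, which is where the approximate $2\times$ and $6\times$ factors come from.

```python
from math import isqrt


def block_map_2simplex(w):
    """Map a linear block index w to (x, y) with 0 <= x <= y.

    Illustrative assumption: row y of the triangle starts at the
    triangular number y*(y+1)/2, so y is found by inverting it.
    """
    y = (isqrt(8 * w + 1) - 1) // 2  # largest y with y*(y+1)/2 <= w
    x = w - y * (y + 1) // 2         # offset inside row y
    return x, y


def block_counts(n):
    """Blocks launched for n blocks per side: bounding box vs. simplex."""
    box_2d = n * n                        # full square grid
    simplex_2d = n * (n + 1) // 2         # triangle only
    box_3d = n ** 3                       # full cubic grid
    simplex_3d = n * (n + 1) * (n + 2) // 6  # tetrahedron only
    return box_2d, simplex_2d, box_3d, simplex_3d


if __name__ == "__main__":
    n = 64
    box_2d, simplex_2d, box_3d, simplex_3d = block_counts(n)
    print("2-simplex block reduction:", box_2d / simplex_2d)
    print("3-simplex block reduction:", box_3d / simplex_3d)
    print("first blocks:", [block_map_2simplex(w) for w in range(6)])
```

Integer square root is used here so the inversion stays exact for large indices; a GPU kernel would typically rely on a floating-point square root instead, which is one reason the map's cost depends on the architecture.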
