Detection and GPU acceleration of 3D FDTD algorithms based on memory access patterns
- Resource Type
- Conference
- Authors
- Ran Shao; Linton, David; Spence, Ivor; Milligan, Peter; Ning Zheng
- Source
- Proceedings 2013 International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC), pp. 2520-2526, Dec. 2013
- Subject
- Aerospace
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Finite difference methods
Time-domain analysis
Graphics processing units
Memory management
System-on-chip
Acceleration
Three-dimensional displays
FDTD
Memory access pattern
LLVM
CUDA 5.0
- Language
A semi-automatic tool is reported that first analyzes a sequential FDTD program to obtain its memory access patterns and related features, and then optimizes the program by combining several types of CUDA memory on both Fermi and Kepler architecture GPUs. Experiments show speedups of 13% on Fermi and 18% on Kepler GPUs relative to the unoptimized GPU version of the program. A speedup of up to 142 times is achieved over the sequential FDTD C program for a 3D FDTD mesh of 250 × 250 × 250 cells (15.625 million mesh cells) with 10-layer CPML boundary conditions over 4096 time steps.
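To make "memory access pattern" concrete, here is a minimal C sketch of one sequential FDTD field update of the kind such a tool might analyze. It is not taken from the paper. The names `idx`, `mesh_cells`, and `update_ex` are hypothetical, as are the row-major layout, the folding of all material and grid constants into one `coeff`, and the omission of the CPML layers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flatten (i, j, k) into a row-major offset,
   with k as the fastest-varying index. */
size_t idx(size_t i, size_t j, size_t k, size_t ny, size_t nz)
{
    return (i * ny + j) * nz + k;
}

/* Hypothetical helper: total number of cells in an nx * ny * nz mesh. */
size_t mesh_cells(size_t nx, size_t ny, size_t nz)
{
    return nx * ny * nz;
}

/* One sweep of the standard Yee update for Ex in a uniform medium:
   Ex += coeff * ((Hz[i][j][k] - Hz[i][j-1][k]) - (Hy[i][j][k] - Hy[i][j][k-1])).
   Assumption: no CPML terms, and all constants folded into coeff. */
void update_ex(float *ex, const float *hy, const float *hz,
               size_t nx, size_t ny, size_t nz, float coeff)
{
    assert(nx > 0 && ny > 1 && nz > 1);
    for (size_t i = 0; i < nx; ++i) {
        for (size_t j = 1; j < ny; ++j) {
            for (size_t k = 1; k < nz; ++k) {
                size_t c = idx(i, j, k, ny, nz);
                /* Reads relative to c: offset 0, offset -nz (neighbour in j),
                   and offset -1 (neighbour in k). */
                ex[c] += coeff * ((hz[c] - hz[c - nz]) - (hy[c] - hy[c - 1]));
            }
        }
    }
}
```

In this sketch every cell reads `hz` at offsets 0 and `-nz` and `hy` at offsets 0 and `-1`. Fixed-offset stencil reads like these are the sort of regularity that could guide the choice of GPU memory type for each array. With the dimensions from the abstract, `mesh_cells(250, 250, 250)` evaluates to 15,625,000, matching the stated 15.625 million cells.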