Concurrent Solutions to Linear Systems using Hybrid CPU/GPU Nodes

Date

2015-06-09

Citation of Original Publication

Oluwapelumi Adenikinju, Julian Gilyard, Joshua Massey, Thomas Stitt, Matthias K. Gobbert, Concurrent Solutions to Linear Systems using Hybrid CPU/GPU Nodes, SIAM Undergraduate Research Online (SIURO), Volume 8, http://dx.doi.org/10.1137/15S013776

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

We investigate parallel solutions to linear systems, with the global illumination problem in computer graphics as the application focus. An existing serial CPU implementation using the radiosity method serves as the performance baseline, where a scene and the corresponding form-factor coefficients are provided. The initial computational radiosity solver uses the basic Jacobi method with a fixed iteration count as an iterative approach to solving the radiosity linear system. We add the option of using the modern BiCG-STAB method with the aim of reducing runtime for complex problems. We find that, for the test scenes used, the problem complexity was not great enough to take advantage of the mathematical reformulation through BiCG-STAB. Single-node parallelization techniques are implemented through OpenMP-based multi-threading, GPU offloading using CUDA, and hybrid multi-threading/GPU offloading. In general, OpenMP is optimal because it requires no expensive memory transfers. Finally, we investigate two storage schemes for the system to determine whether storage through arrays of structures or structures of arrays results in better performance. We find that the use of arrays of structures in conjunction with OpenMP results in the best performance except for small scene sizes, where CUDA shows the minimal runtime.
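
As an illustration of the Jacobi method with a fixed iteration count mentioned above, the following is a minimal serial sketch in Python. It is not the authors' implementation: the function name `jacobi`, the dense list-of-lists storage, and the small 3x3 test system are assumptions chosen only for this example, and a radiosity solver would instead assemble the system from the scene's form-factor coefficients.

```python
def jacobi(A, b, num_iters):
    """Approximate the solution of A x = b with a fixed number of Jacobi sweeps.

    A is a dense square matrix (list of rows) with nonzero diagonal entries;
    convergence is assumed here (e.g. A strictly diagonally dominant).
    """
    n = len(b)
    x = [0.0] * n  # initial guess (an assumption of this sketch)
    for _ in range(num_iters):
        x_new = [0.0] * n
        for i in range(n):
            # Sum of off-diagonal contributions, using only the previous iterate.
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x_new[i] = (b[i] - off_diag) / A[i][i]
        x = x_new  # every component is replaced at once, after the full sweep
    return x


# Hypothetical diagonally dominant test system whose exact solution is [1, 1, 1].
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = jacobi(A, b, 50)
```

Because each component of the new iterate depends only on the previous iterate, the loop over `i` has no dependencies between its iterations, which is the property that makes the method a natural candidate for multi-threading or GPU offloading.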