Copyrights to these papers may be held by the publishers. The download files are preprints. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Mark Blanco, Tze-Meng Low and Kyungjoo Kim (Proc. High Performance Extreme Computing (HPEC), 2019)
Exploration of Fine-Grained Parallelism for Load Balancing Eager K-truss on GPU and CPU
Preprint (735 KB)
In this work we present a performance exploration on Eager K-truss, a linear-algebraic formulation of the K-truss graph algorithm. We address performance issues related to load imbalance of parallel tasks in symmetric, triangular graphs by presenting a fine-grained parallel approach to executing the support computation. This approach also increases available parallelism, making it amenable to GPU execution. We demonstrate our fine-grained parallel approach using implementations in Kokkos and evaluate them on an Intel Skylake CPU and an Nvidia Tesla V100 GPU. Overall, we observe a 1.26-1.48x improvement on the CPU and a 9.97-16.92x improvement on the GPU due to our fine-grained parallel formulation.
Keywords: Linear algebra, Parallel processing, CPUs, High performance, GPUs, Performance portable, Algorithm, K-truss, Graph-algorithms, Kokkos, Eager K-truss
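The idea of splitting the support computation into finer tasks can be illustrated with a small sequential sketch. This is not the authors' Eager K-truss formulation or their Kokkos code; it only assumes that a coarse-grained scheme assigns one task per row of the upper-triangular adjacency structure, while a fine-grained scheme assigns one task per edge. The function names (`build_upper_adjacency`, `row_task_sizes`, `support_fine_grained`) are hypothetical and chosen for this example.

```python
from collections import defaultdict


def build_upper_adjacency(edges):
    """Keep only the upper-triangular part: for each u, the sorted neighbors v > u."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        a, b = (u, v) if u < v else (v, u)
        adj[a].add(b)
    return {u: sorted(vs) for u, vs in adj.items()}


def row_task_sizes(adj):
    """Edges handled by each row task in a coarse-grained (per-row) scheme.

    Widely varying sizes are one source of load imbalance.
    """
    return {u: len(vs) for u, vs in adj.items()}


def support_fine_grained(edges):
    """Count, for every edge, the number of triangles that contain it.

    Each (u, v) iteration is an independent unit of work (one task per edge).
    In a real parallel implementation the three increments below would need
    atomic updates; here the loop runs sequentially.
    """
    adj = build_upper_adjacency(edges)
    tasks = [(u, v) for u in sorted(adj) for v in adj[u]]
    support = {e: 0 for e in tasks}
    for u, v in tasks:
        # Every w in both lists satisfies u < v < w, so each triangle
        # is discovered exactly once, at the task for its smallest edge.
        common = set(adj[u]) & set(adj.get(v, []))
        for w in common:
            support[(u, v)] += 1
            support[(u, w)] += 1
            support[(v, w)] += 1
    return support


if __name__ == "__main__":
    # A triangle (0, 1, 2) with one pendant edge (2, 3).
    example = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(row_task_sizes(build_upper_adjacency(example)))
    print(support_fine_grained(example))
```

With per-edge tasks, the number of work units equals the number of edges rather than the number of vertices, which is the sense in which a finer decomposition exposes more parallelism and evens out the work per task.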