Parallel Performance Studies for an Elliptic Test Problem on the Stampede2 Cluster and Comparison of Networks

Date

2018

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless it is under a Creative Commons license, contact the copyright holder or the author for uses protected by Copyright Law.

Abstract

We study the parallel performance of dual-socket compute nodes with Intel Xeon Platinum 8160 Skylake CPUs, each with 24 cores, and 192 GB of memory per node, connected by a 100 Gbps Intel Omni-Path (OPA) interconnect. The experiments use the classical test problem of a Poisson equation in two spatial dimensions, discretized by the finite difference method to give a very large and sparse system of linear equations that is solved by the conjugate gradient method. The tests are performed on the Skylake nodes of Stampede2 at the Texas Advanced Computing Center (TACC) at The University of Texas at Austin. This national supercomputer is funded by the National Science Foundation (NSF) and can be accessed through the XSEDE program. We also compare the performance of the test code using different inter-node networks, namely OPA, InfiniBand (IB), and Ethernet, on test clusters graciously provided to us by Dell. The results demonstrate excellent scalability when using more nodes, due to the low latency of the high-performance interconnect, and good speedup when using all cores of the multi-core CPUs. Comparison to past results shows that core-per-core performance improvements have stalled, but that node-per-node performance continues to improve due to the larger number of cores available on a node.
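
To make the test problem concrete, the following is a minimal serial sketch, not the code used in the study. It assumes the Poisson problem -Δu = f on the unit square with zero Dirichlet boundary values, an N-by-N interior mesh with spacing h = 1/(N+1), the standard 5-point finite difference stencil, and an unpreconditioned, matrix-free conjugate gradient iteration. The manufactured solution u = sin²(πx) sin²(πy), the tolerance, and all function names are illustrative assumptions and are not taken from the abstract.

```python
import math


def apply_stencil(u, n):
    """Return A*u for the 5-point stencil (4 on the diagonal, -1 per mesh
    neighbor) without storing the sparse matrix A. Unknowns are ordered
    k = i + n*j; neighbors outside the mesh are zero boundary values."""
    v = [0.0] * (n * n)
    for j in range(n):
        for i in range(n):
            k = i + n * j
            t = 4.0 * u[k]
            if i > 0:
                t -= u[k - 1]
            if i < n - 1:
                t -= u[k + 1]
            if j > 0:
                t -= u[k - n]
            if j < n - 1:
                t -= u[k + n]
            v[k] = t
    return v


def dot(a, b):
    """Euclidean inner product of two vectors stored as lists."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(b, n, tol=1e-8, max_iter=10000):
    """Solve A*u = b by the conjugate gradient method, starting from u = 0.
    Stops when the residual norm drops below tol times its initial value."""
    u = [0.0] * (n * n)
    r = list(b)            # residual b - A*u for the zero initial guess
    p = list(r)            # first search direction
    rr = dot(r, r)
    stop = tol * math.sqrt(rr)
    iterations = 0
    while math.sqrt(rr) > stop and iterations < max_iter:
        q = apply_stencil(p, n)
        alpha = rr / dot(p, q)
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        iterations += 1
    return u, iterations


def solve_poisson(n):
    """Solve the assumed test problem on an n-by-n interior mesh and return
    (maximum error against the manufactured solution, CG iterations)."""
    h = 1.0 / (n + 1)
    pi = math.pi
    b = [0.0] * (n * n)
    exact = [0.0] * (n * n)
    for j in range(n):
        for i in range(n):
            x, y = (i + 1) * h, (j + 1) * h
            sx2, sy2 = math.sin(pi * x) ** 2, math.sin(pi * y) ** 2
            # f = -Laplacian(u) for u = sin^2(pi x) sin^2(pi y)
            f = (-2.0 * pi**2 * math.cos(2.0 * pi * x) * sy2
                 - 2.0 * pi**2 * sx2 * math.cos(2.0 * pi * y))
            b[i + n * j] = h * h * f      # right-hand side scaled by h^2
            exact[i + n * j] = sx2 * sy2
    u, iterations = conjugate_gradient(b, n)
    error = max(abs(ui - ei) for ui, ei in zip(u, exact))
    return error, iterations


if __name__ == "__main__":
    for n in (15, 31):
        error, iterations = solve_poisson(n)
        print(f"N = {n:3d}  iterations = {iterations:4d}  max error = {error:.3e}")
```

In a distributed-memory version of such a solver, the stencil application would typically require exchanging boundary rows between neighboring processes and each inner product would require a global reduction in every iteration, which is one plausible reason why interconnect latency matters for this kind of test.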