Parallel Performance Studies for an Elliptic Test Problem on the Cluster tara

dc.contributor.author: Raim, Andrew M.
dc.contributor.author: Gobbert, Matthias K.
dc.date.accessioned: 2018-10-25T14:08:42Z
dc.date.available: 2018-10-25T14:08:42Z
dc.date.issued: 2010
dc.description.abstract: The performance of parallel computer code depends on an intricate interplay of the processors, the architecture of the compute nodes, their interconnect network, the numerical algorithm, and its implementation. The solution of large, sparse, highly structured systems of linear equations by an iterative linear solver that requires communication between the parallel processes at every iteration is an instructive test of this interplay. This note considers the classical elliptic test problem of a Poisson equation with Dirichlet boundary conditions in two spatial dimensions, whose approximation by the finite difference method results in a linear system of this type. Our existing implementation of the conjugate gradient method for the iterative solution of this system is known to have the potential to perform well up to many parallel processes, provided the interconnect network has low latency. Since the algorithm is known to be memory bound, it is also vital for good performance that the architecture of the nodes in conjunction with the scheduling policy does not create a bottleneck. The results presented here show excellent performance on the cluster tara with up to 512 parallel processes when using 64 compute nodes. The results support the scheduling policy implemented, since they confirm that it is beneficial to use all eight cores of the two quad-core processors on each node simultaneously, giving us in effect a computer that can run jobs efficiently with up to 656 parallel processes when using all 82 compute nodes. The cluster tara is an IBM Server x iDataPlex purchased in 2009 by the UMBC High Performance Computing Facility (www.umbc.edu/hpcf). It is an 86-node distributed-memory cluster comprising 82 compute, 2 develop, 1 user, and 1 management nodes. Each node features two quad-core Intel Nehalem X5550 processors (2.66 GHz, 8 MB cache), 24 GB memory, and a 120 GB local hard drive. All nodes and the 160 TB central storage are connected by an InfiniBand (QDR) interconnect network. [en_US]
dc.description.uri: https://userpages.umbc.edu/~gobbert/papers/RaimGobbert2010Poisson.pdf [en_US]
dc.format.extent: 20 pages [en_US]
dc.genre: technical report [en_US]
dc.identifier: doi:10.13016/M2WH2DJ56
dc.identifier.uri: http://hdl.handle.net/11603/11683
dc.language.iso: en_US [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Mathematics Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartofseries: HPCF Technical Report;HPCF–2010–2
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: Parallel performance study [en_US]
dc.subject: Elliptic test problem [en_US]
dc.subject: Poisson equation with Dirichlet boundary conditions in two spatial dimensions [en_US]
dc.subject: UMBC High Performance Computing Facility (HPCF) [en_US]
dc.subject: conjugate gradient method
dc.title: Parallel Performance Studies for an Elliptic Test Problem on the Cluster tara [en_US]
dc.type: Text [en_US]
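
Illustrative note (not part of the deposited record): the abstract describes a finite difference discretization of the two-dimensional Poisson equation with Dirichlet boundary conditions, solved by the conjugate gradient method. The following is a minimal serial sketch of that combination in Python, not the authors' implementation. The function names, the grid size, the tolerance, and the manufactured solution u(x, y) = x(1-x)y(1-y) on the unit square are all assumptions made for illustration; the record itself does not state the right-hand side or domain used in the report.

```python
import math

def apply_laplacian(u, n):
    """Matrix-free 5-point stencil: returns A*u on the n*n interior grid,
    with zero Dirichlet values implied outside the grid.
    In a distributed code this step would typically need neighbor exchange."""
    v = [0.0] * (n * n)
    for j in range(n):
        for i in range(n):
            k = j * n + i
            s = 4.0 * u[k]
            if i > 0:
                s -= u[k - 1]
            if i < n - 1:
                s -= u[k + 1]
            if j > 0:
                s -= u[k - n]
            if j < n - 1:
                s -= u[k + n]
            v[k] = s
    return v

def dot(a, b):
    """Euclidean inner product (a global reduction in a parallel setting)."""
    return sum(x * y for x, y in zip(a, b))

def conjugate_gradient(b, n, tol=1e-10, max_iter=10000):
    """Unpreconditioned CG for A*x = b, starting from x = 0.
    Stops when the residual norm falls below tol times the initial norm."""
    x = [0.0] * (n * n)
    r = list(b)          # residual b - A*x for the zero initial guess
    p = list(r)          # first search direction
    rr = dot(r, r)
    stop = tol * math.sqrt(rr)
    for it in range(max_iter):
        if math.sqrt(rr) <= stop:
            return x, it
        q = apply_laplacian(p, n)
        alpha = rr / dot(p, q)
        x = [xk + alpha * pk for xk, pk in zip(x, p)]
        r = [rk - alpha * qk for rk, qk in zip(r, q)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [rk + beta * pk for rk, pk in zip(r, p)]
        rr = rr_new
    return x, max_iter

def solve_poisson(n):
    """Solve -Laplace(u) = f on the unit square with u = 0 on the boundary.
    Assumed test case: u = x(1-x)y(1-y), so f = 2*(x(1-x) + y(1-y))."""
    h = 1.0 / (n + 1)
    b, exact = [], []
    for j in range(n):
        y = (j + 1) * h
        for i in range(n):
            x = (i + 1) * h
            b.append(h * h * 2.0 * (x * (1.0 - x) + y * (1.0 - y)))
            exact.append(x * (1.0 - x) * y * (1.0 - y))
    u, iters = conjugate_gradient(b, n)
    err = max(abs(a - e) for a, e in zip(u, exact))
    return u, iters, err

if __name__ == "__main__":
    u, iters, err = solve_poisson(15)
    print("iterations:", iters, "max error:", err)
```

Each iteration performs one stencil application and two inner products. In a parallel version these are the points where processes would have to communicate, which is consistent with the abstract's remark that the solver requires communication at every iteration.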

Files

Original bundle

Name: RaimGobbert2010Poisson.pdf
Size: 3.96 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission