Some Workload Scheduling Alternatives in a High Performance Computing Environment

Rights

This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
Public Domain Mark 1.0

Abstract

Clusters of commodity microprocessors have overtaken custom-designed systems as the high performance computing (HPC) platform of choice. The design and optimization of workload scheduling systems for clusters have been an active research area. This paper surveys examples of workload scheduling methods used in large-scale applications at companies such as Google, Yahoo, and Amazon that use a MapReduce parallel processing framework. It examines one specific MapReduce framework, Hadoop, in some detail. It then describes a novel self-tuning workload scheduler based on dynamic prioritization and presents simulation results suggesting that the approach will improve performance compared to standard Hadoop scheduling.