
Fair, responsive scheduling of engineering workflows on computing grids

Burkimsher, Andrew Marc (2014) Fair, responsive scheduling of engineering workflows on computing grids. EngD thesis, University of York.

ThesisFinalPrintE.pdf
Available under License Creative Commons Attribution 2.0 UK: England & Wales.


Abstract

This thesis considers scheduling in the context of a grid computing system used in engineering design. Users desire responsiveness and fairness in the treatment of the workflows they submit. Submissions outstrip the available computing capacity during the working day, and the backlog is cleared only overnight and at weekends. The execution times observed span a wide range, from 10^0 to 10^7 core-minutes. The Projected Schedule Length Ratio (P-SLR) list scheduling policy is designed to use execution time estimates and the structure of the dependency graph to improve on the existing industrial FairShare policy. P-SLR aims to minimise the worst-case SLR of jobs and to keep SLR fair across the space of job execution times. P-SLR is shown to equal or surpass all other evaluated policies in responsiveness and fairness across the spectra of load and networking delays. P-SLR is also dominant where execution time estimates are within an order of magnitude of the real value. Such estimates are considered achievable using user knowledge or automated profiling. Outside this range, the Shortest Remaining Time First (SRTF) policy achieved better responsiveness and fairness. The Projected Value Remaining (PVR) policy considers the case where a curve specifying the value of a job over time is given. PVR aims to maximise total workload value, even under overload, by maximising the worst-case job value in a workload. PVR is shown to be dominant across the load and networking spectra. Where execution time estimates are coarser than the nearest power of 2, SRTF delivers higher value than PVR. SRTF is also shown to have responsiveness, fairness and value close behind P-SLR and PVR throughout the range of load and network delays considered. However, the kinds of starvation under overload incurred by SRTF would almost certainly be undesirable if implemented in a production system.
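As a rough illustration of the idea behind ordering work by projected SLR, the sketch below computes a workflow's critical path from its dependency graph and estimated execution times, then ranks queued workflows by the SLR each would reach if it ran unimpeded from now. This is not taken from the thesis: the function names, the data layout, and the exact formula (time already waited plus critical path, divided by critical path) are assumptions made for illustration, and the real P-SLR policy may differ in detail.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_length(exec_times, deps):
    """Longest execution-time-weighted path through a workflow DAG.

    exec_times: {task: estimated execution time}
    deps:       {task: iterable of tasks it depends on}
    (Hypothetical layout; every task must appear in exec_times.)
    """
    graph = {task: deps.get(task, ()) for task in exec_times}
    finish = {}
    # static_order() yields each task after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        ready = max((finish[p] for p in graph[task]), default=0)
        finish[task] = ready + exec_times[task]
    return max(finish.values())


def projected_slr(now, submitted, exec_times, deps):
    """Assumed projection: SLR if the workflow ran unimpeded from `now`."""
    cp = critical_path_length(exec_times, deps)
    return (now - submitted + cp) / cp


# Two workflows submitted at t=0 and still waiting at t=30 (minutes).
short = ({"a": 10, "b": 20, "c": 5}, {"b": ["a"], "c": ["a"]})
long_ = ({"x": 600}, {})
queue = [("long", 0, *long_), ("short", 0, *short)]

# Highest projected SLR first: the workflow suffering most, relative
# to its own length, is served before the others.
order = [
    name
    for name, *_ in sorted(
        queue,
        key=lambda w: projected_slr(30, w[1], w[2], w[3]),
        reverse=True,
    )
]
```

Under these assumptions the short workflow has already waited as long as its whole critical path, so it is ranked ahead of the long one, whose relative delay is still small. Ordering by a ratio rather than by absolute waiting time is what lets one rule treat jobs of very different sizes evenly.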

Item Type: Thesis (EngD)
Academic Units: The University of York > Computer Science (York)
Identification Number/EthosID: uk.bl.ethos.635427
Depositing User: Mr. Andrew Marc Burkimsher
Date Deposited: 17 Feb 2015 15:12
Last Modified: 08 Sep 2016 13:32
URI: http://etheses.whiterose.ac.uk/id/eprint/8080
