Fault Tolerant Task Mapping in Many-Core Systems

Bonney, Colin Andrew (2016) Fault Tolerant Task Mapping in Many-Core Systems. PhD thesis, University of York.

This is the latest version of this item.

PhD.pdf - Examined Thesis (PDF)
Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.


Abstract

The advent of many-core systems, networks on chip containing hundreds or thousands of homogeneous processor cores, presents new challenges in managing the cores effectively in response to processing demands, hardware faults and the need for heat management. The continually diminishing feature size of devices increases the probability of fabrication defects and the variability in performance of individual transistors. In many-core systems this can result in the failure of individual processing cores, routing nodes or communication links, which requires the use of fault tolerant mechanisms. Diminishing feature size also increases the power density of devices, giving rise to the concept of dark silicon, where only a portion of the functionality available on a chip can be active at any one time. Core fault tolerance and management of dark silicon can both be achieved by allocating a percentage of cores to be idle at any one time. Idle cores can be used as dark silicon to evenly distribute the heat generated by processing cores and can also be used as spare cores to implement fault tolerance. Both of these can be achieved by the dynamic allocation of processes to tasks in response to changes in the status of hardware resources and the demands placed on the system, which in turn requires real-time task mapping. This research proposes the use of a continuous fault/recovery cycle to implement graceful degradation and amelioration to provide real-time fault tolerance. Objective measures for core fault tolerance, link fault tolerance, network power and excess traffic have been developed for use by a multi-objective evolutionary algorithm that uses knowledge of the processing demands and hardware status to identify optimal task mappings. The fault/recovery cycle is shown to be effective in maintaining a high level of performance of a many-core array when presented with a series of hardware faults.

Item Type: Thesis (PhD)
Keywords: Many-core, fault tolerance, task mapping, evolutionary algorithm, Pareto Front, graceful degradation, graceful amelioration
Academic Units: The University of York > Electronics (York)
Depositing User: Mr Colin Andrew Bonney
Date Deposited: 23 Nov 2018 10:39
Last Modified: 23 Nov 2018 10:39
URI: http://etheses.whiterose.ac.uk/id/eprint/20899

