
A Machine Learning Framework for Optimising File Distribution Across Multiple Cloud Storage Services

Algarni, Abdullah Fayez H (2017) A Machine Learning Framework for Optimising File Distribution Across Multiple Cloud Storage Services. PhD thesis, University of York.

myThesis_August2017_3.pdf - Examined Thesis (PDF)
Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.



Storing data with a single cloud storage service may lead to several potential problems for the data owner. Such issues include service continuity, availability, performance, security, and the risk of vendor lock-in. A promising solution is to distribute the data across multiple cloud storage services, similar to the way data are distributed across multiple physical disk drives to achieve fault tolerance and improve performance. However, cloud providers differ in their pricing schemes and service performance, which makes optimising cost and performance across many cloud storage services at once a challenge. This research proposes a framework for automatically tuning data distribution policies across multiple cloud storage services from the client side, based on file access patterns. The aim of this work is to explore the optimisation of both the average cost per gigabyte and the average service performance (mainly latency) on multiple cloud storage services. To achieve these aims, two machine learning algorithms were used: (1) supervised learning to predict file access patterns, and (2) reinforcement learning to learn the ideal parameters for file distribution over several cloud storage services. The framework was tested in a cloud storage services emulator, which emulated a real multiple-cloud storage setting (such as Google Cloud Storage, Amazon S3, Microsoft Azure Storage, and RackSpace file cloud) in terms of service performance and cost. In addition, the framework was tested in various settings of several cloud storage services. The results of testing the framework showed that the multiple-cloud approach achieved an improvement of about 42% for cost and 76% for performance. These findings indicate that storing data in multiple clouds is a superior approach compared with both the commonly used uniform file distribution and a heuristic distribution method.
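
The trade-off described in the abstract can be illustrated with a small sketch. It is not the thesis's framework: it replaces the reinforcement learning step with a plain comparison of a few hand-picked candidate distributions, and it assumes that reads are spread in proportion to where the data is stored. The provider names, prices, latencies, the `evaluate` and `score` functions, and the `cost_importance` weighting are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_gb: float   # hypothetical monthly storage price, in USD
    latency_ms: float    # hypothetical mean access latency


# Illustrative placeholder figures, not measurements from the thesis.
PROVIDERS = [
    Provider("provider_a", cost_per_gb=0.020, latency_ms=120.0),
    Provider("provider_b", cost_per_gb=0.026, latency_ms=60.0),
    Provider("provider_c", cost_per_gb=0.018, latency_ms=200.0),
]


def evaluate(weights, providers=PROVIDERS):
    """Return (average cost per GB, average latency) for a distribution.

    weights[i] is the fraction of the data stored on providers[i].
    """
    if len(weights) != len(providers):
        raise ValueError("one weight per provider is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    cost = sum(w * p.cost_per_gb for w, p in zip(weights, providers))
    latency = sum(w * p.latency_ms for w, p in zip(weights, providers))
    return cost, latency


def score(weights, cost_importance=0.5, providers=PROVIDERS):
    """Combine cost and latency into one number (lower is better).

    Each term is divided by the worst provider's value so that the
    two quantities are on a comparable scale.
    """
    cost, latency = evaluate(weights, providers)
    worst_cost = max(p.cost_per_gb for p in providers)
    worst_latency = max(p.latency_ms for p in providers)
    return (cost_importance * cost / worst_cost
            + (1.0 - cost_importance) * latency / worst_latency)


# Compare a uniform split with two non-uniform candidates.
uniform = [1 / 3, 1 / 3, 1 / 3]
candidates = [uniform, [0.2, 0.6, 0.2], [0.5, 0.0, 0.5]]
best = min(candidates, key=score)
```

With these placeholder figures, a non-uniform split scores better than the uniform one, which is the kind of gap a learned distribution policy would try to exploit. A real setting would also have to account for request and transfer charges, redundancy, and access patterns that change over time, none of which this sketch models.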

Item Type: Thesis (PhD)
Academic Units: The University of York > Computer Science (York)
Identification Number/EthosID: uk.bl.ethos.722826
Depositing User: Mr Abdullah Fayez H Algarni
Date Deposited: 07 Sep 2017 09:06
Last Modified: 24 Jul 2018 15:22
URI: http://etheses.whiterose.ac.uk/id/eprint/17981
