
A Distributed Stream Library for Java 8

Chan, Yu (2016) A Distributed Stream Library for Java 8. PhD thesis, University of York.

This is the latest version of this item.

phd.pdf - Examined Thesis (PDF)
Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.

Download (1122Kb)

Abstract

An increasingly popular application of parallel computing is Big Data, which concerns the storage and analysis of very large datasets. Many of the prominent Big Data frameworks are written in Java or JVM-based languages. However, as a base language for Big Data systems, Java still lacks a number of important capabilities such as processing very large datasets and distributing the computation over multiple machines. The introduction of Streams in Java 8 has provided a useful programming model for data-parallel computing, but it is limited to a single JVM and still does not address Big Data issues. This thesis contends that though the Java 8 Stream framework is inadequate to support the development of Big Data applications, it is possible to extend the framework to achieve performance comparable to or exceeding that of popular Big Data frameworks. It first reviews a number of Big Data programming models and gives an overview of the Java 8 Stream API. It then proposes a set of extensions to allow Java 8 Streams to be used in Big Data systems. It also shows how the extended API can be used to implement a range of standard Big Data paradigms. Finally, it compares the performance of such programs with that of Hadoop and Spark. Although the implementation is a proof of concept, experimental results indicate that it is a lightweight and efficient framework, comparable in performance to Hadoop and Spark.
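
As a point of reference for the programming model the abstract describes, the following is a minimal sketch of a word count written against the standard Java 8 Stream API. It uses only `java.util.stream` and runs inside a single JVM, which is the limitation the thesis sets out to address. It does not use the thesis's extended API, which is not shown on this page. The class name `WordCount` and the sample input are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class; not taken from the thesis.
class WordCount {

    // Counts occurrences of each lower-cased word across all lines.
    // parallelStream() spreads the work over threads of one JVM only.
    static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()
                // Split each line into words on runs of whitespace.
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                // Drop empty tokens produced by leading whitespace.
                .filter(word -> !word.isEmpty())
                // Group identical words and count each group.
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample input chosen for illustration only.
        List<String> lines = Arrays.asList("big data on the JVM", "streams for big data");
        System.out.println(count(lines));
    }
}
```

The pipeline has the flat-map, group and count shape of a typical MapReduce job, but every stage executes in the memory of the local process, so the input must fit on one machine.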

Item Type: Thesis (PhD)
Academic Units: The University of York > Computer Science (York)
Depositing User: Yu Chan
Date Deposited: 22 Nov 2016 14:16
Last Modified: 22 Nov 2016 14:16
URI: http://etheses.whiterose.ac.uk/id/eprint/15510

Available Versions of this Item

  • A Distributed Stream Library for Java 8. (deposited 22 Nov 2016 14:16) [Currently Displayed]

