Chan, Yu (2016) A Distributed Stream Library for Java 8. PhD thesis, University of York.
Abstract
An increasingly popular application of parallel computing is Big Data, which concerns the
storage and analysis of very large datasets. Many of the prominent Big Data frameworks
are written in Java or JVM-based languages. However, as a base language for Big Data
systems, Java still lacks a number of important capabilities such as processing very large
datasets and distributing the computation over multiple machines. The introduction of
Streams in Java 8 has provided a useful programming model for data-parallel computing,
but it is limited to a single JVM and still does not address Big Data issues.
This thesis contends that though the Java 8 Stream framework is inadequate to support
the development of Big Data applications, it is possible to extend the framework to achieve
performance comparable to or exceeding that of popular Big Data frameworks. It first
reviews a number of Big Data programming models and gives an overview of the Java 8
Stream API. It then proposes a set of extensions to allow Java 8 Streams to be used in Big
Data systems. It also shows how the extended API can be used to implement a range of
standard Big Data paradigms. Finally, it compares the performance of such programs with
that of Hadoop and Spark. Although the implementation is a proof of concept, experimental
results indicate that it is a lightweight and efficient framework, comparable in performance
to Hadoop and Spark.
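As an illustration of the single-JVM programming model the abstract refers to, the following is a minimal sketch of a word count written with the standard Java 8 Stream API. It is not taken from the thesis and does not show the proposed distributed extensions; the class name `WordCount` and the method `count` are hypothetical names chosen for this example. All parallelism here comes from `parallelStream()`, which splits the work across threads inside one JVM only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class; not part of the thesis or of any library.
public class WordCount {

    // Counts how often each word occurs in the given lines.
    // The parallel stream distributes work over threads of a single JVM.
    public static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()
                // Split every line into lower-case, whitespace-separated words.
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                // Drop empty tokens produced by leading whitespace.
                .filter(word -> !word.isEmpty())
                // Group identical words and count the members of each group.
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data on the JVM", "streams on the JVM");
        System.out.println(count(lines));
    }
}
```

The input in this sketch must fit in the memory of one machine, which is the limitation the abstract describes for the unextended Stream framework.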
Metadata
| Supervisors: | Wellings, Andy and Gray, Ian |
|---|---|
| Awarding institution: | University of York |
| Academic Units: | The University of York > Computer Science (York) |
| Depositing User: | Yu Chan |
| Date Deposited: | 22 Nov 2016 14:16 |
| Last Modified: | 22 Nov 2016 14:16 |
| Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:15510 |
Download
Examined Thesis (PDF)
Filename: phd.pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License