Improving Software Remodularisation

Hall, Mathew J (2013) Improving Software Remodularisation. PhD thesis, University of Sheffield.

Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.


Maintenance is estimated to be the most expensive stage of the software development lifecycle. While documentation is widely considered essential to reduce the cost of maintaining software, it is commonly neglected. Automated reverse engineering tools present a potential solution to this problem by allowing documentation, in the form of models, to be produced cheaply. State machines, module dependency graphs (MDGs), and other software models may be extracted automatically from software using reverse engineering tools. However, the models are typically large and complex due to a lack of abstraction. Solutions to this problem use transformations (state machines) or “remodularisation” (MDGs) to enrich the diagram with a hierarchy to uncover the system’s structure. This task is complicated by the subjectivity of the problem. Automated techniques aim to optimise the structure, either through design quality metrics or by grouping elements by the limited number of available features. Both of these approaches can lead to a mismatch between the algorithm’s output and the developer’s intentions. This thesis addresses the problem from two perspectives: firstly, the improvement of automated hierarchy generation to the extent possible, and then augmentation using additional expert knowledge in a refinement process. Investigation begins on the application of remodularisation to the state machine hierarchy generation problem, which is shown to be feasible due to the common underlying graph structure present in both MDGs and state machines. Following this success, genetic programming is investigated as a means to improve upon this result, and is found to produce hierarchies that better optimise a quality metric at higher levels. The disparity between metric-maximising performance and human-acceptable performance is then examined, resulting in the SUMO algorithm, which incorporates domain knowledge to interactively refine a modularisation.
The thesis concludes with an empirical user study conducted with 35 participants, showing that, while its performance is highly dependent on the individual user, SUMO allows a modularisation of a 122-file component to be refined in a short period of time (within an hour for most participants).

Item Type: Thesis (PhD)
Keywords: remodularisation, clustering, genetic algorithms, constraint solving, metaheuristic algorithms, reverse engineering, software engineering
Academic Units: The University of Sheffield > Faculty of Engineering (Sheffield) > Computer Science (Sheffield)
The University of Sheffield > Faculty of Science (Sheffield) > Computer Science (Sheffield)
Identification Number/EthosID: uk.bl.ethos.577421
Depositing User: Mr Mathew J Hall
Date Deposited: 24 Jul 2013 11:08
Last Modified: 03 Oct 2016 10:45
URI: http://etheses.whiterose.ac.uk/id/eprint/4183
