Alharbi, Amirah Mohammed (2019) Unsupervised Abstraction for Reducing the Complexity of Healthcare Process Models. PhD thesis, University of Leeds.
Abstract
Healthcare processes are complex and may vary considerably among the same cohort of patients.
Process mining techniques play a significant role in automating the construction of healthcare
models using a system's event log. An event log is a data type that records any event that
occurs within the process. It is a basic element of any information system and has three main
components: process instance id, event, and the time at which the event occurred. Using ordinary
techniques of process mining in healthcare produces 'spaghetti-like' models which are difficult
to understand and thus have little value. Previously published studies have highlighted the importance
of event abstraction, which is considered a central tool for reducing complexity and
improving efficiency. Although studies have successfully improved the understandability of process
models, they have generally relied on involvement from a domain expert. Untangling these
'spaghetti-like' models with the help of domain experts can be expensive and time-consuming.
Machine learning techniques such as the Hidden Markov Model (HMM) have long been used for modelling
sequential data. State transition modelling has also been explored by process
mining research and is advocated for sequence clustering purposes where a model is trained
over a group of sequences and then used to evaluate if a process instance is more likely to be
generated from this model or not. However, state transition models can also be utilised for detecting
hidden processes which can be used subsequently for process abstraction. In this thesis,
we aim to address healthcare process complexity using unsupervised abstraction. We adopt
an unsupervised method for detecting hidden processes using HMM and the Viterbi algorithm.
The method in this research comprises eight stages: event log extraction, preprocessing, learning,
decoding, optimisation, selection, visualisation and, lastly, model evaluation. One of the main
contributions of this research is the design of two types of process model optimisation:
strict and soft. Models that are selected by the proposed optimisation
address the limitations of other standard metrics that can be used for model selection in
HMMs, such as the Bayesian Information Criterion (BIC). Two different real healthcare data sources
are used in this research, namely the Medical Information Mart for Intensive Care (MIMIC-III)
from Boston, USA and the Patients Pathway Manager (PPM) from Leeds, UK. Models are
trained using the MIMIC-III medical event log and then tested using the PPM dataset to be
evaluated later by a domain expert. Three breast cancer case studies that range in complexity
are extracted. Our method significantly reduced model complexity
and provided a conceptually valid abstraction for several care patterns. Promising results are
demonstrated in the improvement of the precision and fitness of the abstracted models. The
abstracted models can then be used as a middle step for bringing structure to unstructured
processes which helps in finding cohorts of patients based on similar healthcare processes. The
healthcare processes of a cohort of patients can then be modelled using any process mining tool,
whereas their process similarity could not be captured in the complex models.
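As a rough illustration of the decoding idea described in the abstract, the following minimal sketch groups a toy event log (process instance id, event, time) into traces, decodes each trace with the Viterbi algorithm, and collapses repeated hidden states into an abstracted trace. It is not the thesis implementation: the event names, the two hidden states, and all probabilities are invented for illustration, and the HMM parameters are fixed by hand here rather than learned from data.

```python
from collections import defaultdict
from math import log

# Toy event log rows: (process instance id, event, timestamp). Invented values.
EVENT_LOG = [
    ("case1", "blood_test", 1), ("case1", "biopsy", 2),
    ("case1", "surgery", 3), ("case1", "chemo", 4),
    ("case2", "biopsy", 1), ("case2", "blood_test", 2), ("case2", "chemo", 3),
]

# Hypothetical HMM parameters (assumed, not learned): two hidden "processes".
STATES = ["diagnosis", "treatment"]
START = {"diagnosis": 0.8, "treatment": 0.2}
TRANS = {
    "diagnosis": {"diagnosis": 0.6, "treatment": 0.4},
    "treatment": {"diagnosis": 0.1, "treatment": 0.9},
}
EMIT = {
    "diagnosis": {"blood_test": 0.5, "biopsy": 0.4, "surgery": 0.05, "chemo": 0.05},
    "treatment": {"blood_test": 0.1, "biopsy": 0.1, "surgery": 0.4, "chemo": 0.4},
}


def traces_from_log(rows):
    """Group events by process instance id and order them by timestamp."""
    by_case = defaultdict(list)
    for case_id, event, timestamp in rows:
        by_case[case_id].append((timestamp, event))
    return {case: [e for _, e in sorted(events)] for case, events in by_case.items()}


def viterbi(observations):
    """Return the most likely hidden state sequence (log-space Viterbi)."""
    score = {s: log(START[s]) + log(EMIT[s][observations[0]]) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        prev, score, pointer = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + log(TRANS[p][s]))
            score[s] = prev[best] + log(TRANS[best][s]) + log(EMIT[s][obs])
            pointer[s] = best
        backpointers.append(pointer)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


def abstract_trace(states):
    """Collapse consecutive repeats so each hidden process appears once per run."""
    collapsed = []
    for s in states:
        if not collapsed or collapsed[-1] != s:
            collapsed.append(s)
    return collapsed


traces = traces_from_log(EVENT_LOG)
decoded = {case: viterbi(trace) for case, trace in traces.items()}
abstracted = {case: abstract_trace(path) for case, path in decoded.items()}
print(abstracted)
```

In this toy setting the two cases have different event sequences but the same abstracted trace, which is the kind of process similarity the abstract says is hard to see in the unabstracted models.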
Metadata
Supervisors: | Bulpitt, Andy and Johnson, Owen |
---|---|
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds) |
Identification Number/EthosID: | uk.bl.ethos.789456 |
Depositing User: | Mrs Amirah Alharbi |
Date Deposited: | 05 Nov 2019 09:41 |
Last Modified: | 18 Feb 2020 12:51 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:25103 |
Download
Final eThesis - complete (pdf)
Filename: Final Thesis after correction.pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License