Tang, Tianli ORCID: https://orcid.org/0000-0003-2182-6525
(2020)
Predicting boarding and alighting behaviour of bus passengers with smart card data using machine learning techniques.
PhD thesis, University of Leeds.
Abstract
Developing an efficient public transport system is an important initiative to ease traffic congestion and to reduce energy usage and air pollution. In addition to a well-planned network, an advanced public transport system should offer a comfortable, safe and reliable service for passengers, which requires an appropriate management and operation strategy. The development of smart infrastructure in public transport ticketing systems has not only improved operational efficiency and enhanced the passenger travel experience, but has also made available millions of passengers’ daily travel records. This valuable data source can be used to analyse passengers’ travel behaviour and travel demand, which in turn can help bus companies offer better public transport services.
This study mainly aims to understand and predict the boarding and alighting behaviour of bus passengers from smart card records, using machine learning (ML) approaches. Firstly, a gradient boosting decision tree (GBDT) ML model is trained with features drawn from passengers’ boarding records, their travel history and weather conditions. The model is then applied to estimate the alighting stop for each smart card trip. Secondly, a multi-stage deep learning-based ML framework is developed, which utilises a fully connected network, a recurrent neural network (RNN) and a long short-term memory (LSTM) network to predict the hourly boarding behaviour (whether to travel and which bus stop to use) of every smart card user. Thirdly, a deep generative adversarial network (Deep-GAN) is proposed to counter the issue of imbalanced data, where positive instances (in our case, a boarding instance in any given hour) are far fewer than negative instances, and to improve the prediction of hourly boarding demand from the smart card data.
The studies indicate that: i) ML techniques are an effective predictive tool for dealing with multiple variables and non-linear relations; ii) including weather conditions and travel history can significantly improve the performance of predictive models; iii) the problems of innumerable classes in data fields and imbalanced data records significantly reduce the accuracy of predictive models; iv) in addition to providing good predictive power, GBDT-based ML models provide the ability to rank the relative importance of features; v) RNN and LSTM are capable of capturing the temporal characteristics (i.e. the peak hours) of passengers’ boarding behaviour; vi) Deep-GAN models can be used effectively to reduce the problem of data imbalance and enhance the performance of the predictive models.
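The data imbalance described in the abstract can be illustrated with a minimal sketch. It assumes a toy set of smart card boardings recorded as (card, hour) pairs; the card names, hours and helper functions (`hourly_labels`, `accuracy_and_recall`) are hypothetical and are not taken from the thesis. The sketch builds one binary label per card per hour and shows why a model that never predicts a boarding can still score a high accuracy while finding no boardings at all.

```python
from collections import Counter

# Hypothetical smart card boardings as (card_id, hour_of_day) pairs.
# These values are invented for illustration, not drawn from the thesis data.
boardings = [
    ("card_A", 8), ("card_A", 17),
    ("card_B", 9),
    ("card_C", 8), ("card_C", 12), ("card_C", 18),
]
cards = ["card_A", "card_B", "card_C", "card_D"]
hours = range(24)


def hourly_labels(boardings, cards, hours):
    """Return {(card, hour): 0 or 1}, where 1 means the card boarded in that hour."""
    boarded = set(boardings)
    return {(c, h): int((c, h) in boarded) for c in cards for h in hours}


def accuracy_and_recall(y_true, y_pred):
    """Return overall accuracy and recall on the positive (boarding) class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    recall = true_pos / positives if positives else 0.0
    return correct / len(y_true), recall


labels = hourly_labels(boardings, cards, hours)
y_true = list(labels.values())
counts = Counter(y_true)

# Baseline that always predicts "no boarding" for every card-hour.
y_pred = [0] * len(y_true)
accuracy, recall = accuracy_and_recall(y_true, y_pred)

print(f"positives={counts[1]}, negatives={counts[0]}")
print(f"always-negative baseline: accuracy={accuracy:.4f}, recall={recall:.1f}")
```

In this toy case only 6 of the 96 card-hours are boardings, so the always-negative baseline reaches an accuracy of 0.9375 with a recall of 0. Accuracy alone therefore says little about how well boardings are predicted on imbalanced records.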
Metadata
Supervisors: | Liu, Ronghui and Choudhury, Charisma |
---|---|
Keywords: | Bus ridership; Prediction; Machine learning; Smart card data |
Awarding institution: | University of Leeds |
Academic Units: | The University of Leeds > Faculty of Environment (Leeds); The University of Leeds > Faculty of Environment (Leeds) > Institute for Transport Studies (Leeds) |
Identification Number/EthosID: | uk.bl.ethos.829667 |
Depositing User: | Dr Tianli Tang |
Date Deposited: | 23 Apr 2021 08:05 |
Last Modified: | 11 May 2023 09:53 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:28696 |
Download
Final eThesis - complete (pdf)
Filename: Tang_T_ITS_PhD_2021.pdf
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial ShareAlike 4.0 International License