Offorha, Bright Chiemezie ORCID: https://orcid.org/0000-0003-2489-3969 (2023) A comparison of statistical methods for analysing cluster randomised controlled trials: classical vs emerging methods. PhD thesis, University of Sheffield.
Abstract
Introduction
Cluster randomised controlled trials (cRCTs) entail randomising groups of individuals, such as schools, care homes, hospital wards, and general practices, to the treatment arms. The outcomes within a cluster are likely to be correlated, and the chosen analytical approach must account for this correlation to obtain valid results. Ignoring the correlation by using standard statistical methods that treat the outcomes as independent may underestimate the standard errors (SEs) of the parameter estimates, and consequently produce confidence intervals (CIs) that are too narrow, spuriously small P-values, and an overstated effect of the intervention. The following research questions were formulated to explore the statistical methods used to analyse outcome data from cRCTs: 1) What appropriate methods are available in the literature for analysing outcome data from cRCTs? 2) What statistical methods are used in practice for analysing outcome data from cRCTs? 3) What criteria should be used in deciding the appropriateness of the identified methods? 4) How well do the selected methods perform when compared?
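The size of the underestimation described above can be illustrated with the standard design effect for equal cluster sizes, 1 + (m - 1)ρ, where m is the cluster size and ρ is the ICC. The following is a minimal sketch and is not taken from the thesis; the function names and the cluster size of 50 are assumptions chosen for illustration, while the ICC of 0.02 matches the median reported in the Results.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from clustering, assuming equal cluster sizes."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc is assumed to lie in [0, 1] in this sketch")
    return 1.0 + (cluster_size - 1.0) * icc


def naive_se_ratio(cluster_size: float, icc: float) -> float:
    """Ratio of the naive (independence) SE to the cluster-adjusted SE."""
    return 1.0 / math.sqrt(design_effect(cluster_size, icc))


# Illustrative values: hypothetical cluster size of 50, ICC of 0.02.
deff = design_effect(50, 0.02)    # about 1.98
ratio = naive_se_ratio(50, 0.02)  # about 0.71
print(f"design effect = {deff:.2f}, naive SE / adjusted SE = {ratio:.2f}")
```

Under these assumed values, even a small ICC roughly doubles the variance of the treatment effect estimate, so an analysis that ignores clustering would report an SE that is about 29% too small.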
Methods
I conducted two reviews to identify gaps in knowledge: a methodological scoping review, involving a systematic search of the online bibliographic databases MEDLINE, EMBASE, PsycINFO, CINAHL, and SCOPUS, and a practice review, involving a chronological search of the online table of contents of the National Institute for Health and Care Research (NIHR) Journals Library. Four analytical approaches were identified: generalised linear mixed models (GzLMM), first-order and second-order generalised estimating equations (GEE1 and GEE2), and quadratic inference functions (QIF). The four methods were applied to four cRCT datasets with continuous and binary outcomes. Furthermore, three of the four methods (GzLMM, GEE1, and QIF) were applied to simulated continuous outcome datasets.
Results
The methodological review identified 27 unique analytical methods. In the practice review, the 79 included cRCT reports covered 86 independent trials and 100 analysed primary outcomes. The observed median intracluster correlation coefficient (ICC) was 0.02, and 4 in 10 trials did not report the observed ICC, contrary to the recommendations of the Consolidated Standards of Reporting Trials (CONSORT) reporting guidelines. The analysis of the four example datasets, with the number of clusters ranging from 10 to 100 and the number of individual participants ranging from 748 to 9,207, showed that the estimates of the treatment effect (and the associated standard error, confidence interval, and P-value) from the methods were equivalent in most cases. However, in a few analyses, QIF produced different results from the other three methods, especially in trials with small to moderate numbers of clusters. GEE1 and GEE2 gave the same estimates, except for the ICC; hence, GEE2 was dropped from further investigation. A simulation study involving continuous outcome data showed that GzLMM, GEE1, and QIF performed equivalently in terms of bias, empirical standard error, and mean square error. The number of clusters N, the cluster sizes n_i, the ICC ρ, and the effect size θ had no impact on these results. With regard to coverage, Type I error rate, and power, GzLMM (with identity link function and parameters estimated by MLE) performed better than GEE1 and QIF when the ICC was low. For moderate ICC, an appropriate small-sample correction should be applied in conjunction with the chosen method when there are fewer than fifty clusters.
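The performance measures named above (bias, empirical standard error, mean square error, coverage, and the rejection rate that gives the Type I error rate or power) have standard definitions over simulation replicates. The sketch below shows one way they could be computed; it is not the thesis code. The function name, the dictionary keys, and the use of a normal critical value of 1.96 rather than a small-sample correction are all assumptions made for illustration.

```python
from statistics import mean, stdev


def performance_measures(estimates, std_errors, true_effect, z_crit=1.96):
    """Summarise simulated treatment-effect estimates against a known true effect.

    estimates, std_errors: one value per simulation replicate.
    z_crit: normal critical value (an assumption; no small-sample correction).
    """
    n = len(estimates)
    if n != len(std_errors) or n < 2:
        raise ValueError("need at least two replicates with matching SEs")

    errors = [est - true_effect for est in estimates]
    # A replicate "covers" when its Wald interval contains the true effect.
    covered = sum(
        1 for est, se in zip(estimates, std_errors)
        if est - z_crit * se <= true_effect <= est + z_crit * se
    )
    # A replicate "rejects" when the Wald test of zero effect is significant.
    rejected = sum(
        1 for est, se in zip(estimates, std_errors) if abs(est / se) > z_crit
    )
    return {
        "bias": mean(errors),
        "empirical_se": stdev(estimates),
        "mse": mean([err * err for err in errors]),
        "coverage": covered / n,
        # Type I error rate if true_effect == 0, otherwise power.
        "rejection_rate": rejected / n,
    }


# Toy replicates (made-up numbers, purely to show the call).
result = performance_measures([0.9, 1.1, 1.0, 1.2], [0.1, 0.1, 0.1, 0.05], 1.0)
print(result)
```

In a full simulation, `estimates` and `std_errors` would come from fitting each method to every simulated dataset, and the function would be called once per method and scenario.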
Conclusions
The planning of cRCTs should account for the hierarchical nature of the design in the sample size calculation. Adherence to the CONSORT reporting guidelines, with the extension to cRCTs, is suboptimal, judging by the reporting of the observed ICC; researchers, peer reviewers, and editors should make efforts to improve this. In most cases investigated, the GzLMM performed better than GEE1 and QIF; however, other factors, such as the estimand and the scientific question of interest, should be considered in choosing the appropriate analytical method. QIF has no advantage over GEE1; hence, the current practice should be maintained.
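One common way to account for clustering at the planning stage is to inflate the sample size of an individually randomised trial by the design effect and convert the result into a number of clusters. The sketch below assumes equal cluster sizes; the function name and the example inputs (200 participants per arm, clusters of 50) are hypothetical and do not come from the thesis.

```python
import math


def clusters_per_arm(n_individual_per_arm: int, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm after inflating an individually randomised
    sample size by the design effect 1 + (m - 1) * icc (equal cluster sizes)."""
    if n_individual_per_arm < 1 or cluster_size < 1:
        raise ValueError("sample size and cluster size must be positive")
    deff = 1.0 + (cluster_size - 1) * icc
    inflated_n = n_individual_per_arm * deff
    return math.ceil(inflated_n / cluster_size)


# Hypothetical planning inputs: 200 per arm if individually randomised,
# clusters of 50 participants, ICC of 0.02.
print(clusters_per_arm(200, 50, 0.02))
```

With an ICC of zero, the same inputs would need only four clusters per arm, so the gap between the two answers shows what the hierarchical design costs in recruitment.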
Metadata
Supervisors: | Walters, Stephen J. and Jacques, Richard M. |
---|---|
Awarding institution: | University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Medicine, Dentistry and Health (Sheffield); The University of Sheffield > Faculty of Medicine, Dentistry and Health (Sheffield) > School of Health and Related Research (Sheffield) |
Academic unit: | Faculty of Health, Division of Population Health, School of Medicine and Population Health |
Depositing User: | MR Bright Chiemezie Offorha |
Date Deposited: | 07 May 2024 10:27 |
Last Modified: | 07 May 2024 10:27 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:34738 |
Download
Final eThesis - complete (pdf)
Embargoed until: 7 May 2025
Filename: Offorha_Bright_1903098207_AMENDED_CLEAN.pdf