The definition of the relevant population and the collection of data for likelihood ratio-based forensic voice comparison

Hughes, Vincent (2014) The definition of the relevant population and the collection of data for likelihood ratio-based forensic voice comparison. PhD thesis, University of York.

Hughes, V. (2014) PhD.pdf
Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.

Within the field of forensic speech science there is increasing acceptance of the likelihood ratio (LR) as the logically and legally correct framework for evaluating forensic voice comparison (FVC) evidence. However, only a small proportion of experts currently use the numerical LR in casework. This is due primarily to the difficulties involved in accounting for the inherent, and arguably unique, complexity of speech in a fully data-driven, numerical LR analysis. This thesis addresses two such issues: the definition of the relevant population and the amount of data required for system testing.

Firstly, experiments are presented which explore the extent to which LRs are affected by different definitions of the relevant population with regard to sources of systematic sociolinguistic between-speaker variation (regional background, socio-economic class and age) using both linguistic-phonetic and ASR variables. Results show that different definitions of the relevant population can have a substantial effect on the magnitude of LRs, depending on the input variable. However, system validity results suggest that narrow controls over sociolinguistic sources of variation should be preferred to general controls.

Secondly, experiments are presented which evaluate the effects of development, test and reference sample size on LRs. Consistent with general principles in statistics, more precise results are found using more data across all experiments. There is also considerable evidence of a relationship between sample size sensitivity and the dimensionality and speaker discriminatory power of the input variable. Further, there are potential trade-offs in the size of each set depending on which element of LR output the analyst is interested in.

The results in this thesis will contribute towards improving the extent to which LR methods account for the linguistic-phonetic complexity of speech evidence. In accounting for this complexity, this work will also increase the practical viability of applying the numerical LR to FVC casework.

Item Type: Thesis (PhD)
Academic Units: The University of York > Language and Linguistic Science (York)
Identification Number/EthosID: uk.bl.ethos.640707
Depositing User: Mr Vincent Hughes
Date Deposited: 26 Mar 2015 11:27
Last Modified: 08 Sep 2016 13:32
URI: http://etheses.whiterose.ac.uk/id/eprint/8309
