How to even the score: an investigation into how native and Arab non-native teachers of English rate essays containing short and long sentences.

Ameer, Saleh (2017) How to even the score: an investigation into how native and Arab non-native teachers of English rate essays containing short and long sentences. PhD thesis, University of Sheffield.

Official Saleh Ameer PhD thesis (PDF).pdf
Available under License Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales.


Abstract

In the field of education, test scores are meant to provide an indication of test-takers’ knowledge or abilities. The validity of tests must be rigorously investigated to ensure that the scores obtained are meaningful and fair. Owing to the subjective nature of the scoring process, rater variation is a major threat to the validity of performance-based language testing (i.e., speaking and writing). This investigation explores the influence of two main effects on writing test scores using an analytic rating scale. The first main effect is that of raters’ first language (native and non-native). The second is the average length of sentences (essays with short sentences and essays with long sentences). The interaction between the main effects is also analyzed. Sixty teachers of English as a second or foreign language (30 natives and 30 non-natives) working in Kuwait used a 9-point analytic rating scale with four criteria to rate 24 essays with contrasting average sentence length (12 essays with short sentences on average and 12 with long sentences). Multi-Facet Rasch Measurement (using the FACETS program, version 3.71.4) showed that: (1) the overall scores awarded by raters differed significantly in severity; (2) there were a number of significant bias interactions between raters’ first language and the essays’ average sentence length; (3) the native raters generally overestimated the essays with short sentences by awarding higher scores than expected, and underestimated the essays with long sentences by awarding lower scores than expected. The non-natives displayed the reverse pattern. This was shown on all four criteria of the analytic rating scale. Furthermore, there was a significant interaction between raters and criteria, especially the criterion ‘Grammatical range and accuracy’. Two sets of interviews were subsequently carried out. The first set had many limitations and its findings were not deemed adequate.
The second set of interviews showed that raters were not influenced by sentence length per se, but awarded scores that were higher/lower than expected mainly due to the content and ideas, paragraphing, and vocabulary. This focus is most likely a result of the very problematic writing assessment scoring rubric of the Ministry of Education-Kuwait. The limitations and implications of this investigation are then discussed.

Item Type: Thesis (PhD)
Academic Units: The University of Sheffield > Faculty of Arts and Humanities (Sheffield) > School of English (Sheffield)
Depositing User: Mr Saleh Ameer
Date Deposited: 26 May 2017 13:59
Last Modified: 26 May 2017 14:00
URI: http://etheses.whiterose.ac.uk/id/eprint/17361
