ABSTRACT
This article describes the steps we went through in designing and validating an item bank to diagnose linguistic problems in the English academic writing of university students in Hong Kong. Test items adopt traditional item formats (e.g., MCQ, grammatical judgment tasks, and error correction) but are based on authentic language materials extracted from a manually error-tagged corpus of target students’ essays. A total of 257 items were developed to assess 25 high-frequency and grave linguistic errors. To validate the test items and calibrate their psychometric qualities, four parallel tests were assembled and given to 338 students. Rasch modeling was conducted to examine item dimensionality, DIF sizes, fit indices, reliability, and difficulty. The results supported the validity and reliability of the 219 items retained in the bank. Moreover, we investigated the effects of item formats and target errors on item difficulty and item-measure correlations, as well as the relations among error difficulty, error frequency, and prevalence. The item bank approach to developing diagnostic tests was found useful insofar as it provided more precise information about knowledge gaps than corpus-based textual analysis and had the potential to pinpoint high-priority areas for remedial instruction.
Disclosure statement
No potential conflict of interest was reported by the author.
Notes
1 Details of polarity analysis and DIF analysis of the anchor items will be provided upon request.
2 In the PCA conducted as part of Rasch analysis, the total variance of scores is first partitioned into the variance that can be explained by the Rasch measure and the variance that cannot. The unexplained variance, which could be due to randomness or multidimensionality, is further partitioned into components called contrasts. A component is termed a contrast because the substantive differences between the items that load positively and those that load negatively on it may reflect a systematic second dimension. Unlike conventional PCA or factor analysis (for instance, as conducted in SPSS), each contrast has meaning only when its two ends are contrasted, i.e., when it has items or persons at both the positive and negative ends. Moreover, usually only large contrasts with an eigenvalue greater than 2 (the size of an eigenvalue expected by chance) are inspected for the contrasting content of the items at their two ends (see Table 24.0 in the WinSteps 3.70.11 manual; Linacre, 2007). In this analysis, only one pair of contrastive items was found on Contrast 5.
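The partitioning described in Note 2 can be illustrated with a minimal sketch. It is not the WinSteps implementation and does not use the study's data: it assumes dichotomous items, simulated responses, and known person abilities and item difficulties (a real analysis would estimate both). The function name residual_contrasts and all variable names are hypothetical.

```python
import numpy as np

def residual_contrasts(responses, abilities, difficulties):
    """Eigenvalues and loadings from a PCA of standardized Rasch residuals."""
    # Model-expected probability of success for each person-item pair.
    expected = 1.0 / (1.0 + np.exp(-(abilities[:, None] - difficulties[None, :])))
    # Standardized residual: observed minus expected, divided by the model SD.
    z = (responses - expected) / np.sqrt(expected * (1.0 - expected))
    # Correlate the item residuals; the eigenvalues then sum to the item count.
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest contrast first
    return eigvals[order], eigvecs[:, order]

# Simulated unidimensional data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
n_persons, n_items = 500, 20
abilities = rng.normal(0.0, 1.0, n_persons)
difficulties = np.linspace(-2.0, 2.0, n_items)
prob = 1.0 / (1.0 + np.exp(-(abilities[:, None] - difficulties[None, :])))
responses = (rng.random((n_persons, n_items)) < prob).astype(float)

eigvals, loadings = residual_contrasts(responses, abilities, difficulties)
# Contrasts whose eigenvalue exceeds 2 would be inspected at both ends.
flagged = [i + 1 for i, ev in enumerate(eigvals) if ev > 2.0]
print("largest contrast eigenvalues:", np.round(eigvals[:5], 2))
print("contrasts with eigenvalue above 2:", flagged)
```

Each column of loadings corresponds to one contrast. Items with large positive and large negative loadings on a flagged contrast are the ones whose content would be compared.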