Bibliography
- Attali, Y., W. L. Lewis and M. Steier. 2012. “Scoring with the Computer: Alternative Procedures for Improving the Reliability of Holistic Essay Scoring”. Language Testing 30 (1): 125–141.
- Bachman, L. and A. Palmer. 2010. Language Assessment in Practice: Developing Language Assessments and Justifying their Use in the Real World. Oxford: Oxford University Press.
- Barkaoui, K. 2007. “Rating Scale Impact on EFL Essay Marking: A Mixed-Method Study”. Assessing Writing 12 (2): 86–107.
- Barkaoui, K. 2010a. “Variability in ESL Essay Rating Processes: The Role of the Rating Scale and Rater Experience”. Language Assessment Quarterly 7: 54–74.
- Barkaoui, K. 2010b. “Explaining ESL Essay Holistic Scores: A Multilevel Modeling Approach”. Language Testing 27 (4): 515–535.
- Barkaoui, K. 2011. “Think-Aloud Protocols in Research on Essay Rating: An Empirical Study of their Veridicality and Reactivity”. Language Testing 28 (1): 51–75.
- Becker, A. 2011. “Examining Rubrics Used to Measure Writing Performance in U.S. Intensive English Programs”. The CATESOL Journal 22 (1): 113–130.
- Bejar, I. 2012. “Rater Cognition: Implications for Validity”. Educational Measurement: Issues and Practice 31 (3): 2–9.
- Canale, M. and M. Swain. 1980. “Theoretical Bases of Communicative Approaches to Second Language Teaching and Testing”. Applied Linguistics 1 (1): 1–47.
- Cheng, C., H. C. Yang, H. Wen-Chin and S. Gwo-Ji. 2017. “Learning, Behaviour and Reaction Framework: A Model for Training Raters to Improve Assessment Quality”. Assessment & Evaluation in Higher Education 42 (5): 705–723.
- Council of Europe. 2001. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Press Syndicate of the University of Cambridge.
- Cumming, A. 2013. “Assessing Integrated Writing Tasks for Academic Purposes: Promises and Perils”. Language Assessment Quarterly 10 (1): 1–8.
- Deane, P. 2013. “On the Relation between Automated Essay Scoring and Modern Views of the Writing Construct”. Assessing Writing 18 (1): 7–24.
- DiPardo, A., B. A. Storms and M. Selland. 2011. “Seeing Voices: Assessing Writerly Stance in the NWP Analytic Writing Continuum”. Assessing Writing 16 (3): 170–188.
- East, M. 2009. “Evaluating the Reliability of a Detailed Analytic Scoring Rubric for Foreign Language Writing”. Assessing Writing 14 (2): 88–115.
- Eckes, T. 2009. “Many-Facet Rasch Measurement”. In Reference Supplement to the Manual for Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (Section H), ed. S. Takala. Strasbourg: Council of Europe/Language Policy Division. https://rm.coe.int/1680667a23
- Eckes, T. 2012. “Operational Rater Types in Writing Assessment: Linking Rater Cognition to Rater Behavior”. Language Assessment Quarterly 9 (3): 270–292.
- Engelhard, G. and S. A. Wind. 2013. “Rating Quality Studies Using Rasch Measurement Theory”. Research Report 3. New York: The College Board.
- Enright, M. K. and T. Quinlan. 2010. “Complementing Human Judgment of Essays Written by English Language Learners with E-Rater® Scoring”. Language Testing 27 (3): 317–334.
- Esfandiari, R. and C. M. Myford. 2013. “Severity Differences among Self-Assessors, Peer-Assessors, and Teacher Assessors Rating EFL Essays”. Assessing Writing 18 (2): 111–131.
- González, E. F., N. P. Trejo and R. Roux. 2017. “Assessing EFL University Students’ Writing: A Study of Score Reliability”. Revista Electrónica de Investigación Educativa 19 (2): 91–103.
- Hamp-Lyons, L. 2007. “Worrying about Rating”. Assessing Writing 12: 1–9.
- Han, T. 2017. “Scores Assigned by Inexpert EFL Raters to Different Quality EFL Compositions, and the Raters’ Decision-Making Behaviors”. International Journal of Progressive Education 13 (1): 136–152.
- Huang, J. y C. J. Foote. 2010. “Grading between the Lines: What Really Impacts Professors’ Holistic Evaluation of ESL Graduate Student Writing?” Language Assessment Quarterly 7 (1): 37–41.
- Jonsson, A. and G. Svingby. 2007. “The Use of Scoring Rubrics: Reliability, Validity and Educational Consequences”. Educational Research Review 2 (2): 130–144.
- Kane, M. T. 2013. “Validating the Interpretations and Uses of Test Scores”. Journal of Educational Measurement 50 (1): 1–73.
- Knoch, U. 2007a. “Do Empirically Developed Rating Scales Function Differently to Conventional Rating Scales for Academic Writing?” Spaan Fellow Working Papers in Second or Foreign Language Assessment 5: 1–36.
- Knoch, U. 2007b. Diagnostic Writing Assessment: The Development and Validation of a Rating Scale. Doctoral thesis, University of Auckland.
- Knoch, U. 2009. “Diagnostic Assessment of Writing: A Comparison of Two Rating Scales”. Language Testing 26 (2): 275–304.
- Lim, G. S. 2011. “The Development and Maintenance of Rating Quality in Performance Writing Assessment: A Longitudinal Study of New and Experienced Raters”. Language Testing 28 (4): 543–560.
- Linacre, J. M. 2012. “Facets Tutorial 2”: 1–40. http://www.winsteps.com/a/ftutorial2.pdf
- Linacre, J. M. 2015. Facets Computer Program for Many-Facet Rasch Measurement, version 3.71.4. Beaverton, OR: Winsteps.com
- Mendoza, A. 2014. “Las prácticas de evaluación docente y las habilidades de escritura requeridas en el nivel posgrado”. Innovación Educativa 14 (66): 147–175.
- Mendoza, A. 2015. “La selección de las tareas de escritura en los exámenes de lengua extranjera destinados al ámbito académico”. Revista Nebrija de Lingüística Aplicada 18: 106–123.
- Mendoza, A. 2017. El argumento de validación de la prueba escrita del examen de español como lengua extranjera para el ámbito académico de la UNAM. Doctoral thesis, Universidad Nacional Autónoma de México.
- Mendoza, A. 2018. “El uso de Many-Facet Rasch Measurement para examinar la calidad del proceso de corrección de pruebas de desempeño”. Revista Mexicana de Investigación Educativa 23 (77): 597–625.
- Mendoza, A. and U. Knoch. 2018. “Examining the Validity of the Analytic Rating Scale for a Spanish Test for Academic Purposes Using the Argument-Based Approach to Validation”. Assessing Writing 35: 41–55. https://doi.org/10.1016/j.asw.2017.12.003
- Myford, C. M. 2012. “Rater Cognition Research: Some Possible Directions for the Future”. Educational Measurement: Issues and Practice 31 (3): 48–49.
- Myford, C. M. and E. W. Wolfe. 2003. “Detecting and Measuring Rater Effects Using Many-Facet Rasch Measurement: Part I”. Journal of Applied Measurement 4: 386–422.
- Prieto, G. 2011. “Evaluación de la ejecución mediante el modelo Many-Facet Rasch Measurement”. Psicothema 23 (2): 233–238.
- Rezaei, A. R. and M. Lovorn. 2010. “Reliability and Validity of Rubrics for Assessment through Writing”. Assessing Writing 15 (1): 18–39.
- Wang, Z. and L. Yao. 2013. “The Effects of Rater Severity and Rater Distribution on Examinees’ Ability Estimation for Constructed-Response Items”. ETS Research Report No. RR-13-23.
- Weigle, S. C. 2002. Assessing Writing. Cambridge: Cambridge University Press.
- Weigle, S. C. and K. Parker. 2012. “Source Text Borrowing in an Integrated Reading/Writing Assessment”. Journal of Second Language Writing 21: 118–133.
- Wind, S. A. and G. Engelhard. 2013. “How Invariant and Accurate are Domain Ratings in Writing Assessment?” Assessing Writing 18 (4): 278–299.
- Wind, S. A., C. Stager and Y. J. Patil. 2017. “Exploring the Relationship between Textual Characteristics and Rating Quality in Rater-Mediated Writing Assessments: An Illustration with L1 and L2 Writing Assessments”. Assessing Writing 34: 1–15.
- Wolfe, E. W. and A. McVay. 2012. “Application of Latent Trait Models to Identifying Substantively Interesting Raters”. Educational Measurement: Issues and Practice 31 (3): 31–37.
- Zhao, C. G. 2012. “Measuring Authorial Voice Strength in L2 Argumentative Writing: The Development and Validation of an Analytic Rubric”. Language Testing 30 (2): 201–230.