Educational Psychology
An International Journal of Experimental Educational Psychology
Volume 37, 2017 - Issue 1: Second Language Writing
Editorial

Role of assessment in second language writing research and pedagogy


Research into second language writing has developed in depth and scope over the past few decades, and researchers have shown growing interest in new approaches to the teaching and assessment of writing. The provision of diagnostic and/or (automated) corrective feedback (Lee & Coniam, 2013; Liu & Kunnan, 2016), the prediction of writers’ ability from psycholinguistic features of their essays (Riazi, 2016) and rater performance (Schaefer, 2008) are but a few major research streams in second language writing. Such new approaches have been debated extensively in the scholarly literature, but there remains a need to investigate new issues emerging from these fields in different environments. Specifically, the role of assessment in writing, the validity of the uses and interpretations of qualitative feedback and scores, and the effectiveness of genre-based approaches to writing continue to be major causes of concern for practitioners and researchers alike.

Assessment is a burgeoning field in second language writing research and can be characterised as comprising three fundamental subfields. The first line of research focuses on the development of assessment instruments for various purposes and stakes, including proficiency and/or in-class assessment. Specifically, developing and validating rubrics for diagnostic assessment (Kim, 2011) and investigating the quality of feedback provided by teachers (Bruton, 2009; Diab, 2011) are areas that have drawn intense interest. For example, Kim (2011) investigated the validity of an assessment tool (a checklist) for providing diagnostic feedback on writing, with considerable success. Kim showed that the tool – which taps into five dimensions of writing (e.g. ‘content fulfilment’ and ‘organizational effectiveness’) – is psychometrically reliable and can be used effectively to diagnose English learners’ errors. Kim’s research is among the first studies to examine the underlying structure of a diagnostic writing tool, and more research of this kind is required to explore the benefits that language learners can reap from such pedagogical instruments. Specifically, the development and validation of reliable in-class assessments, which include diagnostic and formative tasks, is an area in second language writing that requires further research attention and investment.

The second line of research in second language writing concerns the validity of the uses and interpretations of scores and qualitative feedback provided by teachers/raters as well as by automated writing evaluation (AWE) systems such as Criterion, My Access and WriteToLearn (Attali & Burstein, 2005; Dikli, 2010; Dikli & Bleyle, 2014; Liu & Kunnan, 2016). A recent meta-analysis suggests that the effectiveness and role of teacher feedback in second language writing remain an open question (Liu & Brown, 2015), which is in line with an extensive survey of the literature conducted by Van Beuningen (2010). Van Beuningen divided feedback research into the learning-to-write approach (Leki, Cumming, & Silva, 2008) and the writing-to-learn approach (Ortega, 2009). The former aims to foster students’ writing ability through continuous corrective feedback and to help them become independent and effective writers. The writing-to-learn agenda, which emerges from, for example, Manchón (2009) and Ortega (2009), takes a more quantitative approach, applying more controlled research methods and looking into the psycholinguistic and (meta)cognitive aspects of learning (Van Beuningen, 2010). This stream of research provides convincing evidence that, despite all articulated objections and controversies, providing corrective feedback to students can facilitate their learning and have enduring effects on some writing skills (Van Beuningen, 2010; Van Beuningen, De Jong, & Kuiken, 2012), although considerably more time and effort are required to enhance discourse-level writing skills (Truscott & Hsu, 2008). By contrast, research on the accuracy and precision of qualitative feedback generated by AWE systems is inconclusive (Stevenson & Phakiti, 2014), though some AWE systems have been shown to achieve high scoring reliability (Liu & Kunnan, 2016). Despite these validity challenges, the provision of feedback – whether by teachers or AWE systems – continues to be a widely accepted practice in second language writing programmes, thus warranting further investigation.

The third line of research focuses on the effectiveness of genre-based approaches to writing. Like L1 students, L2 students are required to participate in and produce the types of writing deemed acceptable in particular university disciplines. The genre-based approach to teaching writing to university students in general, and to second language students in particular, has received growing attention from different stakeholders in recent decades. Genre theory and pedagogy are usually attributed to three traditions (Hyon, 1996): the Sydney School (Martin, 1992), which draws on Systemic Functional Linguistics (SFL; Halliday, 1987, 1994); English for Specific Purposes (ESP; Swales, 1990); and the North American New Rhetoric (Freedman & Medway, 1994; Miller, 1984). Flowerdew (2002) postulated that these three theoretical traditions represent two approaches to genre theory and pedagogy: a text-based approach and a situation-oriented approach. Johns (2008) has referred to the text-based approach as ‘genre acquisition’ and to the situation-based approach as ‘genre awareness’. A text-based or genre acquisition approach puts the main focus on the linguistic and generic features of texts; both SFL and ESP may be considered text-based approaches to genre theory and practice. The idea is that, through analysis of different genre types related to different disciplines, students can be trained to learn the terminology as well as the generic features of texts in particular fields such that they can reproduce them. A critique of the text-based approach is that ‘genre’ is not located in specific texts; rather, genre is rooted in socially situated communicative events. For students to learn to produce appropriate genres, therefore, they need to become familiar with the sociocultural norms of the disciplines (i.e. develop genre awareness) and be able to negotiate texts for specific situations. This view rests on the grounds that the context in which people work or write influences how they think, and vice versa. New Rhetoric, based on activity theory (Russell, 1997), thus goes beyond the concept of genre as text and considers ‘the immediate contexts in which texts from a genre are produced, the roles of readers and writers in those texts, their ideologies, and the communities to which they belong and many other factors influencing writers’ (Johns, 2008, p. 243). However, Johns asserted that the ESP and SFL text-based approaches lend themselves better to genre pedagogy – especially for novice writers – than does the New Rhetoric approach. Thus, when it comes to the effectiveness of genre-based approaches to writing, stakeholders need to consider the potential of all three approaches to genre theory and pedagogy.

In the second part of this editorial, we present a synopsis of the individual articles, selected through a rigorous review process for inclusion in this Special Issue.

Review of the special issue

The current Special Issue comprises four research articles alongside an Epilogue that, together, investigate the aforementioned areas. The articles make several contributions to the field of second language writing assessment by addressing the themes stressed in the call for proposals. First, the approach applied in the articles is primarily quantitative, although this is not to underestimate the value of qualitative approaches or mixed-methods research. The quantitative approach has multiple advantages: it allows generalisation of findings to broader populations; it helps establish statistical relationships between variables in large samples; and it eases the communication of findings to teachers, the general public and researchers. Second, the articles examine the assessment of learning as well as AWE systems in various parts of the world, namely North America, the Middle East and eastern Asia, thereby offering new insights into the application of genre-based teaching alongside electronic raters. The Middle East and eastern Asia, despite their high potential for language learning and assessment programmes, are particularly underrepresented in the extant scholarly literature (see Aryadoust & Fox, 2016), and the Special Issue seeks to highlight some of the programmes and practices currently ongoing in these regions. Third, the articles provide ample discussion of the development of assessment rubrics, materials and/or validation studies, allowing future researchers to replicate the studies and practitioners to adapt the techniques and methodologies pertinent to their environments. Although the primary focus of the Special Issue is assessment, the articles also discuss pedagogical implications, thereby integrating research and practice. Owing to this focus, the articles cater to the needs and interests of practitioners and researchers in the field of second language writing. In the Epilogue to the Special Issue, the editors provide guidelines and directions for future research into assessing second language writing.

The four articles included in this Special Issue are briefly introduced below.

Automated writing evaluation

The first article, by Ranalli, Link and Chukharev-Hudilainen, addresses several issues surrounding the use of AWE for formative assessment of second language writing. Ranalli et al. grounded their research in Kane’s (2015) validity argument framework to examine the accuracy and usefulness of the qualitative feedback provided by Criterion. They carried out two studies to examine the evaluation inference (i.e. whether the feedback furnished by Criterion is accurate) and the utilisation inference (i.e. whether the uses of the scores furnished by Criterion are supported by evidence). The researchers reported that Criterion is ‘adequately’ accurate at identifying errors. This finding differs from previous research where Criterion was used in other contexts; nevertheless, Criterion’s accuracy at detecting certain error types was notably low, and it remained insensitive to several error types, indicating potential rebuttals against the evaluation inference. The observed differences in Criterion’s performance between this study and previous research also warrant further research to demarcate Criterion’s scope of use and to identify the sources of variation across studies and contexts. In their second study, which draws on Chapelle, Cotos, and Lee (2015), Ranalli et al. reported that students managed to rectify slightly more than half of the identified errors. Overall, the study suggests that Criterion is more viable for formative assessment in low-level writing courses where focus on form is the crux of the course, and likely less useful in high-level writing courses where the emphasis is on the development of written communication skills. One consideration in augmenting AWE systems would be the provision of specific information concerning the type and location of errors as well as guidelines for rectifying them. Ranalli et al.’s study further demonstrates the flexibility of the validity argument framework in AWE research, as further assumptions can be spelled out and examined in different contexts where different facets and parameters play a part.
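To make the evaluation inference concrete: checking the accuracy of AWE feedback typically involves comparing system-generated error flags against human annotations of the same essays. The following Python sketch is purely illustrative – it is not Ranalli et al.’s actual procedure, and all names and data in it are invented – but it shows the kind of precision/recall computation such a check implies.

```python
# Illustrative sketch of the evaluation inference: how well do AWE error
# flags agree with human annotations? All data below are invented.

from typing import Set, Tuple

def flag_accuracy(awe_flags: Set[Tuple[int, int]],
                  human_flags: Set[Tuple[int, int]]) -> dict:
    """Precision/recall of AWE error flags, where each flag is a
    (sentence_index, token_index) pair marking a detected error."""
    true_positives = awe_flags & human_flags
    precision = len(true_positives) / len(awe_flags) if awe_flags else 0.0
    recall = len(true_positives) / len(human_flags) if human_flags else 0.0
    return {"precision": precision, "recall": recall}

# Hypothetical essay: the AWE system flags four errors, humans annotate five.
awe = {(0, 3), (1, 7), (2, 2), (4, 1)}
human = {(0, 3), (1, 7), (3, 5), (4, 1), (5, 0)}
print(flag_accuracy(awe, human))  # precision 0.75, recall 0.6
```

Low recall on particular error categories in such a comparison would correspond to the insensitivity to certain error types that Ranalli et al. observed.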

Another paper in this Special Issue, by Bai and Hu, relates to students’ use of feedback from Pigai, an AWE system developed for EFL students in China. The software uses a corpus to assess students’ writing quality, building a scoring model by calculating the quantitative differences between the vocabulary, grammar, structure and content of an essay and those of pre-scored essays. Bai and Hu report on a study in which 30 undergraduate students in China majoring in English evaluated the validity and accuracy of Pigai feedback, the extent to which they used the feedback in revising their drafts, and their perceptions of Pigai. The researchers reported that the students attended more to formal feedback (mechanics and grammar), and produced more successful revisions with it, than they did with meaning-preserving feedback (collocations and synonyms). Bai and Hu also reported that the students mostly made superficial revisions, such as spelling, capitalisation and spacing for punctuation, and revised most of the grammatical errors, but not collocations and word choices.
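The corpus-comparison logic described above can be sketched schematically. Pigai’s actual model is proprietary, so the feature set, the distance metric and the nearest-neighbour averaging below are assumptions made purely for illustration of the general technique.

```python
# A minimal sketch of corpus-referenced essay scoring in the spirit of the
# description above; not Pigai's actual algorithm. Features, metric and k
# are illustrative assumptions.

import math
from typing import List, Tuple

def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two essay feature vectors
    (e.g. vocabulary, grammar, structure and content indices)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_score(essay: List[float],
                  corpus: List[Tuple[List[float], float]],
                  k: int = 3) -> float:
    """Average the scores of the k pre-scored essays closest to the input."""
    nearest = sorted(corpus, key=lambda item: distance(essay, item[0]))[:k]
    return sum(score for _, score in nearest) / k

# Hypothetical pre-scored corpus: (feature vector, human score).
corpus = [([0.62, 0.70, 0.55, 0.60], 78.0),
          ([0.40, 0.45, 0.35, 0.50], 62.0),
          ([0.80, 0.85, 0.75, 0.82], 90.0),
          ([0.58, 0.66, 0.50, 0.57], 75.0)]
print(predict_score([0.60, 0.68, 0.52, 0.59], corpus))  # ≈ 71.7
```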

Broadly speaking, an advantage of AWE systems might be that they provide immediate feedback to student writers, who can use that feedback to revise their drafts. We are still, however, far from making generalisations about the usefulness of AWE systems for pedagogy and assessment.

Genre-based approach to writing

The third research article was contributed by Rakedzon and Baram-Tsabari, who addressed the question of whether an English academic writing module can enhance graduate students’ general academic and popular science writing skills. The researchers used a quasi-experimental design to investigate the effectiveness of teaching students to produce popular science writing, with pre- and post-assessment tasks to examine the effectiveness of the treatment, i.e. whether the writing course improved students’ academic and popular science writing skills. The study found significant improvement in the students’ academic and popular science writing skills, alongside gains in their English language proficiency.
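For readers less familiar with pre-/post-test designs, gains of this kind are commonly quantified as a standardised effect size. The sketch below uses a paired Cohen’s d with invented scores; it illustrates the general technique, not Rakedzon and Baram-Tsabari’s actual analysis.

```python
# Generic pre-/post-test gain quantification: paired Cohen's d.
# The scores below are invented for demonstration only.

import statistics

def paired_cohens_d(pre: list, post: list) -> float:
    """Standardised mean gain: mean(post - pre) / SD of the differences."""
    diffs = [b - a for a, b in zip(pre, post)]
    return statistics.mean(diffs) / statistics.stdev(diffs)

# Hypothetical writing scores (0-100) before and after a course.
pre = [55, 60, 48, 72, 66, 58, 63, 51]
post = [62, 58, 55, 75, 74, 57, 70, 56]
print(round(paired_cohens_d(pre, post), 2))  # ≈ 1.09
```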

The importance of Rakedzon and Baram-Tsabari’s paper relates to the need for future scientists to be able to communicate complicated scientific findings in plain language understandable by the public. Although there is a plethora of research on general academic writing, there is a paucity of studies in which students are trained to write about and communicate scientific results to a general audience. Studies like this are thus useful because they provide curriculum developers and assessment task designers with guidelines for designing courses that help future scientists learn to communicate significant findings in their fields to society.

While academic writing in general and scientific writing in particular emphasise communicating new research findings in specialised journals, popular scientific communication is mostly published in magazines, newspapers, blogs and social media. The prototypical genre of popular science is the news article. Compared to conventional academic and scientific articles, news articles follow an inverted pyramid format that foregrounds the results – the part most important to the public – and places less emphasis on methodology and background. Students therefore need to become familiar with this new and essential genre in their university writing courses. Rakedzon and Baram-Tsabari’s paper elaborates on how, through a 14-week course mainly aimed ‘to prepare graduate students for the demands of academic writing in English’, 177 L2 science and engineering graduate students learned to produce popular science writing.

Cognitive diagnostic assessment of writing

The last research article in the Special Issue was written by Xie, who applied the reduced Reparameterized Unified Model (RUM; also known as the reduced fusion model), a type of Cognitive Diagnostic Modelling (CDM), to investigate the validity of a diagnostic checklist for Chinese students’ academic writing in English. Xie argued that diagnostic writing assessment presents a set of challenges, not least the provision of useful, learning-conducive feedback and the validation of the assessment instruments. She reviewed the available diagnostic models and elaborated on fundamental concepts in CDM, such as the Q-matrix and the RUM parameters. Xie adapted Kim’s (2011) diagnostic checklist, which consists of several writing skills each assessed by multiple items. A series of iterative RUM analyses then resulted in a parsimonious model with higher discrimination between masters and non-masters of the writing skills. The validated checklist provides more useful information than traditional raw scores and differentiates between high-, mid- and low-ability writers. Each of these groups had a specific skill mastery/non-mastery pattern, which would be intractable to recover from raw scores.
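For orientation, the reduced RUM’s item response function can be stated compactly. In its standard formulation (a general statement of the model, not Xie’s specific parameter estimates), the probability that examinee $j$ answers item $i$ correctly, given the binary skill mastery profile $\boldsymbol{\alpha}_j$, is

$$
P(X_{ij} = 1 \mid \boldsymbol{\alpha}_j)
  = \pi_i^{*} \prod_{k=1}^{K} \left( r_{ik}^{*} \right)^{\, q_{ik}\,(1 - \alpha_{jk})}
$$

where $q_{ik}$ is the Q-matrix entry indicating whether item $i$ requires skill $k$, $\pi_i^{*}$ is the probability of a correct response for an examinee who has mastered every skill the item requires, and $0 < r_{ik}^{*} < 1$ is the penalty incurred when a required skill has not been mastered. Low $r_{ik}^{*}$ values thus mark items that discriminate sharply between masters and non-masters of skill $k$.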

Xie’s study shows that quantitative CDM feedback offers important advantages over conventional holistic approaches in which a single combined test score is generated. The results suggest that school systems and the prioritisation of certain language skills can have a profound impact on students’ mastery/non-mastery of language skills in general and writing skills in particular. Therefore, it is important to consider the role of educational environments in formulating research goals and setting expectations for students. Another important finding is that students with the same raw test scores do not necessarily possess the same set of skills, while students with different raw scores might not have markedly different skill sets, suggesting the need to investigate the utility of raw scores in educational systems. Adopting a CDM approach to assessing writing could be limited by logistic constraints, budget and the deployment of trained staff. As such, Xie proposes that self- and peer assessment of writing (or other language skills) could be considered and investigated as an alternative approach in low- and mid-stakes assessments, depending on the measurement qualities achieved by the students involved.
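The point about raw scores can be illustrated with a contrived example (invented data, not Xie’s): two students with identical raw scores can show entirely different skill profiles once their responses are read against a Q-matrix.

```python
# Contrived illustration: two students each answer 4 of 6 items correctly,
# yet their mastery patterns over three skills differ. Invented data.

# Hypothetical Q-matrix: rows are items, columns are skills (1 = required).
#              skills: organisation, grammar, vocabulary
Q = [[1, 0, 0],
     [1, 0, 0],
     [0, 1, 0],
     [0, 1, 0],
     [0, 0, 1],
     [0, 0, 1]]

# Item-level responses (1 = correct) for two hypothetical students.
student_a = [1, 1, 1, 1, 0, 0]   # strong organisation and grammar
student_b = [0, 0, 1, 1, 1, 1]   # strong grammar and vocabulary

for name, resp in [("A", student_a), ("B", student_b)]:
    raw = sum(resp)
    # Crude per-skill proxy: proportion correct on items requiring the skill.
    profile = []
    for k in range(3):
        items = [i for i in range(6) if Q[i][k]]
        profile.append(sum(resp[i] for i in items) / len(items))
    print(f"Student {name}: raw score = {raw}, skill profile = {profile}")
# Both raw scores are 4, but the profiles call for different remediation.
```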

Conclusion

The studies in this Special Issue have investigated several important issues in assessing second language writing and have discussed the implications of their findings for teaching and assessment. The studies do not provide a conclusive answer to the lingering questions in the field of second language writing, though they do shed fresh light on these questions and offer suggestions for future research.

One of the important goals of this Special Issue was to examine the application of quantitative methods in second language writing research, and we believe this goal has been fulfilled. The use of quantitative techniques has the advantage of generalisability of findings and easier verifiability of results in future research. In addition, quantitative techniques simplify the interpretation of the results of second language writing research and help teachers and researchers alike to identify and remedy potential problems that exert an overall influence on learners. With the exception of the CDM in Xie’s paper, the techniques used in the articles in this Special Issue are between-learner models, which, as stated, help improve the generalisability of the findings, although the results are not specifically applicable to individuals. For example, Rakedzon and Baram-Tsabari found that, overall, genre-based teaching influences second language popular science writing; this may not be interpreted as evidence at the within-subject level, meaning that the reported effect sizes are for the entire sample, not individuals (Borsboom, 2005).

On the other hand, CDM, as used by Xie, is a step towards within-learner assessment as it provides learner-specific information about the mastery of operationalised writing skills. Within-learner approaches to assessment can benefit the field of second language writing/learning in two important ways. Firstly, they provide fine-grained learner-specific information, thereby helping to move teaching and assessment toward more personalised techniques. Secondly, they do not prescribe the same remedies for the entire sample of learners, thereby making learning more effective. We discuss this approach further in the Epilogue to this Special Issue.

Finally, with the advent of new artificial intelligence (AI) and computational linguistics techniques, we envision that the accuracy and utility of AWE systems will increase exponentially. We recognise that these systems do not ‘think’ the way humans do; however, considering the undeniable role of AI in recent scientific advancements, the way these systems ‘think’ seems to be of less importance than the outputs of their ‘thinking’. We anticipate that as research into AWE systems grows, highly accurate algorithms will be made available to learners and teachers around the world.

The promotion and propagation of quantitative approaches to assessing writing should not be taken as an underestimation of qualitative research. We can obtain the best of both worlds by establishing a dialogue between the two fields and applying mixed-methods research (MMR). While quantitative methods are recognised for their reliability and thus the generalisability of their results, they have been criticised by some for being reductionist and focusing on the proverbial trees rather than the woods. Qualitative methods, on the other hand, are well recognised for putting problems into context and thus enabling researchers to adopt a wider scope. MMR, in which quantitative and qualitative methods are mixed at all levels of the research process, is promising for addressing issues related to second language writing (Riazi, in press). Future studies on second language writing using MMR are therefore expected through other venues and outlets.

Vahid Aryadoust
English Language and Literature Academic Group, National Institute of Education, Nanyang Technological University, Singapore
[email protected]
Mehdi Riazi
Department of Linguistics, Macquarie University, Australia

References

  • Aryadoust, V., & Fox, J. (Eds.). (2016). Trends in language assessment research and practice: The view from the Middle East and the Pacific Rim. Newcastle: Cambridge Scholars Publishing.
  • Attali, Y., & Burstein, J. (2005). Automated essay scoring with e-rater® V.2.0 (ETS Research Report No. RR-04-45). Retrieved from http://www.ets.org/Media/Research/pdf/RR-04-45.pdf
  • Borsboom, D. (2005). Measuring the mind. Cambridge, UK: Cambridge University Press. doi:10.1017/CBO9780511490026
  • Bruton, A. (2009). Designing research into the effects of grammar correction in L2 writing: Not so straightforward. Journal of Second Language Writing, 18, 136–140. doi:10.1016/j.jslw.2009.02.005
  • Chapelle, C. A., Cotos, E., & Lee, J. Y. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing, 32, 385–405. doi:10.1177/0265532214565386
  • Diab, N. M. (2011). Assessing the relationship between different types of student feedback and the quality of revised writing. Assessing Writing, 16, 274–292. doi:10.1016/j.asw.2011.08.001
  • Dikli, S. (2010). The nature of automated essay scoring feedback. Computer Assisted Language Instruction Consortium, 28, 99–134.
  • Dikli, S., & Bleyle, S. (2014). Automated essay scoring feedback for second language writers: How does it compare to instructor feedback? Assessing Writing, 22, 1–17. doi:10.1016/j.asw.2014.03.006
  • Flowerdew, J. (2002). Genre in the classroom: A linguistic approach. In A. Johns (Ed.), Genre in the classroom: Multiple perspectives (pp. 91–102). Mahwah, NJ: Lawrence Erlbaum Associates.
  • Freedman, A., & Medway, P. (Eds.). (1994). Genre and the new rhetoric. London: Taylor and Francis.
  • Halliday, M. A. K. (1987). Spoken and written modes of meaning. In R. Horowitz & S. Jay (Eds.), Comprehending oral and written language (pp. 55–82). San Diego, CA: Academic Press.
  • Halliday, M. A. K. (1994). An introduction to functional grammar (2nd ed.). London: Edward Arnold.
  • Hyon, S. (1996). Genre in three traditions: Implications for ESL. TESOL Quarterly, 30, 693–722. doi:10.2307/3587930
  • Johns, A. M. (2008). Genre awareness for the novice academic student: An ongoing quest. Language Teaching, 41, 237–252.
  • Kane, M. T. (2015). Explicating validity. Assessment in Education: Principles, Policy & Practice, 23(2), 1–14.
  • Kim, Y. H. (2011). Diagnosing EAP writing ability using the reduced reparameterized unified model. Language Testing, 28, 509–541. doi:10.1177/0265532211400860
  • Lee, I., & Coniam, D. (2013). Introducing assessment for learning for EFL writing in an assessment of learning examination-driven system in Hong Kong. Journal of Second Language Writing, 22, 34–50. doi:10.1016/j.jslw.2012.11.003
  • Leki, I., Cumming, A., & Silva, T. (2008). A synthesis of research on second language writing in English. London: Routledge.
  • Liu, Q., & Brown, D. (2015). Methodological synthesis of research on the effectiveness of corrective feedback in L2 writing. Journal of Second Language Writing, 30, 66–81. doi:10.1016/j.jslw.2015.08.011
  • Liu, S., & Kunnan, A. J. (2016). Investigating the application of automated writing evaluation to Chinese undergraduate English majors: A case study of WriteToLearn. Computer Assisted Language Instruction Consortium, 33, 71–91.
  • Manchón, R. M. (2009). Broadening the perspective of L2 writing scholarship: The contribution of research on foreign language writing. In R. M. Manchón (Ed.), Writing in foreign language contexts: Learning, teaching, and research (pp. 1–19). Clevedon: Multilingual Matters.
  • Martin, J. R. (1992). English text. Philadelphia, PA: John Benjamins Publishing Company. doi:10.1075/z.59
  • Miller, C. (1984). Genre as social action. Quarterly Journal of Speech, 70, 151–167. doi:10.1080/00335638409383686
  • Ortega, L. (2009). Studying writing across EFL contexts: Looking back and moving forward. In R. M. Manchón (Ed.), Writing in foreign language contexts: Learning, teaching, and research (pp. 232–255). Clevedon: Multilingual Matters.
  • Riazi, A. M. (2016). Comparing writing performance in TOEFL-iBT and academic assignments: An exploration of textual features. Assessing Writing, 28, 15–27. doi:10.1016/j.asw.2016.02.001
  • Riazi, A. M. (in press). Mixed methods approaches to studying second language writing. In The TESOL encyclopedia of English language teaching. Hoboken, NJ: John Wiley & Sons.
  • Russell, D. (1997). Rethinking genre in school and society: An activity theory analysis. Written Communication, 14, 504–554. doi:10.1177/0741088397014004004
  • Schaefer, E. (2008). Rater bias patterns in an EFL writing assessment. Language Testing, 25, 463–492.
  • Stevenson, M., & Phakiti, A. (2014). The effects of computer-generated feedback on the quality of writing. Assessing Writing, 19, 51–65. doi:10.1016/j.asw.2013.11.007
  • Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press.
  • Truscott, J., & Hsu, A. Y.-P. (2008). Error correction, revision, and learning. Journal of Second Language Writing, 17, 292–305. doi:10.1016/j.jslw.2008.05.003
  • Van Beuningen, C. (2010). Corrective feedback in L2 writing: Theoretical perspectives, empirical insights, and future directions. International Journal of English Studies, 10(2), 1–27.
  • Van Beuningen, C. G., De Jong, N. H., & Kuiken, F. (2012). Evidence on the effectiveness of comprehensive error correction in second language writing. Language Learning, 62(1), 1–41. doi:10.1111/lang.2012.62.issue-1
