
Spanish corpora and their pedagogical uses: challenges and opportunities

Corpus de español y sus usos pedagógicos: desafíos y oportunidades

Pages 105-115 | Received 15 Jun 2022, Accepted 15 Nov 2022, Published online: 29 Dec 2022


