Discourse measures for Basque summary grading

Iraide Zipitria, Ana Arruarte & Jon A. Elorriaga
Pages 528-547 | Received 08 Feb 2011, Accepted 03 Jul 2011, Published online: 20 Sep 2011
 

Abstract

In the context of Learning Technologies, the need to assess learning and domain comprehension in open-ended learner responses has been present in artificial intelligence and education since its beginnings. The advantage of using summaries is that they allow teachers to diagnose comprehension and the amount of information remembered from a text during the learning process. This study addresses the issue of automatically obtaining overall discourse scores from surface discourse measures for the Basque language. Global measures have been studied for cohesion, adequacy and use of language. The approach taken was to estimate the presence of the automatically gathered surface discourse measures in expert grading decisions on cohesion, adequacy and use of language. As a result, three grading decision-making regression models were obtained to estimate overall grades from text written in Basque. Next, the obtained regression models were tested on a corpus containing summaries written by learners with different degrees of summarisation maturity. The results show that the obtained grading frameworks significantly reflect human decisions and are able to discriminate differences in summarisation maturity.

Acknowledgments

This work has been partially supported by the Spanish Ministry of Education (TIN2009-14380), the University of the Basque Country UPV/EHU (UE09/09) and the Basque Government (IT421-10). We also thank the IXA natural language processing group for providing the required parsers for this study.

Notes on contributors

Iraide Zipitria received an MSc in Cognitive Science from the University of Edinburgh in 2000 and a BA in Psychology from the University of Deusto in 1997. Currently she is about to defend her thesis in the Computer Science Faculty of the University of the Basque Country. From 2003 to 2006 she taught in the Faculty of Education of the University of the Basque Country (UPV/EHU), and since 2006 she has been teaching in the Psychology Faculty of the University of the Basque Country (UPV/EHU). Her research interests involve Cognitive Science and Intelligent Tutoring Systems. Her current, more specific interests include summary assessment, discourse grading, Latent Semantic Analysis, computer based learning assessment, and human-computer interaction.

Ana Arruarte received her PhD in Computer Science from the University of the Basque Country in 1998. Since 1989 she has been teaching and researching in the Department of Computer Languages and Systems at the University of the Basque Country (UPV/EHU). Within the Computer Based Education area, her research is mainly focused on concept mapping, computer based engineering education, intelligent tutoring systems, and learning assessment.

Jon A. Elorriaga received his PhD in Computer Science from the University of the Basque Country in 1998. Since 1996 he has been a faculty member in the Department of Computer Languages and Systems at the University of the Basque Country (UPV/EHU). He has been working in the Computer Based Education area since 1991. His current research interests include open-ended assessment, concept mapping, computer based engineering education, and intelligent tutoring systems.

Notes

1. A non-Indo-European language spoken in the north of Spain and the south of France. Grammatically complex, it is an agglutinative, verb-final language with free word order. A complete English description of Basque grammar can be found in Hualde and Ortiz de Urbina (2003).

2. Pearson correlation mean.

3. The largest effect size values are highlighted in italics and the largest R² values are highlighted in bold.

4. Effect size indicates the amount of influence that changing the conditions of the independent variable had on the dependent scores (DeGroot & Schervish, 2002). f² is small at 0.02, medium at 0.15 and large at 0.35. For more information about effect sizes, see Cohen (1988).

5. In addition to the significance analysis, a power analysis (1 − β) is included to avoid β-type error, in other words, to avoid accepting an erroneous null hypothesis (DeGroot & Schervish, 2002). The power analysis ensures that, if there is a relationship, we will be able to detect it. Power values range from 0 to 1; for values greater than 0.8, the probability of a β-type error is taken to be acceptably low. Therefore, we can be confident that a relationship that exists in nature will be detected. For more information, see Cohen (1988).

6. Pearson correlation mean.

7. The lemma proportion has been found to be relevant for the Basque language in terms of word variability due to its agglutinative nature (Zipitria, Elorriaga, & Arruarte, 2006b).

8. Pearson correlation mean.

9. Cohen's d indicates the effect size of the difference between means. d is small at 0.2, medium at 0.5 and large at 0.8. For more information about effect sizes, see Cohen (1988).
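
The effect size thresholds quoted in notes 4 and 9 can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions from Cohen (1988): f² = R² / (1 − R²) for a regression model, and d as the difference between two group means divided by the pooled sample standard deviation. The function names `cohens_f2`, `cohens_d` and `label`, and the example values, are hypothetical and are not taken from the article; the article does not state which variant of each formula was used.

```python
from math import sqrt
from statistics import mean, variance


def cohens_f2(r_squared: float) -> float:
    """Cohen's f^2 for a regression model, from its R^2 (assumed definition)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return r_squared / (1.0 - r_squared)


def cohens_d(group_a, group_b) -> float:
    """Cohen's d with the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label(value: float, small: float, medium: float, large: float) -> str:
    """Map an effect size onto the small/medium/large bands given in the notes."""
    magnitude = abs(value)
    if magnitude >= large:
        return "large"
    if magnitude >= medium:
        return "medium"
    if magnitude >= small:
        return "small"
    return "negligible"


# Illustrative values only; these are not results from the study.
f2 = cohens_f2(0.5)                  # R^2 = 0.5 gives f^2 = 1.0
d = cohens_d([2, 4, 6], [1, 3, 5])   # means 4 and 3, pooled SD 2, so d = 0.5
print(label(f2, 0.02, 0.15, 0.35))   # thresholds from note 4
print(label(d, 0.2, 0.5, 0.8))       # thresholds from note 9
```

Keeping the thresholds as arguments to `label` rather than hard-coding them lets the same helper serve both f² and d, whose bands differ.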
