
Supporting primary students’ mathematical reasoning practice: the effects of formative feedback and the mediating role of self-efficacy


ABSTRACT

Mathematical reasoning is a difficult activity for students, and although standards have been introduced worldwide, reasoning is seldom practiced in classrooms. Productively supporting students during the process of mathematical reasoning is a challenge for teachers, and applying formative feedback might increase students' reasoning effectiveness. Research has shown that the relationship between formative feedback and achievement is rather indirect, e.g. conveyed through students' self-efficacy beliefs. We examined whether the formative feedback perceived by students, as part of a 10-week student training programme, supported the development of reasoning competence via self-efficacy beliefs among 1261 students in 71 primary classes. We used multi-level modelling to analyse the expected relationships. On the class level, formative feedback predicted reasoning, which was mediated by self-efficacy; on the individual level, formative feedback predicted self-efficacy, but not reasoning. The results only partially confirmed our hypotheses. We discuss explanations for these findings and present implications for teaching mathematical reasoning.

Introduction

Standards for teaching mathematics worldwide have enhanced the recognition of the importance of mathematical reasoning as one of the relevant skills for carrying out mathematical procedures. In the United States (US), for example, the Common Core State Standards expect elementary students to be able to reason abstractly and quantitatively, and to construct viable arguments (National Governors Association Center for Best Practices; Council of Chief State School Officers, Citation2010). The aim of the national curriculum in the United Kingdom is to ensure that all pupils practice complex problems and are capable of reasoning mathematically as well as developing an argument using mathematical language (Department for Education, Citation2014). The new Swiss standards for teaching mathematics, which form part of the basis for this study, include competences in mathematical practice, as outlined in the US Common Core Standards. These practices are listed in the Swiss standards in relation to mathematical content, such as numbers and quantities. It makes sense to combine mathematical practices and content because procedural and conceptual knowledge are most likely to be acquired when their presentation is combined (Rittle-Johnson & Schneider, Citation2015). Unfortunately, in the case of reasoning, corresponding tasks are not often found in Swiss (Brunner, Jullier, & Lampart, Citation2019) and US textbooks (Bieda, Ji, Drwencke, & Picard, Citation2014). Textbooks are representations of the curriculum and can be viewed as a link between the curriculum and pedagogy (Pepin & Haggarty, Citation2001). Therefore, teachers in compulsory education seldom include mathematical reasoning discourses in their lessons (Stylianides, Citation2014), making it important to offer students more experiences in mathematical reasoning in order to meet standards in all areas of mathematics, concepts, and processes.

Teachers, especially those whose instructional focus is primarily mathematical concepts and operations, are challenged when they are charged with helping students develop the skills and understanding needed for mathematical practice, such as those related to problem solving and reasoning (Schoenfeld, Citation2015). Part of that challenge is dealing productively with students’ individual approaches to problem solving—both correct and incorrect. In mathematical reasoning, there are often no straightforward ways to arrive at a solution, and students are less certain about their responses. Some students miss such certainty and seem to have low self-efficacy beliefs in mathematics (Tait-McCutcheon, Citation2008; Tanner & Jones, Citation2003). Formative assessment can be used to support students’ acquisition of mathematical reasoning skills. This method offers information on the level of a student's understanding at a point in time when the student can continue to work towards a final goal, rather than assessing what the student “knows and can do” on a final exam (Black & Wiliam, Citation2009). Providing supportive feedback, as part of formative assessment in a classroom or individually, is a central task for teachers to help develop their students’ reasoning skills (Bragg, Herbert, Loong, Vale, & Widjaja, Citation2016).

Our study covered a 10-week student-training programme that was embedded within a professional development (PD) programme related to mathematical reasoning, as part of the new standards for mathematics. This PD programme for in-service teachers in Swiss upper-primary classes included a short lecture on the use of formative feedback to sustain reasoning discourse between the teacher and student(s). Our research question was whether the perceived quality of the teachers’ feedback to students helped explain the progress shown by the students in mathematical reasoning. Research on the effects of formative feedback in real classrooms and specific subjects remains scarce (Ruiz-Primo & Min, Citation2013, p. 226). What is special about our study is that we examined these effects separately at the individual and class levels.

Mathematical reasoning

The concept of mathematical reasoning is vague, and there is no shared definition of it within the research community (Jeannotte & Kieran, Citation2017). Reasoning and argumentation are often used without distinguishing the two skills (Whitenack & Yackel, Citation2002) because they do not have well-defined boundaries (Hanna, Citation2014). Viholainen (Citation2011) sees reasoning as a process where arguments are exchanged in order to arrive at the best conclusion. Somewhat similarly, Lithner (Citation2000) described reasoning as a four-step process, consisting of a problematic situation, a strategic choice, its application, and a conclusion. By following these steps, a student engages in reasoning while using argumentation to explain how all the steps are logically connected and lead to a final substantiation. One could argue that argumentation and reasoning as individual student activities both refer to the same process in mathematics (Conner, Singletary, Smith, Wagner, & Francisco, Citation2014). Jeannotte and Kieran (Citation2017) suggested a model of mathematical reasoning for schools that consists of two main aspects: structural and procedural. The structural aspect described in Lithner (Citation2000) and Toulmin (Citation2003) refers to specific steps that guide discourse, and the procedural aspect distinguishes between different but interrelated categories of thought processes, e.g. generalising, justifying, and exemplifying (Stylianides, Citation2008).

The term “reasoning” is employed in most research, but in the Swiss mathematical standards, the basis for our study, the terms argumentation and reasoning are used interchangeably. The standard for students at the end of primary school (grade six) states: “Argumentation and reasoning require the ability to verify statements and justify or falsify results using data or arguments” (Swiss Conference of Cantonal Ministers of Education (EDK), Citation2011a, p. 40). For example, a student should justify whether a mathematical claim is either generally true or true only under certain conditions. The Swiss standards are broadly conceived in accordance with the adaptive reasoning approach of Kilpatrick, Swafford, and Findell (Citation2001), and they include descriptions of the procedures and the use of representations as part of the reasoning process. However, precise distinctions of categories of the thinking process, as described in Stylianides (Citation2008), are not addressed. Similar to Kilpatrick et al. (Citation2001), we observed a high level of interaction of reasoning with other mathematical procedures, especially with problem solving. In this context, reasoning as described in Lithner (Citation2000), involved several steps in our study, such as identifying the problem, applying solutions involving representations as needed, and justifying the procedures for others in a linguistically comprehensible way.

Although formal proofs are part of Stylianides’ (Citation2008) definition of reasoning, they are generally not a topic in primary and secondary school curricula, but are addressed in high school classes. Nevertheless, building reasoning skills, as part of a pre-formal proof, begins in primary school (Blum & Kirsch, Citation1991; Semadeni, Citation1984). In primary school, lessons on reasoning focus on explaining procedures, assumptions, and results, and making claims, predictions, and generalisations (Bezold, Citation2009). Algebraic reasoning in the lower grades can, for example, include exploring patterns and describing relationships (Lüken, Peter-Koop, & Kollhoff, Citation2014). Students can benefit from the openness of exploration and flexible validation rules that are typical of reasoning as a prelude to the stricter use of rules and symbols essential for constructing mathematical proofs (Durand-Guerrier, Boero, Douek, Epp, & Tanguay, Citation2011). Stein, Grover, and Henningsen (Citation1996) pointed out that developing mathematical reasoning skills involves the application of multiple-solution strategies (a part of the procedure) and multiple representations that can support an argument. Thus, mathematical procedures are involved in reasoning, but domain-specific knowledge can also be a part of mathematical reasoning. Indeed, the learning of conceptual knowledge can even benefit from embedding it in procedural activities (Rittle-Johnson & Alibali, Citation2001).

Mathematical reasoning in the classroom takes place by means of oral or written problems in different mathematical content areas (Whitenack & Yackel, Citation2002). For example, Fyfe and Brown (Citation2018) and Powell, Berry, and Barnes (Citation2020) studied students’ understanding of an equal sign as a specific content area within mathematical reasoning. Wyndhamn and Säljö (Citation1997), as a second example, were interested in students’ reasoning on how to calculate distances. Reasoning tasks are mostly presented as word problems (Cummins, Kintsch, Reusser, & Weimer, Citation1988), meaning the problem is embedded in a short text with one or more questions (Verschaffel, Schukajlow, Star, & Van Dooren, Citation2020). This is often accompanied by illustrations containing further information. Therefore, effective teaching and learning of mathematical reasoning both contribute to and rely on the student’s language competencies (Bragg et al., Citation2016). The first difficulty children encounter when they read a task is understanding the problem and the situation. Teachers should be aware of other language obstacles, namely receptive (reading and listening) and productive (writing and speaking) difficulties. Children whose mother tongue differs from the school language face more challenges with word problems than do students who grow up speaking the school’s dominant language (Kempert, Saalbach, & Hardy, Citation2011; Stanat & Christensen, Citation2006). One reason for this may lie in the linguistic structures inherent in mathematical structures and operations, for example, making sense of relational terms, such as “more than” or “less than”. Children with special needs exhibit similar difficulties (Kroesbergen & Van Luit, Citation2003) and frequently have inadequate problem-solving skills (Swanson, Citation2014). All children in a class, not only the higher performing students, should learn to reason mathematically.
However, support during reasoning activities, such as making and testing conjectures, framing problems, and looking for patterns (Stylianides, Citation2008), may be differentiated based on children’s competence levels (Fyfe & Brown, Citation2018).

Teachers often struggle to notice when a student’s reasoning is sensible because they confuse reasoning with content knowledge, such as using the correct method for calculation (Melhuish, Thanheiser, & Guyot, Citation2020). When teachers introduce algebraic reasoning in primary classrooms, they do so with little experience dealing with the rich and connected elements that this exercise requires, indicating that new competencies must be developed by classroom teachers (Blanton & Kaput, Citation2005). Reasoning activities call for pedagogical changes, such as increased group work, writing, extended projects, and alternative forms of assessment. These changes provide opportunities for argumentative exchanges between pupils or between pupil and teacher. Moreover, the learning path, and not only the result, should be included in the assessment. Teachers should understand that it is not sufficient to present reasoning tasks without creating an instructional environment with an emphasis on discourse. Discursive exchange is crucial for a teacher’s understanding of a student’s thought process (Brodie, Citation2010; Ginsburg, Citation2009; Sfard, Citation2001). Inadequate time for extended problem-solving processes also hinders teachers’ application of contextualised investigations and activities that motivate groups of students to explore mathematical concepts and applications, and engage in reflective writing and talking about their findings (Keiser & Lambdin, Citation1996). Self-instruction methods seem to be as effective as direct instruction in problem-solving tasks (Kroesbergen & Van Luit, Citation2003), but strong self-regulation skills are important and must be developed simultaneously. In more student-centred lesson sequences, teachers can provide support through metacognitive scaffolding, such as using teacher prompts as part of formative feedback (Ruiz-Primo & Min, Citation2013; Turner, Citation2014).
A teacher coaching students toward a specific goal is particularly important for mathematical reasoning, as children can easily become “stuck” in the problem-solving process and their perseverance can be challenged (Barnes, Citation2019). Feedback is one way of providing such guidance that can potentially boost the efficacy of the reasoning process.

Formative feedback and mediating effects

An effective feedback process should define, clarify, or articulate the evidence of learning and/or provide criteria for successfully meeting an intended goal (Ruiz-Primo & Min, Citation2013). Mathematically substantive feedback supports students with guiding information to improve their responses to mathematical tasks (Webb, Citation2004). This includes having students actively involved in the feedback process. A feedback situation is not discrete; rather, it combines looking for evidence of a student’s learning during ongoing interactions and communicating with students in on-the-fly situations (Ruiz-Primo & Min, Citation2013). Feedback that promotes mindful reflection of an ongoing process seems to be most effective in helping students improve and tends to result in the highest gains in performance (Hattie & Timperley, Citation2007). Such comments can prompt students to make meaningful and thoughtful revisions to their work on their way to the final product. Bangert-Drowns, Kulik, Kulik, and Morgan (Citation1991) reported that although feedback was positively associated with better performance in most situations, students’ work did not improve if the feedback failed to contain basic details to help them find out where they are (in their learning), where they should be going, and how they can get there. One study showed that when feedback was too complex, the effectiveness of the information was reduced (Shute, Citation2008). The study by Fyfe, Rittle-Johnson, and DeCaro (Citation2012) of mathematical problem solving found that children with little prior knowledge needed more corrective feedback (on the task), whereas those with some prior knowledge benefited more from unguided exploration. In combination with certain mathematical content, feedback can have negative effects on reasoning if children have substantial prior knowledge (Fyfe & Brown, Citation2018). 
Hence, as the authors concluded, the most effective feedback will not be the same for all learners, but will depend on specific learner characteristics. Fyfe and Brown’s (Citation2018) meta-analysis found that feedback had a much larger effect on procedural outcomes than on conceptual ones. Procedural knowledge is the competence to execute actions and apply strategies to solve problems, including those in which mathematical reasoning is necessary, whereas conceptual knowledge entails understanding a concept and related principles. However, learning procedural and conceptual knowledge often goes hand in hand (Rittle-Johnson & Schneider, Citation2015). Different types of feedback can influence mathematical practices. For instance, it is less effective for a teacher to provide right/wrong feedback about mathematical practices than it is to engage in a more substantial feedback sequence, discussing approaches to solving the problem. Research has shown that both are needed, but the best support is provided when the reasoning process is based on specific situations (Fujita, Jones, & Miyazaki, Citation2018).

In keeping with the four levels of feedback of Hattie and Timperley (Citation2007), domain-specific feedback can address simple errors in the choice of reasoning, for example, Level 1 task feedback: “You made a mistake in the calculation”. Feedback could also involve cueing to guide the search for a better solution or relationship without stating directly the correct answer, for example, Level 2 process feedback: “What step is needed next in the calculation?” (Fujita et al., Citation2018). Feedback may include comments to students about how they are self-monitoring their progress towards a goal, for example, Level 3 self-regulation feedback: “Have you checked all the steps needed for the calculation?” Last, feedback at the self level consists of feedback responses based on the student’s character traits, for example, Level 4 self feedback: “You are an excellent mathematician”. Feedback at the task level is viewed as less effective than feedback at the process level, when it is necessary to provide information to help reduce a learning gap. Feedback at the self level is even less effective because it is unrelated to the task, whereas feedback at the self-regulation level is most effective but occurs rarely (Dirkx, Joosten-ten Brinke, Arts, & van Diggelen, Citation2019; Fujita et al., Citation2018). Self-regulation in mathematical reasoning includes reflection on the information generated and the value of the processes and strategies employed. It facilitates progression from conducting trials to formulating conjectures, generalisations, and convincing arguments (Barnes, Citation2019).

Teachers often encounter difficulty with interactive, on-the-fly assessment practices for complex mathematical processes, such as reasoning (Smit, Hess, Bachmann, Blum, & Birri, Citation2019). PD programmes for formative assessment can raise teachers’ feedback knowledge, which, in turn, can generate better formative feedback for tasks related to mathematical practices (Schütze, Rakoczy, Klieme, Besser, & Leiss, Citation2017). Ideally, formative feedback should contribute to students’ internalisation of performance criteria, so that they can meet the teacher’s expectations and engage in meaningful self-assessment (Webb, Citation2004). Such feedback can also enhance students’ self-efficacy beliefs because they can identify mistakes in mathematics and revise them accordingly (Tanner & Jones, Citation2003).

Hence, feedback should be a positive learning experience for students, which enhances their mathematical self-efficacy beliefs. Teachers can do the following to build students’ self-efficacy: (a) express belief in their successful completion of the task; (b) help students set realistic goals; (c) provide feedback on how students can demonstrate the desired performance; (d) give students multiple chances to work on a solution to a problem; (e) provide missing knowledge; and (f) refer students to classmates at a similar level, who have achieved the goal (Hughes, Citation2010).

It is theoretically posited that students’ mathematical self-efficacy can affect mathematical achievement by influencing certain behavioural and psychological processes (Bandura, Citation1997). Self-efficacy affects the goals a student sets and their commitment to them. It influences the learner’s decision-making at intersections along a path to reach those goals, and the persistence to continue on that path (Bandura, Citation1993). In upper primary school classes, students have a well-developed sense of self-efficacy that is subject-specific and stems from their home, school, and peer experiences (Joët, Usher, & Bressoux, Citation2011; Schunk & Pajares, Citation2005). Peers provide vicarious experiences and are important resources for feedback (Bandura, Citation1977). Students with higher self-efficacy beliefs also exhibit higher levels of metacognitive performance, and are more persistent in a learning task than those with lower self-efficacy beliefs (Pintrich & De Groot, Citation1990). Classroom observations have shown that teacher feedback that comforts students, often in combination with praise, can help those with low self-efficacy persist in their tasks (Eriksson, Björklund Boistrup, & Thornberg, Citation2017). Urdan and Turner (Citation2007) noted that most research on self-efficacy and related beliefs has not been conducted in the classroom, and that applying its recommendations might not produce the results expected from experiments. The following are two examples of classroom situations that differ from the experimental situation: (1) students are mostly assessed on normative expectations and not on individual progress; and (2) teachers rarely discuss goals in the classroom.

To progress, students need continued confidence in their ability to master a mathematical task; they must believe in their own efficacy (Kilpatrick et al., Citation2001). Empirical research has established that mathematical self-efficacy is positively associated with mathematical performance (Pajares & Graham, Citation1999). Self-efficacy mediates the effect of the quality of instruction on mathematical achievement (Li, Liu, Zhang, & Liu, Citation2020). Fast et al. (Citation2010) reported that upper elementary students who perceived their classroom environment as more caring, challenging, and mastery oriented had significantly higher levels of mathematical self-efficacy, which, in turn, positively predicted their mathematical performance. Finally, formative feedback, which is an important component of quality instruction, seems to support students’ achievement mediated via mathematical self-efficacy (Rakoczy et al., Citation2019). Rakoczy et al. (Citation2019) and Chan and Lam (Citation2010) concluded that formative or more process-oriented feedback can be interpreted as a kind of social persuasion that fosters self-efficacy in keeping with Schunk (Citation1995). In the classroom, teachers and peers are immediate sources of social persuasion for students. The aforementioned studies underscore the important role of students’ self-efficacy as a mediator of the relationship between classroom features and students’ mathematical achievement.

Present study: research questions and hypotheses

In this study, we investigated whether and how teachers’ formative feedback perceived by students relates to students’ mathematical reasoning outcomes. Our study had high ecological validity as it was embedded within the teachers’ daily practice, although it was based on our lesson plans. We focused on oral feedback, which has received less attention in the research, but is more typical of teachers’ practices (Hargreaves, McCallum, & Gipps, Citation2000). Drawing on theoretical insights and results from a previous study (Smit, Bachmann, Blum, Birri, & Hess, Citation2017), we expected an effect of formative feedback on reasoning outcomes that was mediated by students’ mathematical self-efficacy. We investigated this effect at both the individual (student) and the clustered (class) level. Hence, we were able to comment on differences in reasoning outcomes both between classes and within a class.

Hypothesis 1: We expected the perceived formative feedback of a student to have an indirect effect on her/his mathematical reasoning outcomes via mediation by the student’s perceived mathematical self-efficacy.

Hypothesis 2: We expected the perceived formative feedback of a class of students to have an indirect effect on the class’s mathematical reasoning outcomes via mediation by the class’s perceived mathematical self-efficacy.

Methods

Design

The present observational study is part of a project titled “Feedback for mathematical reasoning”, which began in 2018, and is being conducted by two teacher-education universities in Switzerland (St. Gallen and Zug). Some of the data used in this study were derived from a prior project titled “Learning with rubrics”, which was based on an identical research design (Smit et al., Citation2017). A few teachers allowed us to videotape lessons during the earlier project, but there were not enough lessons for a representative sample. Therefore, one of the aims of the second project was to record an adequate number of lessons for a quantitative video analysis. In order to merge the data from the two projects, we left the design unchanged, but formulated new research questions, such as those presented in this study. To answer our research questions, we used a survey and a longitudinal pre-test post-test design. We measured the students’ mathematical reasoning skills at the beginning of the 10-week implementation phase (T1). During the same week, the students completed a questionnaire on attitudes and related aspects of classroom learning, such as feedback support. At the end of the implementation phase, we measured the students’ outcomes, attitudes, and perceptions of classroom learning again (T2).

Participants

Teachers and their classes were recruited with the help of an advertisement in the local teachers’ journal, in addition to personal requests. After the first round of our project in 2015, with 45 participants, we repeated the recruitment in 2019 with another 28 teachers. Thus, our sample for this study consisted of 73 full-time teachers and their classes from two regions in central and eastern Switzerland. Of these, 44 were women and 29 were men; their mean age was 40 years, and their mean length of service was 14 years. Twenty-five teachers managed a 5th grade class, 22 were responsible for a 6th grade class, and the remaining 26 teachers taught multi-grade classes. Six of them even involved their 4th grade students in the project. During the project period, two teachers and their classes withdrew from the project because of the teachers’ heavy daily classroom workload. Thus, we obtained 71 of 73 class datasets at two points in time. There were 57 4th graders, 669 5th graders, and 535 6th graders (1261 students in all). Approximately 52% of the students were boys and 48% were girls. Almost all the students were white and identified as Swiss. The official school language in the region of the study was German; however, German was not the mother tongue for 26% of the students.

Measures

Questionnaire

Data on the students’ attitudes and classroom perceptions were collected through a questionnaire. The same set of items was distributed to the students at each of the two measurement points. Background variables included information on the students’ nationality and gender. All the items are presented in the Appendix. The formative-feedback scales were constructed based on existing items (Brown, Harris, & Harnett, Citation2012; Smit, Citation2009) and adapted from the literature (Hargreaves et al., Citation2000; Hattie & Timperley, Citation2007). We understood formative feedback to be part of an episode of formative assessment and expected the students to participate in a dialogue consistent with the theory of Ruiz-Primo and Min (Citation2013). We measured the effectiveness of the teacher’s formative feedback in accordance with the three most effective levels from the model devised by Hattie and Timperley (Citation2007): task, process, and self-regulation. According to Hattie and Clarke (Citation2019), there can be powerful interactive effects between feedback aimed at different aspects. The internal consistency of the formative feedback scale was assessed: Cronbach’s α was .70 for the pre-test and .76 for the post-test. The items on self-efficacy, which were adapted from the study by Berger and Karabenick (Citation2011), were originally taken from the study by Pintrich and De Groot (Citation1990). We adapted them to the context of solving word problems and chose the term “word problem” because it was understandable to all students, which was not yet the case for the term “reasoning” at time T1. Cronbach’s α was .75 for the pre-test and .83 for the post-test.
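For readers less familiar with the internal-consistency index reported above, Cronbach’s α can be computed directly from a respondents-by-items score matrix. The following is a minimal illustrative sketch in Python (not the authors’ analysis code; the function and variable names are our own):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

When items are perfectly correlated, the function returns 1.0; values around .70-.80, as in the scales above, indicate acceptable internal consistency for research purposes.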

Mathematical reasoning test

All items measuring mathematical reasoning were either adapted from other standardised tests or were developed for this study (see examples below). The items were aligned with the national basic competences of Switzerland (Swiss Conference of Cantonal Ministers of Education (EDK), Citation2011b). The competences are described in a grid, with the content-related and the action-related aspects representing the two axes. Our reasoning items referred to three of the four content areas of grades 4–6: number and variable, functional relationships, and sizes and measures. Referring to various content areas addressed the teachers’ fear of not being able to cover all the curriculum content during the intervention period. As the items were open-ended, we expected a student to work for approximately 35–40 min on the 10 items during each session. All of the test items were rated on four levels of competence based on a rubric and a detailed manual with a description of each level.

Sample items and manual excerpts

The procedure and reliability were tested in our pilot project (Smit & Birri, Citation2014). Satisfactory inter-rater reliability was obtained for each rating team: Kappa > .70. The complete test battery consisted of 14 items, which we distributed in two testlets of 10 items each. Six items were used repeatedly to function as anchors for the Item Response Theory (IRT) calibration, and we used the weighted least squares mean- and variance-adjusted (WLSMV) estimator for ordinal data. Upon completing the IRT analyses, each student’s final scores, based on Bayesian plausible values (Von Davier, Gonzalez, & Mislevy, Citation2009), were computed for the reasoning tests. Plausible values were also calculated for students with missing data. Mplus provided imputed data sets using Rubin’s method (Rubin, Citation2004).
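Conceptually, plausible values are random draws from each student’s posterior ability distribution rather than single point estimates, which preserves measurement uncertainty in subsequent analyses. The sketch below illustrates only this core idea under the simplifying assumption that each posterior is approximately normal; it is not the Mplus procedure used in the study, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

def draw_plausible_values(post_mean, post_sd, n_pv=5):
    """Draw n_pv plausible values per student from a normal
    approximation of each student's posterior ability distribution."""
    post_mean = np.asarray(post_mean, dtype=float)
    post_sd = np.asarray(post_sd, dtype=float)
    # result shape: (n_students, n_pv)
    return rng.normal(post_mean[:, None], post_sd[:, None],
                      size=(len(post_mean), n_pv))
```

Analyses are then run once per set of plausible values and pooled, e.g. with Rubin’s method as mentioned above.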

Procedure

Separate workshops were held at each teacher-education university by the same team members. To clarify the mathematical standard, theoretical inputs on the practice of reasoning and its instruction were presented. The teachers’ experiences of working on the reasoning tasks themselves enabled a fruitful discussion of the implications for teaching. The implementation phase, which followed the workshop, consisted of 10 weeks of training, with one lesson on practising mathematical reasoning delivered each week. All teachers had to follow a strict script in the form of a detailed lesson plan. However, during our visits for the video recordings, we found that the teachers’ interpretations of the lesson plans resulted in a reasonable range of local implementations. The lesson plans had a socio-constructivist orientation, and collaborative group work or peer work (e.g. placemat) to enhance student discourse was part of almost every lesson (Knudsen, Lara-Meloy, Stevens Stallworth, & Wise Rutstein, Citation2014). In the first part of the script, the teachers were asked to discuss the quality of different reasoning examples with the class to clarify the targets. During the implementation, the students used peer- and self-assessments of their work. In the last weeks, the teachers were expected to give each student feedback on their degree of competence in mathematical reasoning. The teachers received a set of exercises for mathematical reasoning, including possible solutions. These exercises covered numerical sequences, quantities and units, the decimal system, basic operations, proportions, and estimations. Some of these tasks were purely mathematical, whereas others were related to reality and were contextualised.

Although not part of our research questions, we provide the following observations as background for interpreting our results. After the implementation phase, a second workshop was held to evaluate the outcomes and identify successes and difficulties encountered. The teachers prepared flip charts with their reflections and presented examples of student solutions and additional teaching materials.

Some of the findings are as follows. The children found it difficult to talk about their procedures or to transfer the situational context to a mathematical one in order to explore possible reasoning. It was not easy for them to transfer the mathematical representations they were shown to other similar tasks. The children found it exhausting to provide reasons. The teachers tried to ritualise and standardise the procedure to provide security for the students to be able to tackle the tasks. Group work was considered beneficial for learning to reason, and every student in the group helped find solutions for the reasoning tasks assigned. Finally, the students realised that doing mathematics was more than just “getting it right or wrong”. There could be different arguments that needed to be discussed.

Data analysis

We modelled the mediating effect of self-efficacy beliefs on the relationship between formative feedback and mathematical reasoning within a path-analysis framework. We used Mplus 8.7 to compute all models (Muthén & Muthén, Citation2017). The units of analysis were students nested within classes and their teachers; therefore, a multilevel approach was used (Bauer, Preacher, & Gil, Citation2006). This procedure assumes that teachers influence students and that individual students, in turn, influence the properties of the class. Consequently, variables may be defined at both the student and the class/teacher level. We included gender, age, and language spoken at home (German or not German) as control variables in our analysis.
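The within/between logic of such a multilevel model can be illustrated with a toy decomposition: each student’s score splits into the class mean (class level) and the student’s deviation from it (student level). The function and data below are illustrative only, not part of the study’s analysis:

```python
def decompose(scores_by_class):
    """Split each student's score into a between part (the class mean) and
    a within part (the student's deviation from that mean), mirroring the
    within/between decomposition used in multilevel modelling."""
    between, within = {}, {}
    for cls, scores in scores_by_class.items():
        mean = sum(scores) / len(scores)
        between[cls] = mean                       # class-level component
        within[cls] = [x - mean for x in scores]  # student-level deviations
    return between, within

# two hypothetical classes with student scores
b, w = decompose({"A": [1.0, 2.0, 3.0], "B": [4.0, 6.0]})
```

By construction, the within components sum to zero in every class, so class-level and student-level predictors can operate on separate parts of the variance.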

The questionnaire the students completed on their perceptions of feedback quality and self-efficacy contained missing values (between 2.4% and 8.3%). Given that these missing values were not due to the study’s design, we assumed that they occurred randomly. Thus, we initially used the full-information maximum likelihood procedure as a model-based treatment for missing data (Enders, Citation2010). In a second step, we switched to Bayesian analysis because greater precision and better estimates of the sample variance could be expected (Hox, Moerbeek, & van de Schoot, Citation2018). Bayesian analysis is an attractive alternative to maximum likelihood estimation, especially for small sample sizes and multi-level models with random effects (Muthén, Citation2010; Muthén & Asparouhov, Citation2012). Priors can help optimise small variance parameters; Mplus uses a series of default priors, which we did not alter because no informative priors were available from other studies. Posterior distributions were checked for plausibility. Runs with different numbers of iterations were performed to determine convergence, and the potential scale reduction factor values were checked as well (van de Schoot & Depaoli, Citation2014). If this factor is near 1.0, the two Markov chains are said to have converged, meaning that trustworthy parameters have been produced. We also examined trace plots to check the burn-in phase. A 95% confidence interval was produced for the difference in the fit statistic between the real and replicated data (Muthén & Asparouhov, Citation2012).
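As a sketch of the convergence check mentioned above: the potential scale reduction factor compares between-chain and within-chain variance, and values near 1.0 signal convergence. A minimal version follows; it is illustrative and not Mplus’s exact implementation, which uses its own defaults:

```python
def potential_scale_reduction(chains):
    """Gelman-Rubin potential scale reduction factor for m Markov chains
    of equal length n. Values near 1.0 indicate convergence."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # between-chain variance (variance of chain means, scaled by n)
    b = n * sum((mu - grand) ** 2 for mu in means) / (m - 1)
    # mean within-chain variance
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_hat = (n - 1) / n * w + b / n   # pooled variance estimate
    return (var_hat / w) ** 0.5

# two short hypothetical chains with similar but not identical trajectories
r = potential_scale_reduction([[1, 2, 3, 4], [2, 3, 4, 5]])
```

When the chains mix well, between-chain variance shrinks relative to within-chain variance and the factor approaches 1.0.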

Results

Descriptive statistics

Before reporting the results of the path analysis, we provide an overview of the correlations, means, and standard deviations of the scales used in the study (see Table 1). The score for mathematical reasoning at T1 was fixed at a mean of 0 (Var = 1), and the plausible values for the person parameters were drawn from a standard normal distribution. Hence, at T1 the mean was -.02 (0 up to rounding error), and the SD was .66 at both measurement time points. At T2, the mean mathematical reasoning score was higher: M = .60. On average, the students reported fairly good levels of mathematical self-efficacy (M = 3.99 and 4.08 at T1 and T2, respectively, on a 6-point scale). However, the students’ scores varied considerably (SD = .72 and .77), indicating full use of the scale. Perceptions of teachers’ formative feedback were also fairly high (M = 3.85 at T1) but dropped slightly over time (M = 3.63 at T2).

Table 1. Descriptive statistics and correlations of students’ perceptions and mathematical reasoning competence.

Zero-order correlations at the student level showed a positive relationship between mathematical reasoning outcomes and perceptions of self-efficacy (T1 r = .34 and T2 r = .35). Self-efficacy and formative feedback were also positively correlated, though somewhat more weakly (T1 r = .16 and T2 r = .21). At the class level, we found a strong correlation between reasoning outcomes and perceptions of self-efficacy only at T2 (T1 r = -.10 and T2 r = .52). The relationship between self-efficacy and formative feedback, however, was quite strong at both time points (T1 r = .47 and T2 r = .63). Table 1 also shows the correlations for each variable over time, from which conclusions can be drawn about the stability of the measures.

Longitudinal mediation model

Results of the first analysis of variance for the development of the students’ reasoning outcomes showed that the mean outcome at T2 (M = .60) was significantly higher (F(1, 1221) = 3629, p < .001) than at T1 (M = -.02), indicating improved mathematical reasoning skills over the implementation phase. The development can alternatively be analysed by conducting a separate regression for each class. Intra-class correlation coefficients (ICCs) showed that approximately 15% of the variance in mathematical reasoning was attributable to the class. Hence, it is reasonable to explore explanations for differences in the development separately at the student and class levels. The ICCs for formative feedback were substantial, whereas those for the perception of self-efficacy indicated little variation between the classes (see Table 1). An ICC of .04 for mathematical self-efficacy is consistent with the results of Joët et al. (Citation2011), but it is difficult to interpret.
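The ICC reported above is the share of total variance located at the class level. A minimal one-way random-effects ANOVA estimator, assuming balanced classes, is sketched below; the function and data are illustrative, not the study’s computation:

```python
def icc(groups):
    """Intra-class correlation: share of total variance attributable to the
    group (class) level, from a balanced one-way random-effects ANOVA."""
    k = len(groups)
    n = len(groups[0])  # balanced groups assumed for simplicity
    grand = sum(x for g in groups for x in g) / (k * n)
    means = [sum(g) / n for g in groups]
    # mean squares between and within groups
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    var_between = max((ms_between - ms_within) / n, 0.0)  # truncate at zero
    return var_between / (var_between + ms_within)

# toy extremes: all variance between classes vs. all variance within classes
all_between = icc([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])  # -> 1.0
all_within = icc([[0.0, 1.0], [0.0, 1.0]])             # -> 0.0
```

An ICC around .15, as for mathematical reasoning here, sits between these extremes and justifies modelling the two levels separately.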

We checked for contextual influences on the development of mathematical reasoning; however, gender, age, and language spoken at home had no effect on mathematical reasoning at T2. The absence of a gender effect was not necessarily expected, as boys typically outperform girls in mathematical reasoning tests; girls, however, tend to do better on word problems, which require reading (Liu, Wilson, & Paek, Citation2008).

Next, we established a longitudinal mediation model for the student and class levels, which were combined in a multi-level path model. Although a three-wave model would be optimal for considering the temporal lags, a longitudinal lagged effect can be modelled for each stage using only two waves of data (Cole & Maxwell, Citation2003). The product of the two paths ‘feedback T1 → self-efficacy T2’ and ‘self-efficacy T1 → math reasoning T2’ formed the indirect effect. We were also interested in the more immediate effect of feedback at T2 on math reasoning at T2, mediated by self-efficacy at T2, because many classes had little experience with math reasoning.
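The product-of-paths logic can be sketched as follows. The point estimates echo the class-level coefficients reported in the Results (a = .62, b = .49); the Monte Carlo interval function, the standard errors, and all names are our illustrative assumptions, not the authors’ Mplus computation:

```python
import random

def indirect_effect(a, b):
    """Indirect (mediated) effect as the product of the two path
    coefficients: feedback -> self-efficacy (a), self-efficacy -> reasoning (b)."""
    return a * b

def monte_carlo_ci(a, se_a, b, se_b, draws=100_000, seed=1):
    """Monte Carlo interval for a*b, sampling each path coefficient from a
    normal distribution around its estimate (illustrative values only)."""
    rng = random.Random(seed)
    prods = sorted(rng.gauss(a, se_a) * rng.gauss(b, se_b)
                   for _ in range(draws))
    return prods[int(0.025 * draws)], prods[int(0.975 * draws)]

effect = indirect_effect(0.62, 0.49)
lo, hi = monte_carlo_ci(0.62, 0.05, 0.49, 0.05)  # hypothetical standard errors
```

The Monte Carlo approach is one common way to gauge uncertainty for a product of coefficients, whose sampling distribution is not normal even when both paths are.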

The multilevel model was estimated with aggregated manifest variable indicators (means calculated from parcels of items) to reduce the number of parameters in these complex models, given the sample size (Bovaird & Koziol, Citation2012). This analysis with manifest variables assumed measurement invariance over time, whereas the use of latent variables with multiple indicators would have enabled testing for such invariance; this is a limitation of the analysis.

We first calculated the model using Mplus 8 with maximum likelihood estimation: χ2/df = .75 (1.92/4); CFI/TLI = 1.00; RMSEA = 0.000; SRMRw = 0.00; SRMRb = 0.04. The sample size for the between-level analysis was rather small; therefore, we switched to Bayesian estimation, which allows for better model estimation because large-sample theory is not needed (Muthén & Asparouhov, Citation2012). Bayesian estimation with a Gibbs sampling algorithm was used to calculate the multi-level model shown in Figure 1.

Figure 1. Two-wave longitudinal multilevel model of the standardised effects of perceived formative feedback on mathematical reasoning competence mediated by student self-efficacy perceptions. Nbetween = 71, Nwithin = 1261; **p < .01, *p < .05.


In response to our first research question, the paths presented in Figure 1 show that, at the individual (i.e. student) level, there was no significant effect of perceived formative feedback at T2 on mathematical reasoning at T2, and no mediation effect of perceived self-efficacy at T2. There was, however, a direct effect of formative feedback on self-efficacy (β = .19). All autoregressive paths (T1–T2) were also significant, with self-efficacy being the most dynamic variable over time and mathematical reasoning the most stable. With respect to the cross-lagged effects, the only path with a considerable beta effect size ran from self-efficacy at T1 to mathematical reasoning at T2 (β = .23); a second path, from mathematical reasoning at T1 to self-efficacy at T2, reached significance but had a very small effect size (β = .04). Additional comments on these results follow in the Discussion section.

In the next step, we turned to the class-level analysis, which yielded results different from those of the student part of the model in Figure 1. With respect to the second research question, the class-level model showed an indirect effect of perceived formative feedback at T2 on mathematical reasoning at T2, which was mediated by self-efficacy at T2. Formative feedback significantly predicted self-efficacy in the classes (β = .62), and self-efficacy, in turn, predicted mathematical reasoning (β = .49) at the end of the implementation phase. The direct effect of formative feedback (T2) on mathematical reasoning (T2) was not significant. Two autoregressive paths indicated stability: formative feedback (β = .72) and mathematical reasoning (β = .83). For self-efficacy, however, the regression coefficient was negative (β = -.39), although not significant, indicating a reversal of the mean class perceptions: classes with rather high perceptions of self-efficacy at the start had low perceptions at the end of the implementation phase, and vice versa. Finally, all the cross-lagged paths had small regression coefficients, and none was significant. Overall, the longitudinal multi-level model explained 84% of the variance in mathematical reasoning competence at the student level and 89% at the class level at the end of the implementation phase.

Discussion and conclusions

Summary

All classes exhibited improved reasoning competence. Thus, a short training period of 10 weeks, including a final test, led to a gain in skills appropriate for mathematical practice. Did formative feedback, as an instructional tool, support this development? We only confirmed our second hypothesis. Formative feedback (T2) predicted mathematical reasoning competence (T2), which was mediated by self-efficacy (T2) at the class level. At the individual level, formative feedback predicted self-efficacy, but not mathematical reasoning competence, which is contrary to our first hypothesis. However, at the individual level, self-efficacy (T2) was also predicted by previous mathematical reasoning competence (T1). Thus, an individual student’s self-efficacy beliefs possibly arose from two sources: experience from past performance on examinations and more immediate feedback from the teacher. These sources seemed to take time to be effective for learning, as individual self-efficacy beliefs appeared to be quite stable. Nevertheless, returning to the second hypothesis, in classrooms where teachers provided more formative feedback indirectly – via self-efficacy beliefs – students showed a higher mean class achievement in mathematical reasoning. Hence, how a teacher supports a class in reasoning with word problems plays a role in students’ class achievement in mathematical reasoning.

These results are partly in line with Li et al. (Citation2020) and Fast et al. (Citation2010), who found that the effects of classroom quality were mediated by mathematical self-efficacy on both individual and class-achievement levels. Similar to our results, and related to formative feedback, Rakoczy et al. (Citation2019) observed small non-significant effects at the individual level that were mediated by self-efficacy alone. Hence, in the next section, it is relevant to discuss the effectiveness of the formative feedback specific for each level.

Interpretations

While our results showed that formative feedback is helpful during reasoning discourse in a class, the question remains why we failed to find a mediated relationship between formative feedback and reasoning competence at the individual level. The model did demonstrate the relationship, but delayed in time: self-efficacy beliefs first need to grow before formative feedback can become effective.

Possible explanations for different individual relationships between formative assessment and mathematical reasoning will be discussed in more detail because this result was unexpected, and an overall effect on the class level was found, as expected. Our focus in explaining these relationships is the individual learner’s characteristics. According to Schoenfeld (Citation1985), the reasoning a student applies while trying to solve a problem depends not only on the task’s characteristics, but on the relationship between the task and the problem-solver acting in a social context. Thus, the characteristics of the student and the student’s class will influence the quality of the reasoning applied to the task. As part of the analysis preceding the construction of a final multi-level model, some of the students’ characteristics (the control variables gender, age, and language spoken at home) were tested, but had no effect. In the case of gender, this finding could be explained by the task’s characteristics where reading competence (i.e. understanding the semantic relations between the given and unknown quantities of the problem) could compensate for the often-reported lower mathematical performance of girls (Lubienski, Robinson, Crane, & Ganley, Citation2013). We did not find any age differences. A somewhat similar result was found by Hilbert, Bruckmaier, Binder, Krauss, and Bühner (Citation2019), who reported no significant differences among second, third, or fourth graders in mathematical reasoning. We examined differences in the students’ mother tongue, as some tasks were situated in particular contexts and required reading competence. We did not find any effects of this control variable, perhaps because the children with a different mother tongue had minimal proficiency in German and were excluded from the reasoning test by the teacher. In the following section, we consider learner characteristics based on the student’s general mathematical knowledge level.

Fyfe and Brown (Citation2018) showed that the effects of feedback depend on prior knowledge. Their results could lead to a more complex understanding of the interplay between teacher feedback and reasoning competence because they indicate that these effects depend on the level of mathematical practice a student already demonstrates. Based on their results, Fyfe and Brown suggested that learners with prior knowledge should be allowed to practice without feedback and engage in deep, task-relevant processing without interruption. However, we think that teachers allow this anyway. In most classrooms, one will probably encounter teachers spending more time with lower achieving students than they do with those who have more knowledge and skills. Our classroom video analysis, which is in progress, may be able to verify this statement. Nevertheless, prior knowledge, as another intervening variable, should be investigated in future research on formative feedback in more detail.

Based on notes from some of the spontaneous conversations with the teachers in our final workshop, we think some students’ difficulties in mathematical reasoning were not only the result of lack of mathematical knowledge. Lack of awareness about activating and regulating their knowledge and justifying their actions and conclusions might also have contributed to their difficulties (De Corte, Verschaffel, & Op't Eynde, Citation2000; Schoenfeld, Citation2012). Knudsen et al. (Citation2014) suggested fostering students’ reasoning creativity by asking them to think beyond what they already know. This activity should produce conjectures that can be discussed as either true or false, individually and in groups. Building on other people’s ideas and discussing them in the classroom is the basis for the social process of reasoning. Students should give each other feedback instead of merely contributing individual statements. It is difficult for teachers and students to recognise that statements that turn out to be false are beneficial for learning because they have a potential role to play in the development of reasoning. Recognising this potential requires a teacher to have a high level of flexible content knowledge. The practice of teaching reasoning demands that teachers are explicit about what is expected of students and what makes for good reasoning (Knudsen et al., Citation2014). Transparency of goals is an important feature of formative assessment, and teacher feedback should always be related to goals (Hattie & Gan, Citation2011).

It may be worthwhile to examine the differential effects on students with high and low efficacy, although our model found a relationship between feedback and self-efficacy for all students. Hattie and Clarke (Citation2019) cautioned that students with low self-efficacy are in danger of losing motivation when they receive discomforting feedback. These students attribute negative feedback to their perceived abilities and give up working on reasoning tasks. A teacher’s feedback should be part of a longer episode so that an unfavourable attribution can be adjusted within the student-teacher dialogue (Hargreaves, Citation2013).

Our measurements of formative feedback address all three questions suggested by Hattie and Timperley (Citation2007): Where am I going? How am I going? Where to next? Our results showed that our construct of formative feedback and the related items were able to predict – via self-efficacy – differences in class achievement in mathematical reasoning. The students’ perceptions of teachers’ formative feedback were fairly high but decreased slightly from T1 to T2. This finding could indicate that teachers encountered greater difficulty supporting their students in solving reasoning problems than during lessons on more traditional topics, such as numbers and operations. As we stated before, for many teachers the application of tasks requiring mathematical reasoning was new. Perry, Davies, and Brady (Citation2020) indicated that video clubs may be a way to develop teachers’ feedback competencies through collaborative reviews of lessons, in order to improve the support provided to students during reasoning activities.

Limitations

Some methodological limitations of the study should be considered when interpreting the results. One limitation is that the teacher-training programme was rather short. We had to limit the project’s duration out of consideration for the teachers, who were under pressure to cover the standard subject content for the school year. It would be interesting to know whether longer training in mathematical reasoning – for example, over one year – would yield different effects in the multilevel model. Wolbers, Dostal, Skerrit, and Stephenson (Citation2017) showed that the duration of a PD programme was related more to change in the teachers, whereas the impact on students’ learning remained unclear.

Pupils’ perceptions are subject to distortions. There is also the question of whether students can assess the quality of a teacher’s feedback in a reliable and valid manner. Van der Scheer, Bijlsma, and Glas (Citation2019) reported reliable values for primary-school pupils’ perceptions of classroom-related teacher actions. In a previous study, we calculated the agreement between teachers’ and students’ perceptions of formative feedback, and the association (r = 0.40) was satisfactory (Smit et al., Citation2017). Furthermore, students’ views on mathematical instruction in general may be mixed with their views on reasoning in particular. We do not know whether the students answered our survey questions with reasoning tasks in mind, or to what degree their ratings related to classroom mathematics in general. It is reasonable to assume that the teachers’ feedback support in general did not differ substantially from their feedback support during lessons on mathematical reasoning, although a teacher may be more challenged to provide helpful feedback during such lessons. Our instrument to measure formative feedback was sufficiently sensitive to predict class variance; however, another methodological limitation may be that the measures should focus more on students’ progress over time. Students need feedback at all levels, though not all at the same time; feedback should be adapted to the current step of the problem-solving process (Hattie & Gan, Citation2011). It is possible that after a first round of becoming acquainted with mathematical reasoning, a shift from task-related feedback to self-regulation feedback could be more effective. Measurements of perceived feedback during training could be taken more frequently to investigate this possibility.

Finally, our results were derived from an observational study; thus, the research does not permit making causal inferences. Future research should use an experimental, control-group design. Additional observations in the field could have provided a clearer picture of what happened in the classroom. Our study included a video recording of one lesson in each class. We will be able to add information from our classroom observations to our quantitative data in a later phase of our research, which should allow for further interpretation.

Conclusions

Models of thinking and reasoning should be dynamic and consider the full system in which knowledge change occurs (Fyfe & Brown, Citation2018). This study of change in primary students’ reasoning competence after a 10-week training programme pertains to three factors: learner characteristics (self-efficacy beliefs), learner achievement (in mathematical reasoning), and learning tasks, including feedback-supported classroom instruction. Based on our results, recommendations for teaching reasoning should consider the dynamic interplay of these three factors. Teachers must develop rich instruction for discourse and present open reasoning tasks to students with different levels of knowledge and practice skills. Such instruction should involve the teacher motivating and guiding students by providing meaningful feedback attuned to each individual’s zone of proximal development (Vygotsky, Citation1978). Teachers should try to align their feedback with each student’s learning level in order to support their best reasoning attempts. There does not seem to be one “best” feedback for every student, but providing feedback is generally beneficial for learning.

Teachers should be aware that feedback affects a learner’s self-efficacy beliefs and reasoning competence at the same time. If feedback has a negative impact on students’ self-efficacy beliefs, their motivation to persevere with the problem may be endangered (Pintrich & De Groot, Citation1990).

A fruitful reasoning discourse in class may be attributed to positive feedback practice, but for some students, growth in reasoning competence may take time. For developing students’ reasoning and their subsequent success in higher grades, the teacher’s support for student learning remains a crucial factor (Hsieh, Horng, & Shy, Citation2011). Feedback from our participating teachers indicated that the students’ endurance decreased over time, probably because their curiosity waned. This finding shows that a core aim for mathematics teachers is to motivate students to persevere with reasoning tasks, as these require more persistence before success is achieved. Students need plenty of time to work on reasoning tasks and to practice the skills of discourse in order to become better communicators and mathematicians (Rothermel Rawding & Wills, Citation2012). These conversations can increase students’ confidence in their communication skills and enhance self-efficacy beliefs for mathematical practices such as reasoning. Our results indicate that teachers’ formative feedback is a key element in instructional conversations in class – about, for example, mathematical patterns or claims – and between teachers and students in on-the-fly learning situations.

Acknowledgements

The present project was funded by the Swiss National Science Foundation (Project no. 100019_179230).

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84(2), 191–215. doi:10.1037/0033-295X.84.2.191
  • Bandura, A. (1993). Perceived self-efficacy in cognitive development and functioning. Educational Psychologist, 28(2), 117–148.
  • Bandura, A. (1997). Self-efficacy: The exercise of control. New York: W.H. Freeman and Company.
  • Bangert-Drowns, R. L., Kulik, C.-L. C., Kulik, J. A., & Morgan, M. (1991). The instructional effect of feedback in test-like events. Review of Educational Research, 61(2), 213–238.
  • Barnes, A. (2019). Perseverance in mathematical reasoning: The role of children’s conative focus in the productive interplay between cognition and affect. Research in Mathematics Education, 21(3), 271–294. doi:10.1080/14794802.2019.1590229
  • Bauer, D. J., Preacher, K. J., & Gil, K. M. (2006). Conceptualizing and testing random indirect effects and moderated mediation in multilevel models: New procedures and recommendations. Psychological Methods, 11(2), 142.
  • Berger, J.-L., & Karabenick, S. A. (2011). Motivation and students’ use of learning strategies: Evidence of unidirectional effects in mathematics classrooms. Learning and Instruction, 21(3), 416–428. doi:10.1016/j.learninstruc.2010.06.002
  • Bezold, A. (2009). Förderung von Argumentationskompetenzen durch selbstdifferenzierende Lernangebote. Eine Studie im Mathematikunterricht der Grundschule [Fostering reasoning competences through self-differentiating learning opportunities: A study in primary school mathematics teaching]. Kovac.
  • Bieda, K. N., Ji, X., Drwencke, J., & Picard, A. (2014). Reasoning-and-proving opportunities in elementary mathematics textbooks. International Journal of Educational Research, 64, 71–80. doi:10.1016/j.ijer.2013.06.005
  • Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability (Formerly: Journal of Personnel Evaluation in Education), 21(1), 5–31. doi:10.1007/s11092-008-9068-5
  • Blanton, M. L., & Kaput, J. J. (2005). Characterizing a classroom practice that promotes algebraic reasoning. Journal for Research in Mathematics Education, 36(5), 412–446.
  • Blum, W., & Kirsch, A. (1991). Preformal proving: Examples and reflections. Educational Studies in Mathematics, 22(2), 183–203. doi:10.2307/3482408
  • Bovaird, J. A., & Koziol, N. A. (2012). Measurement models for ordered-categorical indicators. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 495–511). New York, NY: The Guildford Press.
  • Bragg, L. A., Herbert, S., Loong, E. Y.-K., Vale, C., & Widjaja, W. (2016). Primary teachers notice the impact of language on children’s mathematical reasoning. Mathematics Education Research Journal, 28(4), 523–544. doi:10.1007/s13394-016-0178-y
  • Brodie, K. (2010). Teaching mathematical reasoning in secondary school classrooms. New York: Springer.
  • Brown, G. T. L., Harris, L. R., & Harnett, J. (2012). Teacher beliefs about feedback within an assessment for learning environment: Endorsement of improved learning over student well-being. Teaching and Teacher Education, 28(7), 968–978. doi:10.1016/j.tate.2012.05.003
  • Brunner, E., Jullier, R., & Lampart, J. (2019). Aufgabenangebot zum mathematischen Begründen in je zwei aktuellen Mathematikbüchern für die fünfte bzw. achte Klasse. Swiss Journal of Educational Research, 41(3), 647–664.
  • Chan, J. C. Y., & Lam, S.-F. (2010). Effects of different evaluative feedback on students’ self-efficacy in learning. Instructional Science, 38(1), 37–58. doi:10.1007/s11251-008-9077-2
  • Cole, D. A., & Maxwell, S. E. (2003). Testing mediational models with longitudinal data: Questions and tips in the use of structural equation modeling. Journal of Abnormal Psychology, 112(4), 558.
  • Conner, A., Singletary, L. M., Smith, R. C., Wagner, P. A., & Francisco, R. T. (2014). Identifying kinds of reasoning in collective argumentation. Mathematical Thinking and Learning, 16(3), 181–200.
  • Cummins, D. D., Kintsch, W., Reusser, K., & Weimer, R. (1988). The role of understanding in solving word problems. Cognitive Psychology, 20(4), 405–438. doi:10.1016/0010-0285(88)90011-4
  • De Corte, E., Verschaffel, L., & Op't Eynde, P. (2000). Self-regulation: A characteristic and a goal of mathematics education. In M. Boekaerts, P. R. Pintrich, & M. Zeidner (Eds.), Handbook of self-regulation (pp. 687–726). San Diego, CA: Elsevier.
  • Department for Education. (2014). National curriculum in England: mathematics programmes of study https://www.gov.uk/government/publications/national-curriculum-in-england-mathematics-programmes-of-study
  • Dirkx, K., Joosten-ten Brinke, D., Arts, J., & van Diggelen, M. (2019). In-text and rubric-referenced feedback: Differences in focus, level, and function. Active Learning in Higher Education. Advance online publication. doi:10.1177/1469787419855208
  • Durand-Guerrier, V., Boero, P., Douek, N., Epp, S. S., & Tanguay, D. (2011). Argumentation and proof in the mathematics classroom. In G. Hanna & M. de Villiers (Eds.), Proof and proving in mathematics education (pp. 349–367). Dordrecht: Springer.
  • Enders, C. K. (2010). Applied missing data analysis. New York: Guildford.
  • Eriksson, E., Björklund Boistrup, L., & Thornberg, R. (2017). A categorisation of teacher feedback in the classroom: A field study on feedback based on routine classroom assessment in primary school. Research Papers in Education, 32(3), 316–332. doi:10.1080/02671522.2016.1225787
  • Fast, L. A., Lewis, J. L., Bryant, M. J., Bocian, K. A., Cardullo, R. A., Rettig, M., & Hammond, K. A. (2010). Does math self-efficacy mediate the effect of the perceived classroom environment on standardized math test performance? Journal of Educational Psychology, 102(3), 729–740. doi:10.1037/a0018863
  • Fujita, T., Jones, K., & Miyazaki, M. (2018). Learners’ use of domain-specific computer-based feedback to overcome logical circularity in deductive proving in geometry. ZDM, 50(4), 699–713. doi:10.1007/s11858-018-0950-4
  • Fyfe, E. R., & Brown, S. A. (2018). Feedback influences children's reasoning about math equivalence: A meta-analytic review. Thinking & Reasoning, 24(2), 157–178. doi:10.1080/13546783.2017.1359208
  • Fyfe, E. R., Rittle-Johnson, B., & DeCaro, M. S. (2012). The effects of feedback during exploratory mathematics problem solving: Prior knowledge matters. Journal of Educational Psychology, 104(4), 1094–1108. doi:10.1037/a0028389
  • Ginsburg, H. P. (2009). The challenge of formative assessment in mathematics education: Children's minds, teachers’ minds. Human Development, 52, 109–128.
  • Hanna, G. (2014). Mathematical proof, argumentation, and reasoning. In S. Lerman (Ed.), Encyclopedia of mathematics education (pp. 404–408). Dordrecht: Springer Netherlands. doi:10.1007/978-94-007-4978-8_102
  • Hargreaves, E. (2013). Inquiring into children’s experiences of teacher feedback: Reconceptualising assessment for learning. Oxford Review of Education, 39(2), 229–246. doi:10.1080/03054985.2013.787922
  • Hargreaves, E., McCallum, B., & Gipps, C. (2000). Teacher feedback strategies in primary classrooms - new evidence. In S. Askew (Ed.), Feedback for learning (pp. 21–31). London: Routledge.
  • Hattie, J., & Clarke, S. (2019). Visible learning feedback. London: Routledge.
  • Hattie, J., & Gan, M. (2011). Instruction based on feedback. In R. E. Mayer & P. A. Alexander (Eds.), Handbook of research on learning and instruction (pp. 263–285). New York: Routledge.
  • Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112.
  • Hilbert, S., Bruckmaier, G., Binder, K., Krauss, S., & Bühner, M. (2019). Prediction of elementary mathematics grades by cognitive abilities. European Journal of Psychology of Education, 34(3), 665–683. doi:10.1007/s10212-018-0394-9
  • Hox, J. J., Moerbeek, M., & van de Schoot, R. (2018). Multilevel analysis: Techniques and applications. New York: Routledge.
  • Hsieh, F.-J., Horng, W.-S., & Shy, H.-Y. (2011). From exploration to proof production. In G. Hanna & M. de Villiers (Eds.), Proof and proving in mathematics education (pp. 279–303). Dordrecht: Springer.
  • Hughes, G. B. (2010). Formative assessment practices that maximize learning for students at risk. In H. Andrade, & G. Cizek (Eds.), Handbook of formative assessment (pp. 212–232). New York: Routledge.
  • Jeannotte, D., & Kieran, C. (2017). A conceptual model of mathematical reasoning for school mathematics. Educational Studies in Mathematics, 96(1), 1–16.
  • Joët, G., Usher, E. L., & Bressoux, P. (2011). Sources of self-efficacy: An investigation of elementary school students in France. Journal of Educational Psychology, 103(3), 649.
  • Keiser, J. M., & Lambdin, D. V. (1996). The clock is ticking: Time constraint issues in mathematics teaching reform. The Journal of Educational Research, 90(1), 23–31. doi:10.1080/00220671.1996.9944440
  • Kempert, S., Saalbach, H., & Hardy, I. (2011). Cognitive benefits and costs of bilingualism in elementary school students: The case of mathematical word problems. Journal of Educational Psychology, 103(3), 547.
  • Kilpatrick, J., Swafford, J., & Findell, B. (Eds.). (2001). Adding it up: Helping children learn mathematics. Washington, DC: National Academic Press.
  • Knudsen, J., Lara-Meloy, T., Stevens Stallworth, H., & Wise Rutstein, D. (2014). Advice for mathematical argumentation. Mathematics Teaching in the Middle School, 19(8), 494–500. doi:10.5951/mathteacmiddscho.19.8.0494
  • Kroesbergen, E. H., & Van Luit, J. E. (2003). Mathematics interventions for children with special educational needs: A meta-analysis. Remedial and Special Education, 24(2), 97–114.
  • Li, H., Liu, J., Zhang, D., & Liu, H. (2020). Examining the relationships between cognitive activation, self-efficacy, socioeconomic status, and achievement in mathematics: A multi-level analysis. British Journal of Educational Psychology, 91(1), 101–126.
  • Lithner, J. (2000). Mathematical reasoning in task solving. Educational Studies in Mathematics, 41(2), 165–190. http://www.jstor.org/stable/3483188
  • Liu, O. L., Wilson, M., & Paek, I. (2008). A multidimensional Rasch analysis of gender differences in PISA mathematics. Journal of Applied Measurement, 9(1), 18.
  • Lubienski, S. T., Robinson, J. P., Crane, C. C., & Ganley, C. M. (2013). Girls’ and boys’ mathematics achievement, affect, and experiences: Findings from ECLS-K. Journal for Research in Mathematics Education, 44(4), 634–645. doi:10.5951/jresematheduc.44.4.0634
  • Lüken, M. M., Peter-Koop, A., & Kollhoff, S. (2014). Influence of early repeating patterning ability on school mathematics learning. Proceedings of the North American Chapter of the International Group for the Psychology of Mathematics Education.
  • Melhuish, K., Thanheiser, E., & Guyot, L. (2020). Elementary school teachers’ noticing of essential mathematical reasoning forms: Justification and generalization. Journal of Mathematics Teacher Education, 23(1), 35–67. doi:10.1007/s10857-018-9408-4
  • Muthén, B. O. (2010). Bayesian analysis in Mplus: A brief introduction. https://www.statmodel.com/download/IntroBayesVersion%203.pdf
  • Muthén, B. O., & Asparouhov, T. (2012). Bayesian structural equation modeling: A more flexible representation of substantive theory. Psychological Methods, 17(3), 313–335. doi:10.1037/a0026802
  • Muthén, L. K., & Muthén, B. O. (2017). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.
  • National Governors Association Center for Best Practices; Council of Chief State School Officers. (2010). Common core state standards for mathematics. Washington, DC: Authors.
  • Pajares, F., & Graham, L. (1999). Self-efficacy, motivation constructs, and mathematics performance of entering middle school students. Contemporary Educational Psychology, 24(2), 124–139. doi:10.1006/ceps.1998.0991
  • Pepin, B., & Haggarty, L. (2001). Mathematics textbooks and their use in English, French and German classrooms. Zentralblatt für Didaktik der Mathematik, 33(5), 158–175.
  • Perry, T., Davies, P., & Brady, J. (2020). Using video clubs to develop teachers’ thinking and practice in oral feedback and dialogic teaching. Cambridge Journal of Education. Advance online publication. doi:10.1080/0305764X.2020.1752619
  • Pintrich, P. R., & De Groot, E. V. (1990). Motivational and self-regulated learning components of classroom academic performance. Journal of Educational Psychology, 82(1), 33.
  • Powell, S. R., Berry, K. A., & Barnes, M. A. (2020). The role of pre-algebraic reasoning within a word-problem intervention for third-grade students with mathematics difficulty. ZDM, 52(1), 151–163. doi:10.1007/s11858-019-01093-1
  • Rakoczy, K., Pinger, P., Hochweber, J., Klieme, E., Schütze, B., & Besser, M. (2019). Formative assessment in mathematics: Mediated by feedback's perceived usefulness and students’ self-efficacy. Learning and Instruction, 60, 154–165. doi:10.1016/j.learninstruc.2018.01.004
  • Rittle-Johnson, B., & Alibali, M. W. (2001). Developing conceptual understanding and procedural skill in mathematics: An iterative process. Journal of Educational Psychology, 93(2), 346–362.
  • Rittle-Johnson, B., & Schneider, M. (2015). Developing conceptual and procedural knowledge of mathematics. In R. Cohen Kadosh & A. Dowker (Eds.), The Oxford handbook of numerical cognition (pp. 1118–1134). Oxford: Oxford University Press.
  • Rothermel Rawding, M., & Wills, T. (2012). Discourse: Simple moves that work. Mathematics Teaching in the Middle School, 18(1), 46–51. doi:10.5951/mathteacmiddscho.18.1.0046
  • Rubin, D. B. (2004). Multiple imputation for nonresponse in surveys. Hoboken, NJ: John Wiley & Sons.
  • Ruiz-Primo, M. A., & Li, M. (2013). Examining formative feedback in the classroom context: New research perspectives. In J. H. McMillan (Ed.), SAGE Handbook of research on classroom assessment (pp. 215–232). Thousand Oaks, CA: SAGE Publications, Inc. doi:10.4135/9781452218649
  • Schoenfeld, A. H. (1985). Mathematical problem solving. Orlando, FL: Academic Press.
  • Schoenfeld, A. H. (2012). Problematizing the didactic triangle. ZDM, 44(5), 587–599.
  • Schoenfeld, A. H. (2015). Summative and formative assessments in mathematics supporting the goals of the Common Core Standards. Theory Into Practice, 54(3), 183–194. doi:10.1080/00405841.2015.1044346
  • Schunk, D. H. (1995). Self-efficacy and education and instruction. In J. E. Maddux (Ed.), Self-efficacy, adaptation, and adjustment: Theory, research, and application (pp. 281–303). Boston, MA: Springer US. doi:10.1007/978-1-4419-6868-5_10
  • Schunk, D. H., & Pajares, F. (2005). Competence perceptions and academic functioning. In A. J. Elliot & C. S. Dweck (Eds.), Handbook of competence and motivation (pp. 85–104). New York: Guilford Press.
  • Schütze, B., Rakoczy, K., Klieme, E., Besser, M., & Leiss, D. (2017). Training effects on teachers’ feedback practice: The mediating function of feedback knowledge and the moderating role of self-efficacy. ZDM, 49(3), 475–489. doi:10.1007/s11858-017-0855-7
  • Semadeni, Z. (1984). Action proofs in primary mathematics teaching and in teacher training. For the Learning of Mathematics, 4(1), 32–34. http://www.jstor.org/stable/40247842
  • Sfard, A. (2001). There is more to discourse than meets the ears: Looking at thinking as communicating to learn more about mathematical learning. Educational Studies in Mathematics, 46(1/3), 13–57. http://www.jstor.org/stable/3483239
  • Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153–189.
  • Smit, R. (2009). Die formative Beurteilung und ihr Nutzen für die Entwicklung von Lernkompetenz [Formative assessment and its benefits for the development of learning competence]. Baltmannsweiler: Schneider Verlag Hohengehren.
  • Smit, R., Bachmann, P., Blum, V., Birri, T., & Hess, K. (2017). Effects of a rubric for mathematical reasoning on teaching and learning in primary school. Instructional Science, 45(5), 603–622. doi:10.1007/s11251-017-9416-2
  • Smit, R., & Birri, T. (2014). Assuring the quality of standards-oriented classroom assessment with rubrics for complex competencies. Studies in Educational Evaluation, 43, 5–13. doi:10.1016/j.stueduc.2014.02.002
  • Smit, R., Hess, K., Bachmann, P., Blum, V., & Birri, T. (2019). What happens after the intervention? Results from teacher professional development in employing mathematical reasoning tasks and a supporting rubric. Frontiers in Education, 3. doi:10.3389/feduc.2018.00113
  • Stanat, P., & Christensen, G. (2006). Where immigrant students succeed: A comparative review of performance and engagement in PISA 2003. Paris: OECD.
  • Stein, M. K., Grover, B. W., & Henningsen, M. (1996). Building student capacity for mathematical thinking and reasoning: An analysis of mathematical tasks used in reform classrooms. American Educational Research Journal, 33(2), 455–488.
  • Stylianides, G. J. (2008). An analytic framework of reasoning-and-proving. For the Learning of Mathematics, 28(1), 9–16.
  • Stylianides, G. J. (2014). Textbook analyses on reasoning-and-proving: Significance and methodological challenges. International Journal of Educational Research, 64(0), 63–70. doi:10.1016/j.ijer.2014.01.002
  • Swanson, H. L. (2014). Does cognitive strategy training on word problems compensate for working memory capacity in children with math difficulties? Journal of Educational Psychology, 106(3), 831–848. doi:10.1037/a0035838
  • Swiss Conference of Cantonal Ministers of Education (EDK). (2011a). Basic competences in mathematics. National standards. Retrieved October 23, 2013, from http://edudoc.ch/record/96784/files/grundkomp_math_d.pdf
  • Swiss Conference of Cantonal Ministers of Education (EDK). (2011b). National standards for compulsory education: Basic competences for four subjects. Retrieved January 10, 2012, from https://www.edk.ch/dyn/12930.php
  • Tait-McCutcheon, S. (2008). Self-efficacy in mathematics: Affective, cognitive, and conative domains of functioning. Proceedings of the 31st Annual Conference of the Mathematics Education Research Group of Australasia.
  • Tanner, H., & Jones, S. (2003). Self-efficacy in mathematics and students’ use of self-regulated learning strategies during assessment events. International Group for the Psychology of Mathematics Education, 4, 275–282.
  • Toulmin, S. E. (2003). The uses of argument. Cambridge: Cambridge University Press.
  • Turner, S. L. (2014). Creating an assessment-centered classroom: Five essential assessment strategies to support middle grades student learning and achievement. Middle School Journal, 45(5), 3–16.
  • Urdan, T., & Turner, J. C. (2007). Competence motivation in the classroom. In A. J. Elliot & C. S. Dweck (Eds.), Handbook of competence and motivation (pp. 297–317). New York: The Guilford Press.
  • van de Schoot, R., & Depaoli, S. (2014). Bayesian analyses: Where to start and what to report. The European Health Psychologist, 16(2), 75–84.
  • van der Scheer, E. A., Bijlsma, H. J. E., & Glas, C. A. W. (2019). Validity and reliability of student perceptions of teaching quality in primary education. School Effectiveness and School Improvement, 30(1), 30–50. doi:10.1080/09243453.2018.1539015
  • Verschaffel, L., Schukajlow, S., Star, J., & Van Dooren, W. (2020). Word problems in mathematics education: A survey. ZDM, 52(1), 1–16. doi:10.1007/s11858-020-01130-4
  • Viholainen, A. (2011). The view of mathematics and argumentation. Proceedings of the 7th Congress of the European Society for Research in Mathematics Education, 9–13 February, University of Rzeszów, Poland.
  • Von Davier, M., Gonzalez, E., & Mislevy, R. (2009). What are plausible values and why are they useful? IERI Monograph Series, 2, 9–36.
  • Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
  • Webb, D. C. (2004). Discourse based assessment in the mathematics classroom: A study of teachers’ instructionally embedded assessment practices. In T. A. Romberg (Ed.), Standards-based mathematics assessment in middle school (pp. 169–187). New York: Teachers College Press.
  • Whitenack, J., & Yackel, E. (2002). Making mathematical arguments in the primary grades: The importance of explaining and justifying ideas. Teaching Children Mathematics, 8(9), 524–528.
  • Wolbers, K. A., Dostal, H. M., Skerrit, P., & Stephenson, B. (2017). The impact of three years of professional development on knowledge and implementation. The Journal of Educational Research, 110(1), 61–71. doi:10.1080/00220671.2015.1039112
  • Wyndhamn, J., & Säljö, R. (1997). Word problems and mathematical reasoning—A study of children's mastery of reference and meaning in textual realities. Learning and Instruction, 7(4), 361–382. doi:10.1016/S0959-4752(97)00009-1

Appendix

Scale “Self-efficacy”

Scale “Formative Feedback”