
Improving procedural and conceptual mathematics outcomes: evidence from a randomised controlled trial in Kenya

Pages 404-422 | Received 10 Oct 2014, Accepted 29 Jan 2016, Published online: 17 Mar 2016

ABSTRACT

To improve learning outcomes, an intervention in Kenya called the Primary Math and Reading (PRIMR) Initiative provided pupil learning materials, teachers’ guides and modest teacher professional development in mathematics. This paper presents the causal impact of PRIMR’s mathematics intervention on pupil achievement indices for procedural and conceptual numeracy, using a differences-in-differences analytic strategy. The mathematics intervention produced modest, statistically significant results: generally similar effects for males and females, a larger impact in grade 2 than in grade 1, a larger impact in nongovernment schools than in public schools, and smaller effects in mathematics than in English or Kiswahili. These findings have relevant policy implications in Kenya given an impending national mathematics programme.

1. Introduction

1.1 Purpose of the research

In its Vision 2030, a plan to ensure that Kenya becomes a middle-income country, the Kenyan government emphasises the quality of mathematics education as a key tool to support innovation (Government of Kenya 2007). Numerous other sub-Saharan African countries – such as South Africa, Tanzania and Uganda – have made efforts to bring mathematics education to the fore (Fleisch and Schöer 2014; UNESCO 2014; Uwezo 2013). Mathematics has received increased attention from bilateral and multilateral donors focused on improving education outcomes (GIZ and BMZ 2012; GPE 2012b; JICA 2006; UNESCO 2012, 2014; UNESCO and CASTME 2001).

These countries and organisations recognise the importance of mathematics, science and technology for individual and national economic growth (Hanushek and Woessmann 2007; UNESCO and CASTME 2001; Wagner et al. 2001) and the fact that developing skills in these areas requires solid mathematics teaching in the early grades (Duncan et al. 2007; GPE 2012a; UNESCO 2012).

Kenya has placed increased emphasis on improving primary mathematics outcomes. Efforts include the Centre for Mathematics, Science and Technology Education in Africa (CEMASTEA) programme, funded by JICA, and the Reading to Learn programme, implemented by the Aga Khan Foundation and funded by the United States Agency for International Development (USAID)/Kenya and the William and Flora Hewlett Foundation.

Based on the disappointing findings from a 2009 Early Grade Mathematics Assessment (EGMA) in Malindi district (Reubens and Kline 2009), Kenya’s Ministry of Education, Science and Technology (MoEST) requested USAID support in improving the quality of instruction in mathematics, as well as in literacy in the languages of English and Kiswahili. The Primary Math and Reading (PRIMR) Initiative was implemented in seven counties and 1384 schools from 2011 to 2014 (funded by USAID) and 2012 to 2015 (funded by the UK Department for International Development [DFID]). PRIMR supplied pupils with books and teachers with training, teachers’ guides and instructional support. PRIMR was organised as a randomised controlled trial to better measure programme effects.

In the section that follows, we present background on the design of several relevant early grade mathematics improvement programmes, and the limited research on whether these programmes are effective in developing countries.

1.2 Background on mathematics

1.2.1 Mathematics approaches

There is growing agreement that the way in which sub-Saharan African countries have traditionally taught mathematics is insufficient. Mathematics has often been taught in a disconnected manner, with students encouraged to memorise standard algorithms (Fuson, Kalchman, and Bransford 2005; Nag et al. 2014; UNESCO 2012), an approach that often fails to capitalise on the real-world nature of mathematics (Nag et al. 2014; UNESCO 2012; UNESCO and CASTME 2001).

In contrast, current research on mathematics teaching and learning holds that it is critical for pupils to build a deep conceptual understanding of mathematics alongside procedural fluency, which should be taught using a variety of techniques (Clements and Sarama 2009; Commonwealth of Australia 2008; Dahl Soendergaard and Cachaper 2011; Fuson, Kalchman, and Bransford 2005). The US National Academy of Sciences (Donovan and Bransford 2005; Kilpatrick, Swafford, and Findell 2001) defined the three main areas of mathematical proficiency as ‘a foundation of factual knowledge (procedural fluency), tied to a conceptual framework (conceptual understanding) and organised in a way to facilitate retrieval and problem solving (strategic competence)’ (Donovan and Bransford 2005, 218). This basic framework of mathematical proficiency is echoed across several contexts (Commonwealth of Australia 2008; Dahl Soendergaard and Cachaper 2011; Fuson, Kalchman, and Bransford 2005; UNESCO 2012).

1.2.2 Building mathematical proficiency

Programmes that support mathematics education in the primary grades must build proficiency in conceptual understanding, procedural fluency and strategic competence (Kilpatrick, Swafford, and Findell 2001). Having a conceptual understanding of what an operation represents allows students to transfer those skills practically (Fuson, Kalchman, and Bransford 2005; Nunes et al. 2012; United States Department of Education 2008). This requires the development of number sense, which refers to students’ understanding of numbers, their magnitude, relationships between numbers (Ansari 2012; Dahl Soendergaard and Cachaper 2011) and working with patterns (Clements and Sarama 2009; Fuson, Kalchman, and Bransford 2005).

Procedural fluency is defined as the ‘skill in carrying out procedures flexibly, accurately, efficiently, and appropriately’ (Kilpatrick, Swafford, and Findell 2001, 5). Schools in Kenya traditionally have created this fluency through repetition and drill (UNESCO 2012), but procedural fluency is more effectively taught by helping students to know a variety of strategies (Dahl Soendergaard and Cachaper 2011; Fuson, Kalchman, and Bransford 2005).

Strategic competence allows a student to identify the most productive way to solve a particular problem (Kilpatrick, Swafford, and Findell 2001), which is essential for higher-level mathematics (NCTM 2000; van de Walle, Karp, and Bay-Williams 2010). This includes the ability to understand how the four basic operations are connected, as well as to look for patterns when counting, doing spatial reasoning or manipulating geometric shapes (Nunes et al. 2012). Students should be encouraged to communicate their mathematical thinking (Commonwealth of Australia 2008; Dahl Soendergaard and Cachaper 2011; Fuson, Kalchman, and Bransford 2005). By vocalising their thinking, students identify productive approaches (Griffin 2005). This helps students view mathematics as something that can be understood, not as a set of complex rules (Fuson, Kalchman, and Bransford 2005; Griffin 2005).

1.2.3 Challenges in building mathematical proficiency

Questions remain about how best to improve mathematics instruction, particularly in developing-country contexts, where shifts towards newer mathematics programme designs have taken place recently (UNESCO 2012, 2014). One of the greatest challenges to overcome relates to teachers’ prior experience with and competence in mathematics. Teachers in Kenya and similar contexts have placed a heavy emphasis on basic computational formulas (Akyeampong et al. 2013; UNESCO 2012). In Kenya, the eligibility criteria for mathematics teachers are low, with only a grade of ‘D’ in the Kenya Certificate of Secondary Education (KCSE) required for entry to the Primary Teacher Training Colleges (PTTCs) (Bunyi et al. 2013; Cherotich, Njagi, and Jepkemei 2014). This suggests that the majority of primary school teachers might not be proficient in basic mathematics, and that even fewer may have a deeper understanding of primary mathematics (Schwille and Dembélé 2007; UNESCO 2012). In the nonformal school subsector in Kenya (see Note 1), many teachers have somewhat lower formal school credentials (Ong’ele and Piper 2014). Teachers who lack a strong conceptual understanding themselves may struggle to help children develop a rich understanding of mathematics (Akyeampong et al. 2013; Schwille and Dembélé 2007; UNESCO 2012).

Challenges to implementing effective early mathematics programmes are amplified in developing-country contexts. Supporting pupils’ mathematics understanding and complex mathematical thinking demands continuous assessment (Barmby et al. 2007; GIZ and BMZ 2012; UNESCO 2012). Simple tests to measure how well students can calculate do not give teachers sufficient knowledge of students’ growth (Barmby et al. 2007; GIZ and BMZ 2012; UNESCO 2012). Such impediments are magnified in large classrooms, where teachers have less time to pay attention to students’ mistakes (IMU 2009). Ensuring that students have manipulatives that help them develop a strong number sense is more difficult in resource-poor environments (Ralaingita 2008).

1.2.4 Maths interventions in developing countries

Reviews of programmes in the USA (Slavin and Lake 2007) and South Africa (Fleisch and Schöer 2014) indicate that a focus on instructional change is most likely to have positive outcomes (Slavin and Lake 2007; Staub and Stern 2002). Research in sub-Saharan Africa suggests that to support teacher learning, modelling and teacher coaching are necessary (Schwille and Dembélé 2007; UNESCO 2012; UNESCO and CASTME 2001). Coaching can be particularly effective if each coach is given a manageable number of teachers to support, and if coaches receive financial stipends and training (Craig, Kraft, and du Plessis 1998; Piper and Zuilkowski 2015).

There are few programmes that have improved early grade mathematics outcomes in developing-country contexts, particularly in sub-Saharan Africa (Nag et al. 2014; UNESCO 2012). Banerjee et al. (2007) reported on two maths interventions in India, one focused on providing one-on-one remedial instruction for struggling students, and the other a computer-assisted learning programme. They reported effect sizes of 0.29 standard deviations (SD) and 0.47 SD for the first and second programmes, respectively. A recent supplemental mathematics intervention in Jordan improved mathematics outcomes on five of the six intervention areas (RTI International 2014b). The East African Quality in Early Learning (EAQEL) mathematics programme, undertaken in Uganda and Kenya between 2009 and 2011, supported teachers to improve the quality of their literacy and numeracy instructional methods in grades 1 to 3. The evaluation, which used a differences-in-differences (DID) identification strategy, showed no effect of EAQEL on mathematics outcomes in either Uganda or Kenya. Ho and Thukral (2009) reviewed the results of interactive radio instruction (IRI) programmes in mathematics. Older IRI maths programmes in Thailand, Honduras, Nicaragua and Bolivia resulted in positive gains, with effect sizes between 0.24 SD (in Bangkok) and 1.32 SD (in Nicaragua). More recent (2003–2007) IRI programmes in Sudan and Zambia had effect sizes ranging from 0.28 SD (Sudan) up to 0.54 SD (Zambia).

A number of other programmes are taking place or recently began in sub-Saharan Africa, such as the Numeracy Boost programmes currently being implemented in Malawi and Ethiopia; and USAID-funded interventions in the Democratic Republic of the Congo (Bulat et al. 2012), Liberia (DeStefano, Slade, and Korda 2013), Rwanda (Education Development Center, Inc., n.d.) and South Africa (Fleisch and Schöer 2014; Vale et al. 2010). However, some of these programmes have reported design issues that have reduced the validity of preliminary results, such as sampling (Bulat et al. 2012), delayed intervention (DeStefano, Slade, and Korda 2013), or problems with equating instruments (Fleisch and Schöer 2014). Therefore, the knowledge base for improving early primary mathematics education in sub-Saharan Africa remains limited.

2. Materials and methods

2.1 Research questions

An examination of prior studies revealed a knowledge gap regarding the impact of early mathematics interventions in developing countries. Few such studies have used rigorous designs organised to estimate the causal impact of numeracy programmes. The research design of the PRIMR Initiative provided an opportunity to measure the impact of a structured numeracy programme on student outcomes in Kenya. Our research questions were: (1) Did PRIMR have an impact on mathematics outcomes, and if so, did the effect differ by gender? (2) Did the impact of PRIMR differ by grade or between public and nonformal school settings? (3) Did any impact of PRIMR on mathematics outcomes differ from the PRIMR impact on literacy outcomes?

2.2 Mathematics in PRIMR

2.2.1 PRIMR numeracy programme design

As stated in the introduction, PRIMR was a USAID- and DFID-funded MoEST programme designed to improve the quality of literacy and numeracy outcomes in grades 1 and 2. It included interventions in literacy and numeracy, in both public and low-cost private (nonformal) schools. Recent research has evaluated the impact of PRIMR on literacy outcomes, but this paper contains the first evaluation of PRIMR’s impact on numeracy. PRIMR’s numeracy programme was implemented in 182 public schools in 4 counties and 229 nonformal schools in Nairobi County by October 2013, when the endline evaluation was carried out. The programme was built on the Kenyan curriculum, which had many of the elements necessary for basic primary outcomes (Kenya Institute of Education 2002, see Note 2; NCTM 2014). The PRIMR classroom materials were sequenced with a focus on ensuring efficiency, providing opportunities to reinforce concepts, helping students see the connections between concepts, and teaching multiple strategies.

The PRIMR mathematics programme was designed by technical specialists from the MoEST and PRIMR, following the Kenya Institute of Curriculum Development (KICD) curriculum. The lesson plans began with important skills – that is, new concepts and strategies to be introduced, or skills taught earlier to be reviewed – and concluded with exercises for students to demonstrate what they had learned. Daily lessons followed a consistent sequence: the teacher modelled a concept, the teacher guided the pupils in practising the concept, pupils practised the concept independently and the teacher provided formative feedback. Students wrote in a PRIMR workbook as they did their classwork or homework. The Appendix includes sample pages from the student workbook and teachers’ guides, both from Week 2 Day 3 of the grade 2 materials.

The Kenyan government had a group of instructional supervisors – Teachers’ Advisory Centre (TAC) tutors – to support teachers in public schools. The TAC tutors work for the government-run Teachers’ Service Commission and are supervised by the staffing officer at the county level, but in reality, the TAC tutors often serve various roles under sub-county education officers. Prior research showed that TAC tutors were spending much more time on administrative tasks than on instructional support (Piper and Zuilkowski 2015). In PRIMR schools, therefore, the roles of the participating TAC tutors were specified as (1) the primary trainers for teachers and (2) instructional supervisors, providing feedback to teachers. The TAC tutors – and their private sector equivalents, instructional coaches who were recruited to serve PRIMR nonformal schools – were trained in these responsibilities. The TAC tutors and coaches were issued classroom observation instruments to use during classroom visits, to take note of instructional quality and areas of success and improvement. After every visit, the TAC tutors and coaches offered individual feedback to teachers about the quality of their instruction, and indicated areas in which the teachers could improve. The observation tool evaluated how well the teachers used the teachers’ guide; how successfully they implemented PRIMR’s instructional techniques; and how and whether pupils were learning, based on continuous assessment measures from a random subsample of pupils.

Teacher trainings were undertaken in short blocks before each of three terms (Kenya’s school year is January–December). The PRIMR training for teachers was 10 days each year, with 3 of those 10 days for mathematics. As indicated by the samples in the Appendix, PRIMR teachers were given structured teachers’ guides which related directly to each page of the pupil workbooks and helped pupils and teachers use mathematics vocabulary. The teachers’ guides included a list of suggested learning resources for each lesson; weekly tests to measure learners’ progress; and a range of daily activities for the teacher, many of which the pupils were then expected to practise in their workbook. The teachers’ guides were relatively scripted, providing teachers with suggested language to use. Pupils practised by using their workbooks, and teachers graded the exercises to give consistent feedback.

2.2.2 PRIMR numeracy implementation

Implementation of the numeracy component of the PRIMR intervention began in July 2012, six months after the literacy programme began (see Note 3). Due to the term break in August 2012, a nationwide strike by public school teachers in September 2012, and midterm data collection in October 2012, PRIMR teachers had less than one month to implement the PRIMR mathematics programme prior to the midterm evaluation. Even with that limited time frame, outcomes from the October 2012 midterm assessment showed statistically significant gains on some measures (Piper and Mugenda 2013; see Note 4). However, the PRIMR team was unconvinced that the gains were due to the impact of those few weeks of maths implementation.

PRIMR schools continued implementing the mathematics programme in January 2013, the beginning of the academic year. The intervention included TAC tutor and coach training, teacher training, materials provision and instructional support. The programme team led one day of mathematics instructional training per term for the tutors and coaches. In January, the training focused on introducing the teachers to the lesson plans and helping them understand how to undertake weekly assessments. The teachers also received technical introductions on teaching key areas of mathematics. The May mathematics training concentrated on preparing for mathematics lessons, and using accompanying teaching and learning materials, including number flashcards. The September training covered specific strategies that had proved difficult for teachers in the previous sessions.

2.3 Research design

The PRIMR Initiative was designed as a randomised controlled trial using cross-sectional data analysis. In the PRIMR counties, zones were randomly selected and assigned to treatment and control groups. In the nonformal school subsector, the PRIMR team clustered schools into groups of 10 and 15 schools and randomly assigned these clusters to treatment and control groups. PRIMR had three cohorts. The first cohort received the intervention in 2012; the second cohort received the intervention in 2013; and the final cohort served as the control group throughout the period of this analysis, receiving the intervention only in 2014. Each cohort contained both public and low-cost private schools.

PRIMR deployed a two-stage randomised cluster design. Given that public schools are supervised by TAC tutors, each of whom supports a set of between 12 and 30 schools, the assignment to treatment had to consider that clustering. For PRIMR, the first stage used the clustered schools at the zonal level, while the second stage clustered pupils in the selected schools. Standard errors were recalculated to account for the intracluster correlation at both levels, using the svy commands in Stata. Additionally, bias in the pupil score estimates as a result of underrepresentation of subpopulations was reduced through post-stratification weights added at the pupil level. The overall probability of pupil selection was the product of the two-stage selection, and the final weights were calculated as the inverse of the probability of overall pupil selection. Figure 1 presents the research design.

Figure 1. Implementation of PRIMR intervention and timing of EGMA administrations.

Source: Piper and Mugenda (2014, p. 16).
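As a concrete illustration of the weighting described above, the following minimal sketch (not the study’s code) computes a pupil’s final weight as the inverse of the overall two-stage selection probability; the probabilities used are hypothetical.

    def pupil_weight(p_stage1: float, p_stage2: float) -> float:
        """Inverse-probability weight for a pupil chosen in a two-stage design.

        p_stage1: probability that the pupil's zone or cluster of schools was selected.
        p_stage2: probability that the pupil was selected within the sampled school.
        """
        p_overall = p_stage1 * p_stage2   # overall probability of pupil selection
        return 1.0 / p_overall            # final weight = inverse of that probability

    # Hypothetical example: zone sampled with probability 0.25 and pupil sampled
    # with probability 0.10 within the school, giving a weight of 40.
    print(pupil_weight(0.25, 0.10))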


Below we provide more information about the mathematics portion of PRIMR.

2.3.1 Site

The PRIMR mathematics programme was implemented alongside the literacy programme in a randomly selected set of peri-urban and rural zones within Nairobi, Kiambu and Nakuru counties. These three counties were selected by the MoEST and USAID in part because of violence that occurred after the 2007 presidential election. These counties are relatively wealthy compared to the average county in Kenya.

As discussed above, PRIMR also was implemented within a randomly selected set of low-cost private schools in Nairobi’s informal settlements, otherwise known as slums. These informal settlements provide homes for a large percentage of Nairobi’s population – by some measures, more than half (UN-Habitat 2013). Research differs on whether the low-cost private schools provide higher quality educational outcomes than the public schools with which they compete. For example, these nonformal schools have fared worse on measures of materials and teacher training (Bray 1999), but have reported more frequent monthly classroom observations of teachers (Piper and Mugenda 2013). Oketch et al. (2012) argued that learning outcomes were higher in these schools, and many parents reportedly view the quality of outcomes in nonformal schools as higher. Previous research on PRIMR showed very modest differences between student characteristics in public and nonformal schools (Piper, Zuilkowski, and Mugenda 2014). For example, maternal literacy was 92.3 per cent at the baseline in public control schools and 93.1 per cent in nonformal control schools. Preschool attendance was also similar, with 93.7 per cent attendance rates in public control schools and 97.7 per cent attendance in nonformal control schools (Piper, Zuilkowski, and Mugenda 2014) (see Note 5).

2.3.2 Sample

PRIMR supported 310 schools in implementing the mathematics programme by the end of the USAID experiment in October 2013, with 101 schools serving as control schools. These control schools received the PRIMR treatment after the endline assessment had been completed in October 2013. The treatment schools were clustered into 15 zones (public schools) and 16 clusters (nonformal schools). Table 1 summarises the numbers of schools and pupils in the sample.

Table 1. PRIMR school and pupil sample.

During the baseline and midterm data collection rounds, schools were randomly selected from the clusters or zones of schools, whereas for the endline evaluation, the schools selected were the same as at the baseline. At the school level, assessors were trained to select the pupils using simple random sampling. Pupil selections at baseline, midterm and endline were done independently. Specifically, that meant that all of the students queued up by grade – boys in one queue and girls in another – and then the children were randomly selected using a sampling interval from the student population.
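A minimal sketch of this kind of interval-based (systematic) selection from a queue of pupils is shown below; the queue contents and target sample size are hypothetical, not drawn from the study.

    import random

    def systematic_sample(queue: list, sample_size: int) -> list:
        """Select pupils from a queue using a random start and a fixed sampling interval."""
        interval = len(queue) / sample_size              # sampling interval
        start = random.uniform(0, interval)              # random starting point
        positions = [int(start + i * interval) for i in range(sample_size)]
        return [queue[p] for p in positions]

    # Hypothetical queue of 43 grade 1 girls, from which 5 are selected for assessment.
    girls_queue = [f"pupil_{i}" for i in range(1, 44)]
    print(systematic_sample(girls_queue, 5))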

2.3.3 Measures

The use of the EGMA to measure pupils’ mathematics outcomes began in Kenya in 2009, in Malindi district (Reubens and Kline 2009; RTI International 2014a). An EGMA adaptation workshop, organised to localise and validate the EGMA for the Kenyan context and language, was held in October 2011 with officers from KICD and the Kenya National Examinations Council (KNEC), as well as several MoEST officers. Other participants included university faculty, mathematics consultants and teachers. The participants used the KICD mathematics curricula for grades 1 and 2 to prepare the tools. For training and to uncover any floor or ceiling effects, participants conducted a pilot administration of the EGMA and other tools with grade 1 and 2 pupils in several schools (see Note 6).

The EGMA assessed a set of critical numeracy skills in grades 1 and 2 orally and one-on-one. Table 2 presents the eight mathematics subtasks, which were assessed using either Kiswahili or English, depending on which of the two languages the pupil was most comfortable with. The table also indicates how each measure was defined and whether it was timed (see Note 7). For purposes of this analysis, the three procedural variables (correct number identification per minute, correct addition problems per minute and correct subtraction problems per minute) were combined into a procedural index. The five variables that required conceptual understanding to answer correctly (quantitative comparison problems, missing number problems, addition problems – level 2, subtraction problems – level 2 and word problems) were combined into a conceptual index. The grouped variables were examined using correlation and principal components analysis to ensure that they were highly associated and suitable for combining into an index. The procedural scores had no correlation coefficient less than 0.59 and the conceptual scores none less than 0.42 (partially due to some of the scores being discrete, rather than continuous). The scores were then standardised and combined, with each variable given equal weight. Finally, for presentation purposes, each index was shifted by its minimum value so that each index had a minimum of zero. These indices are useful tools for tracking changes in pupil performance across multiple evaluations. In order to limit the risk of generating spurious results, we primarily present the results of PRIMR’s effect on the procedural index and the conceptual index.

Table 2. PRIMR mathematics subtasks and indices.
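As an illustration of the index construction described above, the sketch below standardises each subtask score, combines the scores with equal weights and shifts the result so that its minimum is zero; the column names are hypothetical placeholders rather than the study’s variable names.

    import pandas as pd

    def build_index(df: pd.DataFrame, columns: list[str]) -> pd.Series:
        """Standardise each score, average with equal weights, then shift so the minimum is zero."""
        z = (df[columns] - df[columns].mean()) / df[columns].std()   # z-score each subtask
        index = z.mean(axis=1)                                       # equal-weight combination
        return index - index.min()                                   # lowest value becomes zero

    # Hypothetical subtask column names, for illustration only.
    procedural_cols = ["number_id_cpm", "addition_l1_cpm", "subtraction_l1_cpm"]
    conceptual_cols = ["quantity_discrimination", "missing_number",
                       "addition_l2", "subtraction_l2", "word_problems"]

    # df = pd.read_csv("egma_pupil_scores.csv")   # pupil-level EGMA scores
    # df["procedural_index"] = build_index(df, procedural_cols)
    # df["conceptual_index"] = build_index(df, conceptual_cols)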

2.3.4 Data collection

During the baseline assessment in January 2012, the PRIMR assessors (see Note 8) used paper forms, and in October 2013, the PRIMR assessors used electronic data collection tools. For the baseline, 68 assessors were trained for five days in early January 2012. For the endline assessment, 52 assessors received five days of training in September 2013. Interrater reliability scores were 96 per cent at baseline and 93 per cent at endline.

2.4 Data analysis: differences in differences

Whether the PRIMR endline data analyses should use (1) a simple comparison of mean scores at the endline, relying on the randomised selection and assignment, or (2) DID analysis was an important methodological decision. The analysis of the PRIMR literacy programme at the midterm used the DID estimate because of the differences in literacy outcomes at the baseline. We undertook a balance test on the January 2012 baseline to determine whether there were statistically significant differences between treatment groups in the mathematics outcomes and indices. For the procedural index, we found statistically insignificant effect size differences of 0.10 SD between treatment groups in grade 1 (p-value .11) and in grade 2 (p-value .15). For the conceptual index, the results showed a statistically insignificant 0.11 SD difference in grade 1 (p-value .09) and a statistically significant 0.19 SD difference in grade 2 (p-value <.001).

For the measures themselves, the baseline data analyses showed that treatment schools had statistically significantly different results from control schools on several measures, as shown in Table 3. Specifically, treatment schools outperformed control schools by 2.1 numbers identified per minute (p-value <.001), 3.5 percentage points on quantity discrimination (p-value <.001), 1.9 percentage points on missing numbers (p-value <.001), 2.0 percentage points on addition level 2 (p-value .04) and 1.3 percentage points on subtraction level 2 (p-value .01). Both the procedural index (p-value .02) and the conceptual index (p-value <.001) showed results that were somewhat higher for treatment than control. The effect sizes for those differences were 0.10 SD for procedural and 0.15 SD for conceptual. This level of difference requires an adjustment to satisfy baseline equivalence expectations (Institute of Education Sciences, United States Department of Education 2014).

Table 3. Baseline comparisons on mathematics outcomes (standard errors in parentheses).

The unbalanced baseline sample means that utilising the randomised selection of treatment and control to estimate outcomes has the potential to bias the results. Therefore, drawing on an identification strategy used in previous PRIMR research in literacy, we were able to take advantage of PRIMR’s design and attempt to account for these baseline differences by fitting a DID model. Because the results showed that the control variables (gender and student wealth; see Note 9) were statistically significantly correlated with the majority of the mathematics variables, we included these controls in the models discussed below.

DID models compare changes in a programme’s outcome variables at different assessment points for treatment and control groups by removing the secular trend (that is, the change in outcome for the control groups over time). This allows the analysts to separate programme impact from changes in the population not due to programme impact (Murnane and Willett 2011). We focused on the changes in learning outcomes between the PRIMR baseline in January 2012 and the PRIMR endline in October 2013, thereby both adhering to the principle of parsimony and sidestepping the limited duration of the PRIMR maths intervention before the midterm analysis in October 2012.

We fit the DID model to a data set that contained four groups of students, differentiated by whether they attended schools randomly assigned to treatment or control groups and by assessment round: January 2012 (baseline) or October 2013 (endline). To answer the research questions in this paper, we fit the following statistical model:
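The displayed equation did not survive extraction from the published version; a standard DID specification consistent with the surrounding description – a treatment indicator, an endline indicator, their interaction and the gender and wealth controls – would take a form such as:

    y_{ijt} = \beta_0 + \beta_1\,\mathrm{Treat}_j + \beta_2\,\mathrm{Endline}_t + \beta_3\,(\mathrm{Treat}_j \times \mathrm{Endline}_t) + \beta_4\,\mathrm{Female}_{ij} + \beta_5\,\mathrm{Wealth}_{ij} + \varepsilon_{ijt}

where y_{ijt} is the index score for pupil i in school j at assessment round t, and \beta_3 is the DID estimate of the treatment effect.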

The treatment effect for the cross-sectional analysis is the DID mean. The effect size measures the magnitude or strength of the treatment effect. The effect size used for the PRIMR study was Cohen’s d (Cohen 1988), which is the DID effect divided by the pooled standard deviation, thus:
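The displayed formula likewise did not survive extraction; in the notation above it corresponds to:

    d = \frac{\hat{\beta}_3}{\mathrm{SD}_{\mathrm{pooled}}}

where \hat{\beta}_3 is the DID estimate and \mathrm{SD}_{\mathrm{pooled}} is the pooled standard deviation of the outcome.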

Using the svy commands in Stata, we fit the statistical model so as to account for the nested structure of pupils within schools and to use standard errors that reflected that nesting.
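Purely as an illustration of this kind of analysis, a weighted DID regression with cluster-robust standard errors can be sketched in Python; the study itself used Stata’s svy commands, and the variable names below are hypothetical stand-ins rather than the study’s variables.

    import pandas as pd
    import statsmodels.formula.api as smf

    def fit_did(df: pd.DataFrame):
        """Weighted DID regression; the treat:endline interaction is the DID term.
        Columns (score, treat, endline, female, wealth, pupil_weight, cluster_id)
        are hypothetical names, not the study's variables."""
        model = smf.wls(
            "score ~ treat * endline + female + wealth",
            data=df,
            weights=df["pupil_weight"],
        )
        # Cluster-robust standard errors at the zone/cluster level, analogous to
        # accounting for intracluster correlation in the survey setup.
        return model.fit(cov_type="cluster", cov_kwds={"groups": df["cluster_id"]})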

3. Results

Research Question 1: Did PRIMR have an impact on mathematics outcomes? If yes, did the effect differ by gender?

In order to answer the research question examining whether the PRIMR maths programme improved maths outcomes, we present results from several DID models, with particular attention to the DID estimator. In Table 4, we present the overall causal effect from the DID estimator, along with statistical significance and effect sizes with 95 per cent confidence intervals. This table shows that PRIMR increased outcomes on the procedural and conceptual indices for grade 2 and the procedural index for grade 1. The procedural index showed a causal effect of 0.20 SD in grade 1 (p-value .03) and 0.37 SD in grade 2 (p-value <.001). PRIMR had no statistically significant effect on the conceptual index in grade 1 (0.16 SD, p-value .15) but improved the conceptual index by 0.33 SD in grade 2 (p-value <.01).

Table 4. DID estimates of PRIMR treatment effects on outcome measures, by grade, gender and school type. Treatment effect standard errors and effect size 95 per cent confidence intervals (CIs) in parentheses.

The results for males and females were consistent across measures, as the next columns in Table 4 show. Positive effects were found for both genders in grade 2 in both the procedural and conceptual indices. No effect was found for either gender in the conceptual index for grade 1. The only difference by gender was on the procedural index in grade 1, where there was a 0.26 SD effect for males (p-value .02) but a statistically insignificant 0.13 SD effect for females (p-value .27).

Research Question 2: Did the impact of PRIMR on mathematics outcomes differ by grade and by public and nonformal school setting?

In order to answer the research question about whether the impact of PRIMR on maths differed by grade and public or nonformal school, we fit DID models specific to each grade and to whether the school was formal or nonformal. The results in Table 4 above showed that PRIMR had an impact on the procedural index in both grades 1 and 2 in nonformal schools, with effect sizes nearing 0.7 SD, but no statistically significant impact on those outcomes in the public schools. For the conceptual index, PRIMR had no impact on either public or nonformal outcomes in grade 1, but improved the learning outcomes for both public and nonformal schools in grade 2, with effect sizes over 0.3 SD. Thus, PRIMR increased procedural outcomes in grades 1 and 2 in nonformal schools only, and conceptual outcomes in grade 2 but not grade 1, for both public and nonformal schools.

Research Question 3: Did the impact of PRIMR on mathematics outcomes differ from the impact of PRIMR on literacy outcomes?

In order to answer the research question regarding whether the impact of PRIMR on mathematics was different from the impact on literacy outcomes, we generated Figure 2. This figure presents the effect size of the impact of PRIMR on the procedural and conceptual indices compared with the average effect of PRIMR on literacy outcomes in English and Kiswahili (Piper, Jepkemei, and Kibukho 2015). These findings showed that the impact of PRIMR on mathematics was somewhat smaller than its effects on either English or Kiswahili.

Figure 2. PRIMR effect on maths and literacy.


4. Discussion

In this section we discuss the findings for the Kenyan context. The findings showed that PRIMR had an impact on mathematics outcomes in Kenya, although the effect size differed by index. The DID model results yielded a somewhat more conservative effect than did the simple endline comparison between treatment and control. This increased precision supports the decision to fit a DID design even within an RCT (Murnane and Willett 2011).

We found that PRIMR had a statistically significant effect on the procedural components of numeracy in both grades 1 and 2, with a somewhat larger effect in grade 2. The conceptual index showed an effect in grade 2, but not in grade 1. It makes sense intuitively that the procedural effects of PRIMR were easier to see quickly, and that longer intervention periods may have been required to observe an effect on the more complex conceptual elements of mathematics. Evidence from PRIMR’s monitoring data suggested a few reasons for this finding. First, only 24.8 per cent of lesson observations made by TAC tutors were in mathematics rather than literacy. This suggests less emphasis on maths by the TAC tutors and coaches, and potentially by the PRIMR intervention itself. With this more limited emphasis, tutors and coaches might have had difficulty mastering the approach well enough to ensure their teachers’ success in instructional improvement in the higher-order mathematics included in the conceptual index. Second, anecdotal evidence showed that teachers were not very adept at asking the ‘why’ and ‘how’ questions that the PRIMR teachers’ guides suggested. Pupils were therefore not given enough opportunities to expand their thinking in ways that would improve their outcomes on the conceptual subtasks. Finally, the lessons in mathematics took longer than those in either English or Kiswahili, indicating either limited content mastery by teachers, which made the more complex activities in the PRIMR teachers’ guides difficult to implement, or an overambitious estimate by the PRIMR technical experts of what could be taught in Kenyan mathematics classrooms.

We examined whether there were heterogeneous effects by gender. The results showed consistency on the impact of PRIMR across the measures except in grade 1 on the procedural index, where PRIMR showed an impact on males but not on females. The magnitude of the difference was small and the authors were unable to find examples in the materials or the training guides that would have systematically differentiated number identification outcomes by gender.

We investigated whether the effects of PRIMR on maths differed by public and nonformal school site and grade. Previous research had shown that the first-year impact of PRIMR on literacy outcomes did differ by site and grade (Piper, Zuilkowski, and Mugenda 2014) and by poverty levels (Piper, Jepkemei, and Kibukho 2015).

The results for mathematics in 2013 showed a pattern similar to what was identified in 2012 in literacy. PRIMR had a statistically significant impact on outcomes in nonformal schools in both grade 1 and grade 2 on the procedural index, and in grade 2 on the conceptual index. Similarly, there was no effect in the public schools on the conceptual index in grade 1. Unlike in nonformal schools, the public schools showed no effect in either grade 1 or 2 on the procedural index. A more detailed subtask-by-subtask analysis of PRIMR’s impact on mathematics in public schools showed some effects in grade 2 on most of the individual measures (Piper and Mugenda 2014). The larger impact on nonformal schools could have been due to the relatively lower level of teacher training and preparation prior to PRIMR implementation, which would mean that the modest training offered under PRIMR had a stronger effect there. As a rule, teachers in nonformal schools are less experienced and have not participated in as wide a range of teacher preparation courses as their counterparts in public schools. These findings suggest that in the PRIMR schools, it was easier to improve the mathematics teaching of inexperienced and untrained teachers. This argument fits well with the data collected by the TAC tutors and coaches in PRIMR public and nonformal schools, as the public schools covered fewer lessons and TAC tutors observed fewer classrooms than did the nonformal coaches. Given that grade 2 pupils in public schools showed evidence of a PRIMR impact on the conceptual index, but grade 1 pupils in public schools did not, it may be that the impact of PRIMR required more time to become evident in pupil outcomes.

PRIMR results showed that the impact on the procedural index (0.29 SD) and the conceptual index (0.24 SD) was smaller than the impacts on reading in either English (0.46 SD) or Kiswahili (0.35 SD). The effect size in the USAID-funded programme described in this paper was somewhat smaller than that identified in the DFID-funded study of PRIMR implemented in two different counties in a separate research design (Piper and Oyanga 2014; RTI International 2015). The literature differs on whether numeracy or literacy outcomes are more sensitive to interventions focused on training and instructional support, with the balance of articles suggesting that mathematics is more sensitive to initial teacher training programmes in sub-Saharan Africa (Piper 2009). It might not be solely a subject difference that explains the differential impacts in maths, Kiswahili and English, however. It must be reiterated that the PRIMR maths programme began in July 2012 rather than at the very start of the school year in January 2012, such that some of the difference in impact by subject could have been due to the more limited duration of implementation in maths. Similarly, PRIMR’s 10-day training programme for classroom teachers included only three days of mathematics training per year, and the mathematics training typically was held on the last day of the trainings held during each of the three terms. This time allocation might have implied a smaller emphasis on mathematics to the programme implementers, and therefore, the effect might be smaller as a result of programme implementation decisions rather than subject differences. As discussed above, teachers received fewer observations from TAC tutors in mathematics than in literacy, and this also could explain the somewhat smaller impact than in literacy.

5. Conclusions

The moderate impact of PRIMR on mathematics outcomes has implications for policy in Kenya. The MoEST’s 2015–2017 Global Partnership for Education grant request to cover several areas of educational quality improvement was recently approved, including a US$40 million investment in a numeracy programme. The GPE Project Appraisal Document draft noted that the MoEST planned to scale up the PRIMR mathematics programme design and implementation efforts. This included the PRIMR materials, which were updated again and improved after KICD review and revision. It also included the instructional supervision strategies that PRIMR had used for both the literacy programme and the mathematics programme described here. Previous research showed that the PRIMR literacy intervention was cost-effective (Piper, Zuilkowski, and Mugenda 2014). PRIMR’s mathematics impact, combined with the low per-pupil cost of the mathematics intervention, showed that significant gains can be made on learning outcomes with relatively modest financial investments.

Few studies using rigorous research designs have shown a positive impact of instructional improvement strategies on mathematics outcomes in the developing world (see Note 10). This research fills an important gap in the literature, as the PRIMR mathematics model has yielded some promising initial results. The design of PRIMR was such that the classroom support structures for public schools used the existing Kenyan personnel funded by the Government of Kenya, and the ability of the TAC tutors to implement the programme contributed to the decision to increase the number of TAC tutors and to focus a greater percentage of their attention and time on instructional support. These promising findings suggest more research should be undertaken to better understand how and whether instructional interventions can have an impact on pupils’ conceptual learning, and whether gains in those indicators require more time to register, trailing impacts on procedural outcomes.

Acknowledgements

The authors acknowledge the Kenyan Ministry of Education, Science and Technology, including the PRIMR Programme Development and Implementation Team and the Tusome National Technical Team housed at the MoEST and its members from across the Ministry and other organisations. The USAID/Kenya education team, specifically Dr Dwaine Lee, Dr Christine Pagen and Dr Theresiah Gathenya, designed a research study worthy of analysis. From a technical perspective, leadership and support from Jessica Mejia, Joseph DeStefano and Melinda Taylor were instrumental in this research and the PRIMR Initiative.

Disclosure statement

The lead author served as the Chief of Party of the PRIMR Initiative and the co-authors worked with RTI International, the implementer of PRIMR.

Additional information

Funding

This research was made possible by a Professional Development Award from RTI International and the President of RTI, Dr Wayne Holden.

Notes

1. Nonformal schools in Kenya are also sometimes called low-cost private schools or complementary schools. They are most often found in urban slum settings. By contrast, public schools are owned and funded by the government.

2. The Kenya Institute of Education later was reorganised and renamed the KICD.

3. The PRIMR literacy programme was designed as a bilingual programme that specified creating interactions between the two languages of Kiswahili and English. This meant that new letters were taught first in Kiswahili prior to English, and that ideas and concepts were first taught in Kiswahili prior to English. Differences between reading and maths: There was no workbook for reading and the students primarily used the books in class – without writing in them – to practise their decoding and fluency skills. Similarities: Both programmes had one page of content per day, the teachers used a teachers’ guide that related to the student book, and the daily lessons consisted of several mini lessons that would reinforce skills across a variety of areas.

4. The multi-subject design of PRIMR was logical in that pupils do not learn subjects in isolation, but it made it more difficult to establish whether the impacts of PRIMR identified in this paper were due to the mathematics programme itself or due to the symbiotic relationship between the literacy and numeracy improvement. There were several ways in which the literacy programme could have been responsible for at least some of the PRIMR mathematics gains. An example is the modest impacts on mathematics achievement in October 2012 despite less than six weeks’ implementation (Piper and Mugenda 2013). In addition, key instructional methods from the direct instruction literature were taught across the three subjects. On the other hand, in some ways the PRIMR maths programme was relatively discrete from the literacy programme and might be responsible for most of the PRIMR maths effect. An example is the provision to pupils of the maths activity book but not a parallel reading/writing book. Moreover, especially in terms of the direct instructional techniques, the maths programme had elements that were similar to the literacy programme, and there is no evidence to suggest that the directionality of the relationship favoured numeracy, as it could have gone in either direction (or both).

5. More detailed comparisons between public and nonformal schools can be found in Table 2 on page 13 in Piper, Zuilkowski, and Mugenda (2014).

6. Two other instruments were administered alongside the EGMA: The Early Grade Reading Assessment (EGRA), to measure pupil performance in English and Kiswahili; and the Snapshot of School Management Effectiveness (SSME), consisting of structured interviews and classroom observations. Also note that to avoid test leakage and its ramifications, PRIMR re-randomised the order of items for the midterm and endline assessments, as well as introducing pre-equated, slightly different content for some subtasks.

7. The timed procedural variables measured the number of correct items per minute.

8. PRIMR used assessors who had been engaged in collecting data in Kenyan schools using the EGMA since 2009.

9. The student wealth variable was a composite wealth index simply derived as the sum of the number of household items the student reported to have. The items included a radio, a telephone or mobile phone, a television, a refrigerator, a toilet, a bicycle, a motorcycle and a vehicle.

10. A limitation of the study was that the results presented here include the outcomes at the end of the 2013 academic year, as measured by the endline assessment. The PRIMR mathematics team undertook significant revisions to the maths programme for the start of the academic year in January 2014, when the last (control) cohort was introduced to the programme and all participating schools received a final round of training and updated materials. Changes to the classroom materials included simplifying the daily content, redesigning the PRIMR maths teachers’ guide to remove much of the scripting, and using full colour in the pupil books. No funding was in place to carry out a ‘post-endline’ evaluation of the effects of these modifications on pupil performance.

References

Appendix. Sample Pages from PRIMR Pupil Activity Book and Teachers’ Guide, Grade 2 Mathematics