Journal of School Choice
International Research and Reform
Volume 16, 2022 - Issue 1

Standardized Test Proficiency in Public Montessori Schools


ABSTRACT

Although Montessori is the most common unconventional education model, no multi-state study has compared standardized test proficiency of Montessori schools with districts. Here we report on this for the 10 states/regions with the most public Montessori schools (n = 195). In 3rd grade, Montessori schools were less proficient in math but more proficient in ELA. In 8th grade they were also more proficient on ELA and showed a trend to greater proficiency in math. Black, Hispanic, and economically disadvantaged students at Montessori schools were more proficient on ELA tests, and performed better or similarly on math tests, at both grade levels. Achievement gaps were generally smaller. Difference in percent proficient in 8th grade controlling for 3rd grade was consistently greater at Montessori schools than in districts. Potential reasons for the different performance of Montessori schools are discussed.

For better or for worse, standardized testing has taken on enormous importance in American schools. Although there are arguments that standardized testing has changed schooling for the worse (Au, 2013; Darling-Hammond, 2007; Haberman, 2010; Heissel, Adam, Doleac, Figlio, & Meer, 2019; Jones, 2007; Zhao, 2016, 2017), higher scores do predict better life outcomes, like higher earnings (Goldhaber & Özek, 2019; Hanushek, 2019). Furthermore, careful research suggests this may be causal: Having a teacher who raises test scores leads to better life outcomes for children (Chetty, Friedman, & Rockoff, 2014a, 2014b). These effects are small (for example, a one standard deviation improvement in teacher efficacy in a single grade raises age-28 earnings by 1.3%), but small effects can compound into meaningful differences. This does not necessarily mean that standardized tests have been a net positive, but such effects are used as a justification for them.

Researchers have also studied whether particular school programs, like “no-excuses” charter (Cheng, Hitt, Kisida, & Mills, 2017) or Waldorf (Larrison, Daly, & VanVooren, 2012) programs, produce or are at least associated with higher test scores. Waldorf is an example of a pedagogically progressive school program. Progressive schools use curricula and methods that individualize most instruction, often using project-based learning, and de-emphasize testing and grades (Tyack & Cuban, 1995, pp. 194–8). They take a constructivist stance, in that children are viewed as constructing their own understandings (Brooks & Brooks, 2001).

The most common and long-lasting progressive school model is Montessori (Debs, 2019), yet it has been studied very little, and the results (summarized below) have been mixed. There are both private and public Montessori schools in the United States; private ones charge tuition and are not required to administer state tests, whereas public ones are funded through taxes and must administer those tests and post results publicly. Therefore, we focus on public Montessori schools. Although they represent fewer than a fifth of all US Montessori schools, there are currently over 500 public Montessori schools in the United States (Debs, 2016c; National Center for Montessori in the Public Sector [NCMPS], 2014). A 2016 study estimated that 59% of these public Montessori schools were magnet or district schools, and the others were charter schools, with an average school size of 315 students (Debs, 2016c). Because public Montessori schools serve well over 100,000 children in 42 states,¹ and the number of public Montessori schools has more than doubled in recent years (National Center for Montessori in the Public Sector [NCMPS], 2014), it is important to know how they perform, and proficiency on state tests is one metric. Below we describe Montessori education for unfamiliar readers, consider theoretical and empirical reasons why Montessori might or might not be associated with higher test scores, and then present the study. First, we acknowledge two weaknesses of this study.

The first weakness of this study is our data type. Obtaining district-level permission to view individual child test scores is difficult and costly, even within a state (Jacob et al., 2014). Privacy concerns mean many districts will not share them. Researchers have accessed individual child scores, but typically only for single districts or school types. As a first step, the approach taken here was to use publicly available aggregated data. Such data are one way to gauge school performance (Jacob et al., 2014). Here we use percent proficient metrics, which are highly problematic in some circumstances since states can change cutoff scores from year to year (Ho, 2008; Holland, 2002); those circumstances do not apply here because we compared schools with their districts, for which benchmarks are the same. The advantages of using percent proficient are that it allows cross-state comparisons (since many states do not report scaled scores) and has the same range across topics, grades, and districts. Thus, we record the percent of students proficient in math and English Language Arts (ELA, or its equivalent such as reading) for 3rd grade, the lowest grade when tests are mandated, and for 8th grade, the highest grade when successive annual tests are mandated. We did not use scores from high school because very few public Montessori schools extend past middle school. Although states set their own benchmarks, Montessori schools are compared with all non-Montessori schools in their surrounding districts, controlling for demographics. This is not a perfect approach; it provides one angle on the issue. An attendant strength is that it allows us to examine performance at almost 200 Montessori schools across many states.

Second, selection effects are a weakness of this study, since most public Montessori schools are choice schools. Montessori schools tend to have slightly wealthier and Whiter populations because those families seek out Montessori (Debs, 2019). Although these obvious features can be (and are, in this study) controlled for, other hidden characteristics like (perhaps) family values that lead one to choose Montessori might be responsible for any effects found. In other words, if Montessori students do differently on standardized tests, it might be because the family does or does not value some characteristic, like academic achievement, and this value causes children’s scores. We do not know any ways in which Montessori parents systematically differ on unobserved variables like this. Two studies gave Montessori and conventional school parents many different questionnaires (personality, values, parenting styles, etc.) to reveal such differences but failed to find any (Denervaud, Knebel, Immordino-Yang, & Hagmann, 2020; Dreyer & Rigler, 1969). Random lottery studies are the only sure way to control for unobserved differences, but to our knowledge there are only three random lottery studies of Montessori education. Two used high fidelity public Montessori programs, and both found better results in preschool; the one that also tested 6th graders found that Montessori students did significantly better only on a writing task and social measures, but not on several Woodcock Johnson tests including Applied (Math) Problems and Letter Word (Lillard & Else-Quest, 2006; Lillard et al., 2017). The most recent lottery study used French public preschool classrooms that lacked many Montessori materials (particularly math ones) and where the teachers had no formal Montessori training; under these circumstances, Montessori students had superior results in literacy only (Courtier et al., 2021).
Overall, lottery studies suggest Montessori schooling can cause better performance at earlier ages, but only one lottery control study has included older children. It had only a few positive results. Although lottery studies are ideal to address selection effects, they are rarely feasible, and other studies can still be of interest.

Our purpose is to see if the percentages of children meeting their state’s standard for proficiency differ at Montessori schools because if they do differ, it should encourage further research examining change in individuals’ scores over time (as Culclasure, Fleming, & Riga, 2018 did for South Carolina public Montessori programs) and/or encourage using scaled scores from the subset of states which report them. Although the present study cannot provide evidence that schools cause changes in test scores, it could lay the groundwork to see if a clearer causal design is worth implementing.

Montessori schooling

Montessori education is a complex, deeply thought-out system developed in the early decades of the last century by an Italian physician and her collaborators, based on close observation of children and their responses to curated materials and an evolving set of environmental conditions (Lillard & McHugh, 2019a, 2019b; Montessori, 2017). The children Montessori initially observed were atypically developing; the next group consisted of lower income students; and eventually Montessori worked with children of all social classes in Europe, America, and Asia, including 9 years in India during and after World War II. Her biology background engendered a systems view in which development is the unfurling of new capacities, adapting organisms to the environment at multiple levels. Hence, the Montessori educational environment is carefully constructed to respond to developmental needs at each stage of development, resulting in classrooms for children under age 3 and for each subsequent 3-year age span. The teacher’s role is not to tell information so much as to connect children to materials in the environment, which are self-correcting (meaning students can recognize their mistakes themselves, without teacher correction) and enlighten children through their repeated use. Montessori education leaves children free to choose their own activities, reasoning that an unperturbed organism will be in touch with its own developmental needs. Free choice comes with the caveat that one makes constructive choices, which (in Montessori theory) an undisrupted organism will. Furthermore, the freedom a child has in Montessori exists in the context of a tight structure; there are particular ways to use the materials, for example, and limitations on what choices a child can make, but a child is free to make any of the permitted choices. There are no grades or gold stars as Montessori believed these would perturb development; instead, learning is its own reward.
There are also no explicit tests; the goal of Montessori education is human development, not acquisition of information that is amenable to testing. However, despite this different goal, public Montessori schools must administer standardized tests annually in accordance with federal law, making them sites for the current study. For further insight on the Montessori approach, readers are directed to videos at Montessori-guide.org, a logic model (Culclasure, Daoust, Cote, & Zoll, 2019), and books by Maria Montessori.

Why Montessori might differ in levels of test score proficiency

There are both theoretical and empirical reasons to expect public Montessori schools to have different levels of proficiency on the standardized tests that all public schools must administer. Theoretically, Montessori has several features (Culclasure et al., 2019) that have been shown in empirical work to improve academic and social-emotional outcomes; having better social-emotional outcomes has, in turn, been associated with better academic outcomes (Blair & Raver, 2015). For example, Lillard (2017) explained how Montessori involves embodied cognition (learning occurs through acting on materials), self-determination and free choice, learning topics of personal interest, cultivating one’s executive function, contextualized learning, ample peer engagement, a highly organized environment, and warm, loving adults who tightly structure children’s experiences, all features with strong evidential support for improved outcomes (see also Darling-Hammond, Flook, Cook-Harvey, Barron, & Osher, 2020; National Academies of Sciences Engineering and Medicine [NASEM], 2018).

On the other hand, Montessori is sometimes criticized for being too free, like discovery learning programs, which are typically not successful (Alfieri, Brooks, Aldrich, & Tenenbaum, 2011; Chien et al., 2010; Klahr & Nigam, 2004; Mayer, 2004; Stockard, Wood, Coughlin, & Rasplica Khoury, 2018; Winsler & Carlton, 2010). Montessori has also been criticized for a lack of pretend play, sometimes considered crucial for development (Elkind, 1983; but see Lillard et al., 2013; Lillard & Taggart, 2019). Some research shows negative effects of mixed-age classrooms (Checchi & De Paola, 2018), which could suggest Montessori schools would do less well on performance measures. Further, frequent testing is associated with better learning (Yang, Luo, Vadillo, Yu, & Shanks, 2021), and Montessori minimizes testing. For all these reasons, one might expect children at Montessori schools to perform worse on standardized tests.

Moving to empirical reasons, one might expect better performance because some empirical studies have shown better outcomes for children in public Montessori schools. The most stringent are the lottery control studies already reviewed, which showed better or similar outcomes, but only one went beyond kindergarten and it showed positive results on social and writing measures but not math and reading. However, the sample was small and limited to one Montessori school. In addition, the lottery studies used outcomes selected by researchers (like the Woodcock–Johnson test), not mandated state exams, and researcher-selected tests are typically associated with larger effects (Kraft, 2020).

Other studies used state exams but did not have lottery waitlist controls; they had mixed results. One study examined a Montessori school in “a large urban district in western New York” state (Lopata, Wallace, & Finn, 2005, p. 8). Because there was only one public Montessori school fitting that description at the time, clearly it was the public Montessori school in Buffalo, whose website (viewed by the last author in 2006) indicated it was using non-Montessori practices like homework, grades, and specials. Using standardized scores for math and ELA, researchers found worse results in ELA for Montessori in 8th grade but no difference in math, nor in either subject in 4th grade (Lopata et al., 2005). Another study compared more than 500 students in two public Montessori schools to a similar number of students in other demographically comparable public schools in the same district (Mallett & Schroeder, 2015). No significant differences were found in reading or math in first, second, or third grades, suggesting the samples were equal at the outset. Montessori students scored significantly higher in both reading and math in fourth and fifth grades. As these studies are limited to one or two Montessori schools, they may reflect more about the schools themselves than Montessori pedagogy.

In one of the only long-range studies, Dohrmann, Nishida, Gartner, Lipsky, and Grimm (2007) examined high school students from a single district (Milwaukee Public Schools) who had attended one of two Montessori schools through 5th grade. The Montessori alumni performed better in math/science than their high school classmates but similarly in English/social studies.

The largest study (Culclasure et al., 2018) of Montessori pedagogy’s effect on standardized test scores involved over 7,000 children in the state of South Carolina. The study examined third- through eighth-grade standardized test scores across three school years for Montessori programs in 45 public schools, including charter and traditional public schools; most were Title-I schools. One set of analyses simply compared the scores to those of all other public school students in the state, controlling for gender, race, ESL status, special education, and free/reduced lunch status. In a separate set of analyses, Montessori students were matched with other public school students on those demographics and their prior year’s scores in each subject. In both sets of analyses, Montessori students performed significantly better on ELA in every year; in math, they performed better in the matched analyses on two of the three years. In fact, across 28 different analyses, 16 were significantly different, and 15 of these favored Montessori. In 9 of 14 sets of analyses, contrary to what one might expect, larger effect sizes were obtained when samples were matched than when state scores were used for the comparison (Culclasure et al., 2018). This study also found compellingly better performance for Black and lower income students attending Montessori schools.

Although this last study and the lottery studies control (at least to a degree) for selection effects, the present study does not, and (as stated earlier) self-selection is another theoretical reason why Montessori could be associated with higher test scores. Montessori is very rarely the default option; families must select into it. Although two studies (Denervaud et al., 2020; Dreyer & Rigler, 1969) seeking specific characteristics of Montessori-selecting families failed to find any, we expect there are some. If unobserved variables associated with selecting Montessori schools lead to higher test scores, then one would expect to see Montessori students score more highly at all ages. For example, families that read more might be particularly drawn to Montessori, and the home reading might lead to higher literacy proficiency at Montessori schools at both third and eighth grade; alternatively, families that engage in more home math activities might be drawn to Montessori, leading to higher levels of math proficiency at both levels. The aforementioned lottery studies (Lillard & Else-Quest, 2006; Lillard et al., 2017; for literacy only, Courtier et al., 2021) and the well-controlled Culclasure et al. (2018) study suggest that the curriculum itself causes higher scores in both domains, but the present study design cannot support causal claims.

Student body composition differences in Montessori and non-Montessori schools

Black, Hispanic, and lower income children in the United States often do less well on academic achievement measures than White children and middle-income children (Ansari & Winsler, 2014; Carnevale, Fasules, Quinn, & Peltier Campbell, 2019; Duncan, Magnuson, Murnane, & Votruba-Drzal, 2019). While the reasons for this are complex and widely believed to stem from opportunity gaps (Duncan & Murnane, 2014), the goal to eliminate them is incontestable, which raises the question of whether different subgroups perform differently in Montessori. There are theoretical reasons why they might. First, as mentioned, in Montessori education much of the teaching is done by materials, not people. Teacher expectations are an important determinant of children’s progress in conventional settings (Good, Sterzinger, & Lavigne, 2018), and reducing the role of the teacher in conveying learning might particularly help children who are often recipients of bias (Dee & Gershenson, 2017). Second, Montessori practices “looping” in which children and teachers stay together for 3 years. This has been shown to be helpful to all children, with some of the largest gains being made by what the study described as “minority” children (Hill & Jones, 2018). Over half of children in public Montessori schools are students of color (Debs, 2016c), and some studies suggest that such children might in fact do particularly well in Montessori settings. In addition to the Culclasure et al. (2018) study mentioned above, Brown and Lewis (2017) compared eight years (2006–2014) of third-grade reading and math scores from African American students at three Montessori magnet and conventional magnet schools matched for demographics. Montessori students outperformed their district peers in reading.
No significant differences were found between the Montessori and conventional schools for math; on average, both the Montessori and the control schools (which included a STEM school) performed .57 standard deviations above the district mean. Another study suggested that less racial disproportionality in discipline exists in public Montessori schools (Brown & Steele, 2015). Since discipline typically means suspension from class, this suggests Black children miss proportionately less class time in Montessori, which might assist learning. These results suggest that Montessori might be beneficial for Black students in particular. Ansari and Winsler (2014, 2020) found that while a single year of a Montessori preschool was associated with higher gains than a single year of HighScope² for Hispanic students in kindergarten and 3rd grade, for Black students the two programs were equally beneficial. In sum, although in theory Montessori could better support children of color, results for Black students are mixed, and more data on whether they do better in Montessori is important to gather. For Hispanic children, there are very few published results, and getting more information is warranted.

In addition to different racial groups, how economically disadvantaged children fare in Montessori schools is of interest. All three of the lottery studies involved economically disadvantaged children. In one longitudinal lottery study cited above, the academic scores of the lower income half of the Montessori sample approached those of higher income children by the end of kindergarten (Lillard et al., 2017). The samples in both of the other lottery studies included significant proportions of lower income children and both showed positive results from Montessori in reading (Courtier et al., 2021); one showed positive results in math and social-emotional development as well (Lillard & Else-Quest, 2006). Based on the empirical literature and theory, we hypothesized that economically disadvantaged, Black, and Hispanic children would do relatively well in Montessori, and also show smaller gaps with higher income and White children, respectively.

Summary

Overall, empirical studies are inconsistent regarding Montessori school performance on standardized tests, and there are theoretical reasons to think many different patterns of results are possible. It is important to extend analysis beyond single schools and districts to multiple states. We did so using internet-published state databases from US states with many public Montessori schools. Because consecutive federally mandated tests begin in 3rd grade and end in 8th grade, and our resources were limited, we focused on 3rd and 8th grade scores in the 10 states/regions with the most public Montessori schools. Because results at small schools can vary widely across consecutive years and some public Montessori schools are small, we collected them for the most recent three pre-Covid years (2017–19). We examined both overall proficiency and the difference in proficiency levels at 8th grade when accounting for proficiency in 3rd grade, as well as data disaggregated for three subgroups: Black, Hispanic, and economically disadvantaged. Achievement gaps in percents proficient were examined as well.

Method

School selection

First, we identified the 10 states with the most public Montessori schools by consulting the Montessori Census, an online database of US Montessori programs (National Center for Montessori in the Public Sector [NCMPS], 2019). Results were filtered to include only public schools with 3rd and/or 8th grade Montessori as whole school programs (some otherwise conventional schools include isolated Montessori classrooms). A count of the number of schools meeting criteria for all 50 states and Washington, D.C. revealed that California, South Carolina, Michigan, Colorado, Texas, Florida, Arizona, Minnesota, Wisconsin, and the metropolitan Washington, D.C. region (including Maryland and Virginia public Montessori schools) had the most Montessori schools. Although fairly complete, the Montessori Census does not include every public Montessori school. We then supplemented it using the records of the 2012-13 Whole-School Montessori and the American Public Montessori Historical data sets for those states (Debs, 2016a, 2016b). This resulted in 229 schools whose websites were checked to ensure they were whole school Montessori programs at 3rd and/or 8th grade; this eliminated 33 schools, and a 34th was excluded because it closed during data collection. The final sample was 195 Montessori schools.

Next, each school’s district was identified by searching the web for the school and for publicly available state databases. Comparisons were made between Montessori schools and their districts rather than with specific non-Montessori schools because we did not have adequate information to allow us to accurately pair individual schools, nor is it clear that doing so would be desirable given that individual schools can offer different types of programs; comparison with the district average gives a comparison of Montessori with all district non-Montessori schools. If a Montessori school was the only school in its district (as for some charter schools), the geographically closest district was used. Because some districts have more than one Montessori school, the comparison set included 119 districts. Data were collected for three school years (2016–2017, 2017–2018, and 2018–2019) from publicly available school report card databases (Arizona Department of Education [ADE], 2019; California Assessment of Student Performance and Progress [CASPP], 2019; Colorado Department of Education [CDE], 2019; Florida Department of Education [FDE], 2019; Maryland State Department of Education [MSDE], 2019; Michigan School Data [MSD], 2019; Minnesota Department of Education [MDE], 2019; National Center for Education Statistics [NCES], 2019; Office of the State Superintendent of Education: DC [OSSE], 2019; South Carolina Department of Education [SCDE], 2020; Texas Education Agency [TEA], 2020; Virginia Department of Education [VDE], 2019; Wisconsin Department of Public Instruction [WDPI], 2020). District data were statistically adjusted to remove the Montessori contribution by using the following equation:

$$M_b = \frac{M_c - p \times M_a}{1 - p}$$

where $M_b$ was the percent of students proficient in the district not enrolled in Montessori, $M_c$ was the percent of students proficient in the school district (including Montessori students), $M_a$ was the percent of students proficient in the Montessori school, and $p$ was the share of the district's students enrolled in the Montessori school, expressed as a proportion. When there was more than one Montessori school, the equation was expanded upon as follows:

$$M_b = \frac{M_c - p_1 \times M_{a1} - p_2 \times M_{a2}}{1 - p_1 - p_2}$$
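
The adjustment follows from treating the district figure as an enrollment-weighted average, $M_c = p \times M_a + (1 - p) \times M_b$, and solving for $M_b$. A minimal Python sketch of that calculation is given here for illustration only. The function name `remove_montessori` and the example percentages are hypothetical and are not drawn from the study's data, and enrollment shares are assumed to be proportions between 0 and 1.

```python
def remove_montessori(district_pct, montessori_schools):
    """Percent proficient among district students NOT enrolled in Montessori.

    district_pct: percent proficient for the whole district (M_c).
    montessori_schools: list of (share, pct) pairs, one per Montessori school,
        where share is the proportion of district students enrolled in that
        school (p_k) and pct is its percent proficient (M_ak).
    """
    total_share = sum(share for share, _ in montessori_schools)
    if not 0 <= total_share < 1:
        raise ValueError("Montessori enrollment shares must sum to less than 1")
    # Weighted Montessori contribution to the district figure: sum of p_k * M_ak
    montessori_part = sum(share * pct for share, pct in montessori_schools)
    return (district_pct - montessori_part) / (1 - total_share)


# Hypothetical district: 50% proficient overall, one Montessori school that
# enrolls 10% of the district's students and is 60% proficient.
one_school = remove_montessori(50.0, [(0.10, 60.0)])
print(round(one_school, 2))  # 48.89

# Same district with a second, smaller Montessori school (5% share, 40% proficient).
two_schools = remove_montessori(50.0, [(0.10, 60.0), (0.05, 40.0)])
print(round(two_schools, 2))  # 49.41
```

Passing a list of schools covers both the single-school and the multi-school form of the equation, since each additional school subtracts one more term from the numerator and the denominator.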

Demographics

Ninety-nine Montessori schools had both 3rd and 8th grades, 92 included just 3rd, and 4 included just 8th grade. For each school and district that reported the information, the percentages of students by race and socioeconomic status were recorded. For analysis, the percentage of students historically affected by the racial opportunity gap was calculated. This group included American Indian/Alaskan Native, Black, Hispanic, and Hawaiian/Pacific Islander students. Multiple race students were included in this group as well. Researchers have documented the opportunity gaps between each of these groups of students as compared to White students, even when socio-economic status is the same (Carnevale et al., 2019; Pang, Han, & Pang, 2011; Powers, 2005). Asian students were not included in this group because although Asian children do face racial discrimination in the United States, research has shown that Asian children perform as well as or better than their White peers on measures of academic achievement (Pang et al., 2011; Powers, 2005). On average, in the Montessori schools 45.75% of the students were historically affected by the racial opportunity gap, compared to 49.78% for districts, t(937) = 2.30, p = .02, d = .15. The percentage of economically disadvantaged students (as determined by each state) also differed between the Montessori schools (M = 38.94%) and districts (M = 47.09%; t[858.35] = 5.42, p < .001, d = .36). Because these differences are themselves associated with levels of proficiency in state tests, they were accounted for in analyses.
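
The fractional degrees of freedom in the second comparison (858.35) are the kind of value an unequal-variance (Welch) t-test produces. The text does not name the test used, so that reading is an assumption. The Python sketch below shows how a Welch statistic, its Welch–Satterthwaite degrees of freedom, and a pooled-standard-deviation Cohen's d could be computed. The function names and the tiny input lists are hypothetical and are not the study's data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each group (sample variance / n)
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of the two groups."""
    pooled_var = (
        (len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)
    ) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Made-up percentages of economically disadvantaged students.
montessori = [30.0, 42.0, 35.0, 48.0, 39.0]
districts = [44.0, 51.0, 47.0, 55.0, 40.0, 49.0]

t_stat, dof = welch_t(montessori, districts)
print(f"t = {t_stat:.2f}, df = {dof:.2f}, d = {cohens_d(montessori, districts):.2f}")
```

When the two groups have equal sizes and equal variances, the Welch degrees of freedom reduce to the familiar n1 + n2 − 2; otherwise they are generally non-integer.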

Academic performance

Academic performance was operationalized as the percentage of students proficient on ELA and math state-standardized tests in 3rd and 8th grades. Although each state sets its own proficiency benchmarks, states also use different tests and change them periodically, so percent proficient is a metric that enables comparisons when individual-level standardized scores are not available. As noted, district proficiency levels were recalculated to remove Montessori school performance.

Treatment of missing data

Data are withheld when a group is small, leaving some missing data. Where other Montessori schools in the district reported data, values were imputed with regression to create an estimated score. These imputed scores were used only to correct the district data by removing Montessori student performance; they were not included as Montessori school scores in analyses. If all the Montessori schools in the same district were missing a datapoint, the district’s matching datapoint was eliminated.

Analysis

Except where noted, all data were analyzed using repeated measures analyses of covariance (ANCOVAs) in SPSS or R, covarying percentages of students historically affected by the racial opportunity gap and of economically disadvantaged students. These were followed up with Tukey’s post-hoc comparisons, and effect sizes were calculated with adjusted means. The basic ANCOVA equation was:

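$$y_i = b_0 + b_1 \times M_i + b_2 \times \mathrm{Race}_i + b_3 \times \mathrm{Income}_i + \varepsilon_i,$$

where $y_i$ represents the observed reading/math score for the $i$th school at grade 3 or grade 8, $M_i$ represents whether the $i$th school is a Montessori school (1: yes; 0: no), $\mathrm{Race}_i$ and $\mathrm{Income}_i$ are the two covariates (the percentages of students historically affected by the racial opportunity gap and of economically disadvantaged students), and $\varepsilon_i$ is the error term. The coefficient $b_1$ was tested to see how Montessori affected scores at grade 3 or 8.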
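
To illustrate how a model of this form is estimated, the Python sketch below fits the same equation by ordinary least squares on a small synthetic data set. It is not the authors' analysis: the study used SPSS or R, the repeated-measures structure is omitted, the helper names (`solve_linear_system`, `fit_ols`) are hypothetical, and every number is invented so that the known coefficients can be recovered.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_ols(predictors, y):
    """Return [b0, b1, ...] minimizing squared error, via the normal equations."""
    design = [[1.0] + list(row) for row in predictors]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic units: (M_i, Race_i, Income_i), with M_i = 1 for Montessori schools.
units = [
    (1, 40, 30), (1, 55, 45), (1, 20, 25), (1, 70, 60),
    (0, 45, 50), (0, 60, 40), (0, 30, 35), (0, 80, 65),
]
# Noise-free outcome built from chosen coefficients b0=60, b1=2, b2=-0.2, b3=-0.1.
scores = [60 + 2 * m - 0.2 * race - 0.1 * income for m, race, income in units]

coefficients = fit_ols(units, scores)
print([round(b, 3) for b in coefficients])
```

In this setup the second coefficient is the Montessori versus non-Montessori difference in percent proficient after adjusting for the two demographic covariates, which is the quantity the test on $b_1$ addresses.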

The normality and equality of variance assumptions for the ANCOVA procedure were checked before running the analysis (see Appendix). The data were symmetric and balanced, but there was more variance in the Montessori data as would be expected since (1) there were more Montessori scores and (2) districts are composed of many schools, which would itself lead to smaller variance within districts across years. However, ANCOVA is robust to the violation of normality and homogeneity of variance assumptions, especially when data are symmetric and balanced, so it is appropriate to use ANCOVA. Still, to check if this influenced results, our first set of models was checked against models that added grade level enrollment data as a covariate.

Results

First, we examine overall proficiency of Montessori schools and districts on standardized ELA and math tests at 3rd and then at 8th grade. Next, we examine this for subgroups of Black, Hispanic, and economically disadvantaged students. We then examine Black/White and economic achievement gaps. Finally, we examine the percent proficient in 8th grade in each school/district controlling for the percent proficient in 3rd grade in that same school/district.

Enrollment is greater in districts than in schools. To examine the effect of enrollment size, we ran all models with and without enrollment as a covariate. There was very little change in model fit; therefore, we reported the original results that did not account for enrollment.

Overall results

Controlling for the percentage of economically disadvantaged students and those historically affected by the racial opportunity gap (as defined here), there was a significant main effect indicating that a higher percentage of non-Montessori students were proficient on state standardized tests at 3rd grade, F(1, 816) = 7.94, p = .005, ηp² = .01. There was also an interaction between Montessori (yes/no) and test type (math/ELA), F(1, 816) = 143.36, p < .001, ηp² = .15. Follow-up tests indicated that a slightly greater percentage of Montessori students were proficient in 3rd grade ELA (est. marginal means, 47.05% vs. 44.95%, mean difference = 2.10, SE = 1.04, p = .044, d = 0.13), and a much larger percentage of district students were proficient in 3rd grade math (est. marginal means, 43.90% vs. 52.04%, mean difference = −8.14, SE = 1.19, p < .001, d = .47). Effect sizes for the overall and subgroup results are in Figure 1.

Figure 1. Summary of Findings by Effect Sizes

Note. The figure indicates that proficiency levels were, in most cases, at least somewhat higher at Montessori schools. The exceptions were for 3rd grade math, where only Black students obtained higher levels of proficiency at Montessori than at other district schools; other marginalized students performed the same, and overall Montessori students performed worse for 3rd grade math but showed a trend to better performance in 8th grade math. * p < .05; ** p < .001.

In 8th grade, a greater percentage of students in Montessori schools tested above proficiency thresholds, F(1, 349) = 10.58, p = .001, ηp² = .03. There was also an interaction between Montessori (yes/no) and test type (math/ELA), F(1, 349) = 5.36, p = .021, ηp² = .015. A significantly greater percentage of Montessori students were proficient in ELA (est. marginal means, 53.38% vs. 45.70%, mean difference = 7.68, SE = 1.58, p < .001, d = .44), but the greater percentage of Montessori students who were proficient in math (39.41% vs. 36.07%) was not significant (mean difference = 3.34, SE = 2.03, p = .10, d = .17).
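The follow-up comparisons above each report a mean difference and its standard error, and dividing one by the other gives the test statistic. The sketch below redoes that arithmetic for the overall comparisons using a standard normal approximation. That approximation is an assumption made here for simplicity (the original software would have used a t distribution with the model's degrees of freedom), so the p-values are only approximate checks, and the function name is hypothetical.

```python
import math


def normal_two_sided_p(mean_diff, se):
    """Return (z, two-sided p) for a mean difference and its standard error."""
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# (mean difference, SE) pairs as reported in the text above.
reported = {
    "3rd grade ELA": (2.10, 1.04),
    "3rd grade math": (-8.14, 1.19),
    "8th grade ELA": (7.68, 1.58),
    "8th grade math": (3.34, 2.03),
}
results = {label: normal_two_sided_p(diff, se)
           for label, (diff, se) in reported.items()}
for label, (z, p) in results.items():
    print(f"{label}: z = {z:.2f}, p = {p:.3f}")
```

Under this approximation the 3rd grade ELA difference falls just under the .05 threshold and the 8th grade math difference sits near .10, in line with the reported values.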

Subsamples

Black

Again controlling for the covariates of economic disadvantage and race, among Black students, a significant main effect in 3rd grade indicated a higher percentage of Black students were proficient on the tests at Montessori schools than in the other district schools as a whole, F(1, 147) = 8.65, p < .05, ηp² = 0.06. There was no significant interaction between Montessori and the test. A group difference was observed for Black students on both 3rd grade ELA (est. marginal means 36.16% vs. 27.17%, mean difference = 9.00, SE = 2.78, p = .001, d = .58) and 3rd grade math tests (est. marginal means 38.01% vs. 30.44%, mean difference = 7.57, SE = 3.31, p = .023, d = .41). At 8th grade, although effect sizes were larger, smaller sample sizes rendered the main effect and the interaction between pedagogy and test type nonsignificant. Because the effect sizes were large (Kraft, 2020), exploratory follow-up tests were conducted. A greater percentage of Black Montessori students were proficient on 8th grade ELA tests (est. marginal means 41.91% vs. 28.74%, mean difference = 13.18, SE = 5.43, p = .018, d = .68), and a trend (with a large effect size) was observed for math (est. marginal means 32.82% vs. 18.83%, mean difference = 14.00, SE = 7.37, p = .066, d = 0.72).

Hispanic

There was no significant main effect for Hispanic students in 3rd grade, but there was an interaction between Montessori (yes/no) and test type (math/ELA), F(1, 280) = 7.65, p = .006, ηp² = .03. A significantly greater percentage of Hispanic Montessori students were proficient on 3rd grade ELA tests (est. marginal means, 43.54% vs. 39.02%, mean difference = 4.52, SE = 2.00, p = .024, d = 0.28), but no significant difference was found for math. At 8th grade, there was a significant main effect, F(1, 78) = 7.04, p = .010, ηp² = .08, but no Montessori X Test interaction. A significantly greater percentage of Hispanic 8th grade Montessori students were proficient on ELA tests (est. marginal means 52.12% vs. 38.88%, mean difference = 13.23, SE = 3.68, p = .001, d = .78), but there was no difference for 8th grade math.

Economic disadvantage

For economically disadvantaged students, at 3rd grade there was no significant main effect, but there was an interaction between Montessori (yes/no) and test type (math/ELA), F(1, 381) = 15.52, p < .001, ηp² = 0.04. Significantly more economically disadvantaged Montessori students were proficient on 3rd grade ELA tests (est. marginal means 38.82% vs. 34.19%, mean difference = 4.63, SE = 1.54, p = .003, d = .31), but no difference was observed for math. In 8th grade, economically disadvantaged Montessori students were more proficient overall, F(1, 102) = 6.82, p = .010, ηp² = .06, but there was no Montessori X Test interaction. The Montessori advantage was observed on 8th grade ELA tests (est. marginal means 41.89% vs. 32.68%, mean difference = 9.21, SE = 3.02, p = .003, d = .54), but was not significant for 8th grade math (est. marginal means 33.09% vs. 27.18%, mean difference = 5.91, SE = 3.62, p = .105, d = .33).

Achievement gaps

Achievement gaps were examined between Black and White students and economically disadvantaged and non-economically disadvantaged students. To calculate the achievement gap, we subtracted the percentage of students proficient in one subgroup from the percentage proficient of students in the other group in the same school or district.
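The gap calculation just described is a within-unit subtraction, sketched below for one school and its district. The percentages are hypothetical, and the direction of subtraction (higher-scoring subgroup minus lower-scoring subgroup, so that gaps are positive) is an assumption, since the text does not specify it.

```python
def proficiency_gap(pct_higher_group, pct_lower_group):
    """Gap in percentage points between two subgroups in the same school or district."""
    return pct_higher_group - pct_lower_group


# Hypothetical percent-proficient values for one school and its district.
school_gap = proficiency_gap(pct_higher_group=62.0, pct_lower_group=41.0)
district_gap = proficiency_gap(pct_higher_group=58.0, pct_lower_group=27.0)

print(school_gap)                 # 21.0
print(district_gap)               # 31.0
print(district_gap - school_gap)  # 10.0
```

The last line is the quantity the analyses below compare: how much smaller the gap is in the school than in the rest of its district.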

For 3rd grade ELA, the Black-White achievement gap was significantly smaller in Montessori than in other district schools (est. marginal means, 25.65% vs. 37.04%, mean difference = 11.39, SE = 3.15, p < .001, d = .77). The difference was not significant for 3rd grade math. Achievement gap differences are shown in Figure 2. There were insufficient data to conduct this analysis at 8th grade.

Figure 2. Achievement Gaps in Percent Proficient by Grade, Test, and Subgroup

* p < .05; ** p < .001.

Next, we examined achievement gaps for economically disadvantaged subgroups. For 3rd grade ELA, the achievement gap in Montessori was significantly smaller than in district schools (est. marginal means, 17.16% vs. 28.97%, mean difference = 11.81, SE = 1.67, p < .001, d = .95). A significant effect was also found for 3rd grade math (est. marginal means, 18.22% for Montessori vs. 27.69% for district, mean difference = 9.47, SE = 1.85, p < .001, d = .73). For 8th grade ELA, the Montessori schools again had a significantly smaller gap (est. marginal means 20.30% vs. 28.26%, mean difference = 7.95, SE = 3.84, p = .042, d = .60). Montessori schools also had a significantly smaller gap in 8th grade math (est. marginal means 15.57% vs. 27.39%, mean difference = 11.82, SE = 3.41, p = .001, d = 1.07). Thus, on both tests at both levels, the gap between economically disadvantaged students and their more advantaged peers was smaller in Montessori schools.

Proficiency across grade levels

Differences in proficiency across the 3rd and 8th grades were examined last. This assumes that on average the children in a school or district’s 3rd grade classes are similar in most ways (other than age) to children in its 8th grade classes. For these analyses, we used a Bayesian approach to model estimation. Multiple imputations were used for missing data analysis. Because there were no significant differences in proficiency levels across the three school years, test performance was averaged across the three years to simplify this set of models and reduce missing values (important for this analysis since half of the schools had only one of the examined grade levels). We examined 8th grade performance while controlling for 3rd grade performance in the same school or district using ANCOVAs. The equation was:

$$y_i = b_0 + b_1 \times M_i + b_2 \times x_i + \varepsilon_i,$$

where $y_i$ represents the observed reading/math score for the $i$th school at grade 8, $M_i$ represents whether the $i$th school is a Montessori school (1: yes; 0: no), $x_i$ is the ELA/math score for the $i$th school at grade 3, and $\varepsilon_i$ is the error term. The coefficients $b_1$ and $b_2$ were tested to see how Montessori and scores in grade 3 affected scores at grade 8. As compared to their districts, Montessori schools demonstrated a significantly greater percentage of students meeting the proficiency standard for ELA at 8th grade after controlling for the school or district’s 3rd grade proficiency levels, ∆M = 8.75, SD = 1.73, 95% credible interval = 5.30–12.11. This was true for math as well, ∆M = 10.59, SD = 2.67, 95% credible interval = 5.43–15.88. These results are shown in Figure 3.
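In the Bayesian models, an effect is read as reliable when its 95% credible interval excludes zero. As a rough plausibility check on the overall results just reported, the sketch below rebuilds approximate intervals from the posterior means and SDs. It assumes the posterior is approximately normal, which need not hold exactly, so the endpoints are not expected to match the reported intervals to the second decimal. The function name is hypothetical.

```python
def normal_interval(mean, sd, z=1.96):
    """Approximate central 95% interval for a roughly normal posterior."""
    return mean - z * sd, mean + z * sd


# Posterior mean and SD of the Montessori-minus-district difference, as reported.
ela_low, ela_high = normal_interval(8.75, 1.73)
math_low, math_high = normal_interval(10.59, 2.67)

print(f"ELA:  {ela_low:.2f} to {ela_high:.2f}")
print(f"Math: {math_low:.2f} to {math_high:.2f}")
```

The approximation gives about 5.36 to 12.14 for ELA and 5.36 to 15.82 for math, close to the reported intervals, and both lie well above zero.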

Figure 3. Difference in Percentage Proficient at 8th Grade Controlling 3rd Grade

Note. District not labelled because all results show more change in percentages proficient at 8th grade controlling for 3rd grade in Montessori schools than in districts. For example, overall, Montessori schools’ percentages proficient changed, on average, 10.59% more (in a positive direction) than did districts’ percentages proficient in math.

The same pattern was observed for Black students. A greater percentage of Black students in Montessori in 8th grade were proficient in both ELA and math after controlling for 3rd grade proficiencies: ∆M_ELA = 13.83, SD = 5.09, 95% credible interval = 3.83–23.82; ∆M_Math = 16.55, SD = 6.98, 95% credible interval = 2.36–30.29. The pattern was observed for Hispanic students in ELA, ∆M = 10.93, SD = 3.41, 95% credible interval = 4.22–17.76, but was not significant for math. Finally, the proportion of economically disadvantaged students in Montessori schools who were proficient was also significantly greater at 8th grade after accounting for 3rd grade on ELA tests, ∆M = 10.97, SD = 3.09, 95% credible interval = 4.92–17.04, but not on math.

Again, all models were also run adding enrollment as a covariate; there was no significant improvement in model fit (in fact, because of missingness, model fit was typically reduced).

Discussion

Because past research on state test scores for children in Montessori is limited and has mixed results, this study examined publicly posted proficiency levels at a large number of schools in the 10 states/areas of the US with the most public whole-school Montessori programs. The study thus provides a broader picture of academic achievement on standardized tests in public Montessori schools than has been provided thus far. The figures show that children in Montessori schools typically exceed their district peers on standardized tests as viewed from several angles. In ELA tests, children in Montessori schools were more likely to be proficient than their district peers in every comparison, at both grades, overall and across all subgroups. In math, the Montessori advantage was significant for Black children in 3rd grade, but overall, in 3rd grade, Montessori students were less likely to be proficient in math. In 8th grade the sample sizes were smaller, but judging by effect sizes, math proficiency was higher in Montessori schools by at least one-third SD for all subgroups. Focusing on the Black-White achievement gap, in 3rd grade Montessori had a smaller gap than comparison districts in ELA; there was no difference in the size of the gap for math, and for 8th grade there were insufficient data for analysis. Montessori also had a smaller achievement gap for economic advantage/disadvantage at both grades, on both subjects.

These findings are consistent with past research indicating that lower income, Black, and Hispanic students perform particularly well in the Montessori model (Ansari & Winsler, 2014, 2020; Brown & Lewis, 2017; Lillard et al., 2017). Comparing Montessori schools to their districts in terms of the difference in proficiency levels across 3rd and 8th grades, Montessori schools overall had greater differences in percentage proficient on both math and ELA tests in 8th grade controlling for 3rd grade. We caution here that different children’s scores are calculated at each grade, but school grade is the participant here, and there is no reason to expect that the characteristics of the population of children that constitute a school grade change markedly across these grades (with the obvious exception of age). The overall pattern of seeing a greater difference in the proportion of children proficient at 8th grade on both tests is also true for Black children. For Hispanic and for economically disadvantaged children, Montessori schools showed significantly greater differences than districts in proficiency levels in 8th grade controlling for 3rd grade on ELA but not math tests, although the percentage change in math proficiency was in both cases over 7% greater at Montessori schools.

This study examines correlations between standardized test performance and school type. It is possible that the Montessori program confers these advantages; that would be consistent with lottery control studies’ kindergarten-level results and most studies of scores in individual states and districts, including one that matched students and controlled for individual students’ prior scores to examine growth (Culclasure et al., 2018). However, it is also possible that the results reflect self-selection: that families whose children would do better at any school happen to select Montessori schools more often. One point against this is the 3rd grade math results, which strongly favored other district schools. One might surmise that families who read more to their children enroll them in Montessori, and families that engage in more math at home enroll their children at other schools. If that were true, the results showing Montessori schools fare somewhat better on 8th grade math might suggest program effects. Yet the premise that selection effects produced the reading scores is not well-supported, in that all three existing lottery studies of public Montessori schools (which control for selection effects) have shown better results on reading for children who attend Montessori, indicating that the Montessori program does influence reading skills. Furthermore, two of those lottery studies included baseline data indicating no differences at baseline. Yet those studies concerned preschool-aged children. Selection effects cannot be ruled out with the present design.

The alternate possibility is that the Montessori program caused the differences in scores. Here we consider why Montessori might raise ELA test performance, what might explain the math pattern of lower performance in 3rd grade but equal at 8th, and finally, what might explain the general advantage for Black, Hispanic, and economically disadvantaged children.

ELA test results

The finding that Montessori schools had higher ELA scores in every comparison in this study aligns with prior research showing that Montessori students do well in ELA. Although results in some other studies could also reflect selection (Brown & Lewis, 2017; Culclasure et al., 2018; Mallett & Schroeder, 2015), this is not the case for studies that used random lotteries and thus controlled for selection (Courtier et al., 2021; Lillard & Else-Quest, 2006; Lillard et al., 2017); in one of these, low-income Montessori students improved more in reading across preschool than controls even though their Montessori teachers had no formal Montessori training (Courtier et al., 2021). The method with which reading is taught in Montessori is well-supported by other research, which could explain these findings. For one, children learn to write before they learn to read, making their learning of the letter symbols and sounds “embodied,” and embodied learning is typically superior to learning that does not involve moving the body (Lillard, 2017, Chapter 2; Shapiro & Stolz, 2019). For example, children who trace letters while learning their sounds learn them better than children who only look at the letters (Bara, Gentaz, & Colé, 2007). Montessori children trace sandpaper letters while learning the associated sounds. In addition, Montessori teaches the sounds, not the names of the letters, and this “phonics” approach also leads to better reading (Rayner, Foorman, Perfetti, Pesetsky, & Seidenberg, 2001). Reading ability at later ages is strongly predicted by earlier reading ability (Cunningham & Stanovich, 1997). This early alignment of the Montessori program with practices that result in better reading could explain the greater proficiency found both for every comparison in this study and in lottery control studies as well.

Math test results

Although Black children were more proficient in math in both grades (as a trend but with a larger effect size at 8th grade), and the achievement gap in math for economically disadvantaged children was smaller in both grades, Montessori fared worse overall in 3rd grade and similarly in 8th grade overall on math tests. This aligns with the lottery control studies, which showed variable math performance. This might be surprising in that Montessori has an array of manipulatives – hands-on learning materials like the Trinomial Cube (a set of wooden blocks that illustrate the logic of the trinomial equation) – that empirical research supports (Laski, Jor’dan, Daoust, & Murray, 2015). It is possible that the manipulatives themselves create problems for working out math using only paper and pencil; Montessori children might depend on manipulatives to carry out math problems, and therefore do poorly on standardized tests where manipulatives are not available. This might be particularly the case at younger ages as children in Montessori classrooms are said to use the materials less after around age 9 as they become more adept at abstraction. It has also been suggested that because the manipulatives engender true understanding, Montessori students excel on deeper conceptual math problems but not on more superficial problems that appear on state standardized tests (Basargekar & Lillard, 2021; Mix, Smith, Stockton, Cheng, & Barterian, 2017). Whether younger Montessori children would perform as well as or better than traditionally schooled children on conceptual questions, or if the materials were available during standardized tests, are interesting research questions. Excepting overall in 3rd grade, Montessori schools never performed worse in math, and at 8th grade they performed slightly better (p = .10, d = .17).

Black children and economically disadvantaged children

These two subgroups often perform less well on standardized tests but showed a pattern of higher levels of proficiency at Montessori schools. If Montessori is causing that difference, a possible reason is that individualized teaching in Montessori classrooms might leave children less vulnerable to unconscious teacher bias as discussed in the Introduction. In conventional schools, children are frequently tracked into higher and lower performing groups and then remain in those tracks. In Montessori, there are no tracks; for the most part, all children are expected to master most of the materials in a classroom in the 3 years they are there. Alternatively, Montessori has been described as a culturally responsive pedagogy that might be especially helpful for students historically affected by the racial opportunity gap (Brunold-Conesa, 2019; Debs & Brown, 2017; Lillard, Taggart, Yonas, & Seale, in press). Regardless, the results address an issue that some parents, especially parents of color, raise, namely a belief that Montessori lacks strong academics (Debs, 2019). Children with opportunity gaps, whose scores tend to be lower in most schools, do as well or better in Montessori schools than in their districts at large, in both math and ELA.

Limitations

We mention above the limitations of self-selection as a possible explanation of our results, although the fact that the ELA findings are mirrored in lottery control studies and the fact that children overall did somewhat worse on math in 3rd grade but similarly by 8th grade suggest program effects. Another possibility is that some other feature of Montessori schools besides the curriculum – possibly smaller school or class sizes, or teachers who get extra training – is responsible for the effects. However, school size research at the high school level demonstrates that schools with 600–900 students have the highest academic achievement as compared to smaller and larger schools (Lee & Smith, 1997). Smaller class sizes are associated with favorable effects for teachers and students alike, such as increased participation and more favorable attitudes toward students (Smith & Glass, 1980), yet they are inconsistently associated with higher test scores (Ehrenberg, Brewer, Gamoran, & Willms, 2001). Additionally, increased teacher training is not associated with greater standardized test achievement. Even when this training comes in the form of an advanced degree, improved student performance is only seen in middle school math, not in ELA and not in elementary or high school (Harris & Sass, 2011). Another limitation of the study is that it included data from only 10 states/regions. It is possible that states with more Montessori implement the program differently than states with less Montessori, but it is difficult to say how it would differ. Non-Montessori schools also differ by state and district in their outcomes. It may be the case that poorly performing districts are more likely to adopt the Montessori model in a few schools, and the model might not fare as comparatively well in other districts. Similarly, we cannot extend our findings to private school contexts as private schools are not federally required to test their students, and funding and governance differ in private schools as compared to public schools.

Ideally, a longitudinal lottery control study could follow a cohort of children in Montessori preschool through the end of high school. Examining test performance and growth over time for children whose families intended them to go to Montessori (but the “lottery losers” did not get in) would provide a clearer picture of academic achievement of Montessori students as compared to their peers. If the ideal study is infeasible, getting individual scores of students matched by race and income, and controlling for each year’s prior score while examining growth (as in Culclasure et al., 2018), can also provide more insight as to how Montessori affects students of many backgrounds.

In sum, a higher percentage of students in all public Montessori schools across 10 states/regions was proficient on state tests compared to their districts, and achievement gaps were smaller in Montessori schools as compared to their districts. This adds to the literature by showing that across many states, children in Montessori performed reliably better on ELA tests. Furthermore, Black children did reliably better on math tests as well, and achievement gaps were generally smaller; for economically disadvantaged children these gaps were smaller on both tests at both ages. These results are novel and important. Montessori is the most common progressive pedagogy today (Debs, 2019), so a better understanding of its outcomes is crucial. However, our design does not allow for causal conclusions. The next step is to obtain individual student scores from multiple districts, to allow for fine-grained analyses of change over time, and if significant differences are found, then to seek to conduct longitudinal lottery control studies, and explore more deeply possible reasons for the effects.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

During this project authors were supported by various sources: XT by NSF SES-1951038; AL by the Wildflower Foundation; Wend II, Inc.; Institute for Education Sciences [R305A180181] to the American Institutes for Research. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.

Notes

1. Our search revealed no public Montessori schools in Alabama, Hawaii, Maine, North Dakota, South Dakota, Rhode Island, Vermont, or West Virginia.

2. HighScope is the play-based preschool program first developed for the famous Perry Preschool project (Schweinhart et al., 2005). See http://highscope.org.

References

Appendix

ANCOVA model diagnostics

The model diagnostics below were based on observed data when Bayesian methods were not used. When Bayesian methods were used, missing data were imputed using multiple imputation techniques. Since we assume that the missing data are missing at random, the assumptions should still hold with the Bayesian modeling approach.

Figure A1. Third Grade Reading: Residuals X Order of Observation (A), Residuals X Proficiency (B), and Residuals X District (0.0) or Montessori (1.0) (C).

Figure A2. Third Grade Math: Residuals X Order of Observation (A), Residuals X Proficiency (B), and Residuals X District (0.0) or Montessori (1.0) (C).

Independence

Since the district scores do not contain Montessori school scores (as they were removed) and different districts and schools should be independent from each other, it is reasonable to believe the independence assumption is satisfied. In addition, we plotted residuals against the order of observation, and any variables used in the model. (A pattern that is not random suggests a lack of independence.) The plots below do not show any pattern, indicating that the independence assumption is satisfied.

Figure A3. Reading: Plot of Normality of the Model Residuals.

Figure A4. Math: Plot of Normality of the Model Residuals.

Figure A5. Homoscedasticity of Reading Scores: Plot of Square Root of Standardized Residuals vs. Fitted Values.

Normality

Below we show the plots and test the normality of the model residuals.

Figure A6. Homoscedasticity of Math Scores: Plot of Square Root of Standardized Residuals vs. Fitted Values.

Figure A7. Linearity and Reading Scores: Residuals by Fitted Values.

Shapiro–Wilk normality test: p = .001 ⇒ nonnormal

D’Agostino skewness test: skewness = −0.269, p = .162 ⇒ not skewed, symmetric

Anscombe–Glynn kurtosis test: kurtosis = 4.516, p = .005 ⇒ heavy-tailed

As pointed out in the literature (e.g., Maxwell, Delaney & Kelly, 2018), ANCOVA is robust against the violation of the normality assumption especially for symmetric data. Since the residuals are symmetric, we can still use ANCOVA.

Figure A8. Linearity and Math Scores: Residuals by Fitted Values.

Shapiro–Wilk normality test: p = .297 ⇒ normal

D’Agostino skewness test: skewness = 0.259, p = .181 ⇒ not skewed

Anscombe–Glynn kurtosis test: kurtosis = 2.817, p = .797 ⇒ not heavy-tailed

The normality assumption is satisfied.
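The skewness and kurtosis values reported above are summary statistics of the model residuals. The sketch below shows one common moment-based way to compute them in plain Python. It assumes the reported kurtosis is on the scale where a normal distribution scores 3, which is consistent with 2.817 being called not heavy-tailed and 4.516 heavy-tailed. The residuals are made up, and the sketch does not reproduce the significance tests themselves.

```python
def skewness_and_kurtosis(values):
    """Moment-based sample skewness and (non-excess) kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Made-up, perfectly symmetric residuals.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt = skewness_and_kurtosis(residuals)
print(skew, kurt)
```

For these symmetric values the skewness is zero and the kurtosis is 1.7, that is, lighter-tailed than a normal distribution. For real residuals, `scipy.stats` offers `shapiro`, `skewtest`, and `kurtosistest`, which correspond to the three tests named above.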

Homogeneity of variance

The error variance of the model should be constant across the values of the independent variables. We check the homoscedasticity by plotting the square root of standardized residuals versus fitted values. There should be no clear pattern in the distribution. From the plots below, we conclude that the homogeneity of variance assumption is satisfied.

Linearity

A plot of residuals against fitted values should show no systematic pattern. As we can see in the two plots below, the linearity assumption is satisfied for models fitted to reading and math scores, respectively.

Note that although the above illustration is for the overall scores, model diagnostics for specific subgroups (e.g., White, Black, etc.) are similar.

Also note that ANCOVA is robust against the violation of normality and homogeneity of variance assumptions, especially when data are symmetric and balanced. Although our data were not balanced because of missing data, when Bayesian methods were applied, multiple imputations were automatically implemented, so the data became balanced.