Intervention, Evaluation, and Policy Studies

Integrating Literacy and Science Instruction in Kindergarten: Results From the Efficacy Study of Zoology One

Pages 1-27 | Received 18 May 2020, Accepted 18 May 2021, Published online: 21 Jul 2021

Abstract

This study examines the efficacy, cost, and implementation of an integrated science and literacy curriculum for kindergarten. The study was conducted in a large urban district and included 1,589 students in 71 classrooms in 21 schools. The research includes a multi-site cluster-randomized controlled trial and mixed-methods cost and implementation studies. Analysis revealed significant impacts on comprehension, letter-naming fluency, and motivation to read. No main impacts were observed on decoding, word identification, or writing; however, exploratory analysis revealed that students whose teachers implemented the treatment with fidelity performed statistically significantly better in writing and decoding. The cost to produce the observed effects was estimated at $480 per student, two-thirds of which was borne by the school. Despite this cost, treatment classrooms achieved savings by using an average of three fewer instructional programs than control classrooms. Teachers reported positive effects from the integrated curriculum on student engagement, learning, and behavior.

Introduction

Decades of research demonstrate that students who struggle with reading in the earliest grades often suffer long-term academic consequences (Cunningham & Stanovich, 1997; Ozernov-Palchik et al., 2017; Stanley et al., 2018). A primary goal of early education must therefore be to effectively and efficiently help young children become readers. This study focuses on an innovative approach to improving literacy learning in kindergarten: the integration of literacy and science instruction. More specifically, the study investigates the hypothesis that the effects of evidence-based literacy instruction on young students’ learning are magnified by the infusion of science content.

The study’s setting is a large, economically challenged urban district. Following decades of troublingly low literacy levels among its students, the School District of Philadelphia (SDP) has made early literacy a priority. This study offers a comparison of business-as-usual literacy instruction in SDP schools and a curriculum that includes science integration. The research described here—which includes a multi-site, longitudinal randomized controlled trial (RCT) designed to produce impact, cost and implementation findings—supports causal inference for intervention impacts, in a context with urgent need.

Our work is informed by a growing body of consonant research on early science exposure, early literacy instruction, integrated curricula, and the role of motivation in learning to read. The need for early science instruction is increasingly recognized as shortages of entrants to science, technology, engineering, and math (STEM) careers grow pronounced (Stine, 2009). The U.S. lags behind many other nations in numbers of citizens earning degrees in STEM fields (Okahana et al., 2016), and this gap begins early: Large-scale assessments reveal that U.S. students consistently score lower in science than students in other advanced countries beginning in elementary school (National Center for Educational Statistics, 2019). Additionally, despite indications that young children have natural scientific proclivity (Clements & Sarama, 2016), motivation for science learning tends to diminish as students get older (Anderman & Young, 1994; Vedder-Weiss & Fortus, 2011). Shortages in STEM fields are particularly stark among women and minorities, and gender and racial gaps emerge in science achievement as early as fourth grade (National Science Foundation, 2019). Other research indicates that science knowledge increases with the volume of instructional time devoted to the subject (Curran & Kitchin, 2019), and that resource disparities disproportionately affect minority students, further contributing to science achievement gaps (Curran & Kellogg, 2016). Indeed, many female and minority students have already ruled out STEM careers by the end of elementary school (Wendt et al., 2018).

These trends suggest that science exposure is critical for all children, and particularly those unlikely to pursue careers in STEM. Furthermore, exposure must begin early; knowledge gaps evident in kindergarten contribute to science achievement disparities in subsequent grades (Morgan et al., 2016). Early science exposure establishes foundational scientific concepts students can build on later and develops children’s motivation for science learning (Henrichs & Leseman, 2014; Sahin et al., 2014). However, the opportunity to provide this early exposure is often missed: One study found that kindergarteners receive an average of only 2.3 min of science instruction per day (Wright & Neuman, 2014), while another found that average instructional time in science for kindergarten through third-grade classrooms was 19 min per day (Banilower et al., 2013).

Integrated Literacy and Science Curricula

The benefits of literacy and science integration are substantiated by two decades of research focused on older students (Cervetti et al., 2012; Goldschmidt & Jung, 2011; Guthrie et al., 1999; Guthrie & Humenick, 2004; Pearson et al., 2010; Romance & Vitale, 2001; Slavin et al., 2014; Shanahan et al., 2010; Wigfield et al., 2008). Curricular integration is efficacious, in part, because the parallel cognitive skills required by literacy and science allow learning in one domain to support the other. Cervetti et al. (2006) note that “science and literacy are more than supportive and synergistic, they are in fact isomorphic” (p. 9), and that the strategies required for reading comprehension parallel the inquiry strategies science demands (Baker, 1991; Cervetti et al., 2006; Cervetti et al., 2012; Padilla et al., 1991).

In a quasi-experiment, Romance and Vitale (2001) examined the impacts of the literacy and science curriculum Science IDEAS on reading and science learning of 540 third- through fifth-grade students, and observed significant positive effects on achievement in reading and science (Romance & Vitale, 1992, 2001). Another quasi-experimental study examined the impact of Concept-Oriented Reading Instruction (CORI), an integrated literacy and science program, on fifth-grade students’ achievement (Guthrie & Wigfield, 2009). This study revealed statistically significant positive effects on reading comprehension (ES = .59) and science content knowledge (ES = 1.59), and positive though not statistically significant impacts in several other domains. A 2011 RCT study examined the impacts of Science IDEAS on first and second graders’ achievement in science and reading (Vitale & Romance, 2011), again finding significant positive effects in both domains. Additional advantages may include the development of background knowledge and vocabulary, the activation of prior knowledge, and the cultivation of curiosity and motivation to read (Duke et al., 2011), all of which predict long-term achievement (French, 2004; Grissmer et al., 2010; Strickland & Riley-Ayers, 2006).

Despite the promise of integrated literacy and science curricula, no prior study has rigorously examined the impacts of this approach in kindergarten. In a small quasi-experimental study, Wright and Gotwals (2017) examined children’s (n = 147) oral language outcomes after receiving four weeks of an integrated science and disciplinary language and literacy curriculum called SOLID Start. Children who received the curriculum outperformed children in the control group in their use of vocabulary in a science context, knowledge of receptive science vocabulary, and their ability to make claims and give evidence-based supports. Another recent study (n = 120) investigated the impacts of LINKS, an integrated science and literacy curriculum implemented in kindergarten for ten weeks (Kurz, 2018). While this study found a significant impact on the treatment group’s understanding of science and depth of science knowledge, it is limited by its short implementation period and small sample size and does not examine impacts on literacy.

Effective Literacy Instruction

To fully assess the promise of integrated curricula, it is critical to select an intervention that incorporates evidence-based best practices in literacy instruction. Research has produced clear insights about which instructional practices benefit early readers most, and these insights are reflected in the program model for the Zoology One curriculum. Program elements include code-focused instruction emphasizing alphabet knowledge, phonics, and phonological awareness. Code-focused instruction is based in theory that asserts that understanding of the alphabetic principle (the recognition that sounds are represented by letters, which in turn comprise words) is a critical early step in the development of reading fluency and comprehension (Ehri, 1991, 2005; Juel, 1991; Stanovich, 1986). Significant evidence supports the effectiveness of code-focused reading instruction for beginning readers. The Report of the National Early Literacy Panel (NELP, 2009) presents a meta-analysis of 83 experimental or quasi-experimental studies with treatment-control equivalence at baseline. This analysis found that “code-focused interventions usually had moderate to large effects both on measures of conventional literacy (i.e., reading, spelling) and measures of precursor literacy skills (e.g., phonological awareness, alphabet knowledge)” (Lonigan et al., 2008, p. 109).

Evidence further suggests that high-volume print exposure yields important benefits for beginning readers (Jorm & Share, 1983; Share, 1995). This includes both complex-text exposure via teacher read-alouds or shared reading, and teacher-supported independent reading practice in leveled, high-interest texts (Duke, 2000; Miller & Moss, 2013; Reutzel et al., 2008; Topping et al., 2007). Correlational research spanning decades associates reading practice with long-term reading proficiency (Allington, 1977; Cipielewski & Stanovich, 1992; Donahue et al., 2001; Garan & DeVoogd, 2008; Samuels & Wu, 2003). In their meta-analysis of studies on reading volume, Mol and Bus (2011) found that 12% of language proficiency in preschool and kindergarten was explained by print exposure alone, and that this effect was accretive: The explanatory power of print exposure on reading achievement increases as students age, suggesting that the benefits of early, high-volume reading are exponential.

While research on early writing instruction is less abundant, meta-analyses highlight impactful and promising practices. These practices include explicit instruction in the process and mechanics of writing and routines and structures that help make writing a pleasant and familiar experience (Graham et al., 2015). Research also supports the close integration of reading and writing instruction, so that skills taught in one domain are reinforced through exposure and practice in the other (Graham & Hebert, 2011). Additionally, frequency of writing is understood to be important for developing writers (Graham et al., 2015).

Motivation to Read

The current study explores the impact of integrated science and literacy instruction on student motivation to read. Increased motivation is identified as an important potential factor because interest-driven motivation to read is “the link between frequent reading and reading achievement” (Guthrie & Wigfield, 2000, p. 405). Motivation is acknowledged as a key factor in the development of the reading habits that support long-term achievement in literacy (McGeown et al., 2015; Wigfield et al., 2016). Furthermore, motivation is a significant predictor of reading comprehension both in young and older students (Jean et al., 2018; Taboada et al., 2009; Wang & Guthrie, 2004). A rigorous quasi-experimental study yielded impacts as high as .71 SD on comprehension from gains in motivation (Guthrie et al., 2006).

Research Questions

Here, we comprehensively examine the impacts, cost, and implementation of an integrated science and literacy curriculum for kindergarten. We also explore heterogeneity of impacts based on student characteristics and teacher implementation fidelity. The research questions we address are:

Impact

  1. Do students in kindergarten classrooms using an integrated science and literacy curriculum outperform students in business-as-usual control classrooms in:

    1. decoding and comprehension, as measured by the Woodcock Reading Mastery Test, 3rd Edition (WRMT)?

    2. reading and letter naming fluency, as measured by the Developmental Reading Assessment (DRA) and AIMSWeb curriculum-based assessment, respectively?

    3. writing, as measured by the Kaufman Test of Educational Achievement (KTEA)?

    4. science, as measured by a researcher-developed science assessment?

    5. motivation to read, as measured by Kindergarten Reading Motivation Scale (KRMS)?

Heterogeneity

  • 2a. Do treatment effects vary among subgroups of students based on gender, language and home language status, IEP status, or eligibility for lunch assistance?

  • 2b. Do outcomes for students in treatment classrooms vary based on teachers’ fidelity of implementation?

Cost

  • 3. What is the cost of Zoology One relative to business as usual literacy instruction?

Implementation

  • 4. How was Zoology One implemented by teachers in the treatment group and what factors contributed to variations in fidelity?

Intervention and Context

The curriculum that is the focus of this evaluation is Zoology One: Kindergarten Research Labs, developed by American Reading Company (2019); the program was later renamed ARC Core Kindergarten. Zoology One was selected for study because it is a widely used example of an integrated curriculum for young children, because its literacy and science content are both standards-aligned, and because it employs the evidence-based literacy practices discussed earlier.

Zoology One is a full-year curriculum centered around a daily 120-min integrated literacy and science instructional block. The program includes four 9-week units, implemented in succession. The first unit is introductory, designed to orient students to the basics of books and literacy and to build key classroom procedures. Following this introduction, the curriculum proceeds through a Zoology unit, an Ecology unit, and an Entomology unit. Teachers receive a new set of topically aligned instructional materials and texts to use with each 9-week unit, but the structures and practices that guide instructional delivery are consistent throughout the year.

Zoology One uses a balanced literacy framework that incorporates each of the following in every daily instructional block: direct instruction in reading, writing, and science; complex text exposure delivered via multiple daily, themed teacher read-alouds; high-volume print exposure via supported independent reading in themed, leveled texts; formative assessment and progress monitoring implemented by the teacher during individual conferences or small-group instruction; high-volume writing practice related to the science theme; and science inquiry, including hands-on science activities and drama, music and art activities oriented around the science themes. Zoology One also includes a focus on parental involvement; students are expected to build the stamina to read for 30 min in class, and 30 min at home each day.

Teachers implementing Zoology One as part of this study received a full day of startup training at the beginning of the school year plus 10 visits throughout the year—approximately one visit per month—from coaches employed by the curriculum developer. While schools implementing this program can purchase varying amounts of coaching, 10 coaching visits are recommended by the developer. Two coaches provided all of the coaching to treatment teachers in this study. During their visits to implementing classrooms, the coaches provided a range of supports, including modeling components of Zoology One instruction such as whole-group instruction or skill-based small-group intervention; confirmation of student reading levels as determined by teachers; side-by-side coaching during one-to-one conferencing; and use of the program’s formative assessment tools. Along with these supports, which were provided to all teachers, some teachers participated in an optional half-day introduction to the first science-themed unit, and some received additional support via phone or email from the coaches in between classroom visits.

The setting for the study is the School District of Philadelphia (SDP). Philadelphia is a city of approximately 1.5 million people (U.S. Census Bureau, 2019) and the poorest of the nation’s large cities; approximately one-quarter of residents live in poverty (Hunger Free America, 2018). Despite a large charter- and private-school sector that serves nearly 40% of school-age children, approximately 134,000 students attend regular District-managed public schools. The demographics of the student body of SDP deviate from those of the city overall: For example, although over one-third of Philadelphia residents identify as non-Hispanic White, only 15% of SDP students do so; and although 44% of Philadelphia residents identify as African-American, more than 80% of SDP students are African-American. The median income of parents of children in public school is $3,000 below the average for residents, and only 17% of parents of public school students have a bachelor’s degree or higher (NCES, 2019), as compared with 27% of residents (U.S. Census Bureau, 2019). Philadelphia public schools serve large populations of English Language learners (10% of students) and students with Individualized Education Plans (18% of students) (NCES, 2019).

At the time of this study, the School District of Philadelphia had invested significantly in early literacy. Kindergarten teachers across the district were trained and supported in using a balanced literacy approach whose basic components parallel those of Zoology One.

Method

Evaluation Design

This study encompasses a multi-site cluster-randomized controlled trial (RCT) with embedded cost and implementation research. The impacts of Zoology One relative to business-as-usual instruction were estimated via the RCT, in which we randomly assigned entire kindergarten classrooms (including the teacher and all students) to conditions, within schools. Participating schools ranged in size, with the smallest having two kindergarten classrooms and the largest having nine. Classrooms assigned to treatment were expected to implement Zoology One in place of regular literacy instruction for 120 min per day, for the full school year. Classrooms assigned to control were expected to implement SDP’s business-as-usual literacy program for 120 min per day. Teachers in the treatment condition were asked not to provide any science instruction over and above that provided via the Zoology One curriculum. Teachers in the control condition were asked to provide the same science instruction they normally would.
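The within-school assignment described above can be illustrated with a minimal sketch. The article does not report the exact randomization procedure, so the function name, the seed, the roster, and the coin-flip rule for schools with an odd number of classrooms are all assumptions made for illustration.

```python
import random


def assign_within_school(classrooms_by_school, seed=0):
    """Randomly assign whole classrooms to conditions, separately within each school."""
    rng = random.Random(seed)
    assignment = {}
    for school, classrooms in sorted(classrooms_by_school.items()):
        shuffled = list(classrooms)
        rng.shuffle(shuffled)
        # Split as evenly as possible. Assumption: when the count is odd,
        # a coin flip decides which condition gets the extra classroom.
        n_treat = len(shuffled) // 2
        if len(shuffled) % 2 == 1 and rng.random() < 0.5:
            n_treat += 1
        for idx, room in enumerate(shuffled):
            condition = "treatment" if idx < n_treat else "control"
            assignment[(school, room)] = condition
    return assignment


# Hypothetical roster: one school with two kindergarten classrooms and one
# with nine, matching the smallest and largest school sizes in the study.
roster = {
    "school_A": ["K1", "K2"],
    "school_B": ["K1", "K2", "K3", "K4", "K5", "K6", "K7", "K8", "K9"],
}
assignment = assign_within_school(roster, seed=2016)
```

Because assignment happens inside each school, every school contributes both treatment and control classrooms, which is what allows school to serve as the blocking factor in the impact model.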

Within the RCT framework, we assessed the costs of the treatment condition relative to the business-as-usual control condition using the ingredients method (Levin et al., 2018). A mixed-methods approach was used to understand treatment teachers’ implementation of the Zoology One program and the treatment/control contrast. Cost and implementation activities were coordinated to increase efficiency and provide a holistic understanding of both topics.

Participants

Participants in the evaluation of Zoology One included 71 kindergarten teachers in 21 schools and their students (n = 1,589). None of the teachers in the study had previously implemented the Zoology One program. The RCT was implemented in two cohorts of schools. Cohort 1 included 12 schools (with 40 kindergarten classrooms) during the 2016–2017 school year. Cohort 2 included 9 additional schools (with 31 kindergarten classrooms) during the 2017–2018 school year. At the start of each study year, the research team randomly assigned classrooms to treatment and control, within school. We conducted analyses using both an intent-to-treat (ITT) analytic sample and a treatment-on-treated (TOT) analytic sample. Table 1 summarizes the size of ITT and TOT samples by group and cohort.

Table 1. ITT and TOT student samples and attrition.

The ITT sample included students who were rostered prior to random assignment of classrooms to treatment condition, and who were assessed in the fall and spring of kindergarten. The TOT sample included all students in the ITT sample and any other student with complete pre- and post-assessment data. We identified students who enrolled or were rostered after assignment as early joiners. Table 2 presents baseline demographic attributes and fall assessment data, all of which were equivalent in both ITT and TOT samples.

Table 2. Baseline comparisons by sample and subgroup.
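The ITT and TOT sample definitions given above can be written as two simple membership rules. This is a sketch only: the `Student` record and its field names are hypothetical and do not describe the study's actual data files.

```python
from dataclasses import dataclass


@dataclass
class Student:
    student_id: str
    rostered_before_assignment: bool  # on the roster when classrooms were randomized
    has_fall_score: bool              # assessed in the fall (pretest)
    has_spring_score: bool            # assessed in the spring (posttest)


def in_itt_sample(s):
    # ITT: rostered before random assignment and assessed in both fall and spring.
    return s.rostered_before_assignment and s.has_fall_score and s.has_spring_score


def in_tot_sample(s):
    # TOT: every ITT student, plus any other student (for example, a joiner)
    # with complete pre and post assessment data.
    return in_itt_sample(s) or (s.has_fall_score and s.has_spring_score)


# Hypothetical students for illustration.
students = [
    Student("s1", True, True, True),    # in ITT and TOT
    Student("s2", False, True, True),   # joiner with complete data: TOT only
    Student("s3", True, True, False),   # missing spring score: in neither sample
]
itt_ids = [s.student_id for s in students if in_itt_sample(s)]
tot_ids = [s.student_id for s in students if in_tot_sample(s)]
```

Under these rules the ITT sample is always a subset of the TOT sample.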

We tested for differences between study groups at baseline using a model specification similar to the one used for estimating impacts (see Analysis section). This analysis revealed no significant differences between groups at baseline based on scores on the Reading Readiness cluster of the WRMT, t(772) = 0.09, p = 0.925; DRA scores, t(746) = 0.68, p = 0.495; ELL status, t(666) = −0.50, p = 0.620; Female, t(730) = 0.82, p = 0.414; Free from Tape (FFT) status (a free/reduced-price lunch indicator), t(730) = 1.30, p = 0.195; IEP status, t(666) = 1.73, p = 0.085; and Non-English home language status, t(730) = −1.28, p = 0.202.

Student Outcome and Teacher Implementation Measures

We administered the Reading Readiness cluster of the WRMT to each cohort in the fall as a baseline assessment, and administered the Passage Comprehension, Word Attack, and Word Identification subtests in the spring as measures of reading comprehension and decoding. The WRMT was individually administered by trained, monitored assessors. To further investigate impacts on reading, we obtained secondary data from SDP’s district-wide literacy assessments. SDP classroom teachers in both treatment and control groups collected data in fall and spring each year using DRA and AIMSWeb, both widely used classroom-administered measures of reading achievement. The AIMSWeb probe assesses letter naming fluency, a known predictor of future reading achievement (Leppänen et al., 2008; Stage et al., 2001). The research team also administered the KTEA-3, an individually administered assessment, for the analysis of impacts on writing.

Although the intervention’s theory of change does not posit impacts on math from Zoology One, we conducted a math assessment in Cohort 1 to examine the possibility that the expanded focus on literacy and science in the treatment classrooms might negatively impact math achievement. The measure for the Math outcome was the Kaufman Test of Educational Achievement (KTEA-3), which we administered to a random sample of 359 Cohort 1 students.

At the time of this study, there were few appropriate science assessments for kindergarten, and we were not able to identify an existing assessment that was comprehensive, accessible to pre-readers, and feasible to administer. As a result, we assessed science outcomes for this study using an instrument designed by the research team, in collaboration with advisors with expertise in both science and assessment development. In accordance with the Standards for Educational and Psychological Testing (2014), we developed, piloted, and selected items for this assessment over multiple rounds. The final instrument included 21 multiple choice items spanning all Next Generation Science Standards (NGSS) in Life Science from kindergarten through fifth grade, with 2–5 items for each standard. Along with items designed to assess life sciences content knowledge, the assessment included items that cover the science and engineering processes outlined in NGSS (such as using diagrams and graphs). Trained assessors administered the assessment individually to kindergarten students via Qualtrics using a touch screen and pictures. In order to eliminate confounds with students’ English language and/or literacy proficiency, the assessment did not require students to read or speak.

The theory of change guiding the study posits that Zoology One’s engaging, animal-themed texts will result in increased motivation to read. To assess this hypothesis, researchers designed and validated a new measure, the Kindergarten Reading Motivation Scale (KRMS). Trained assessors individually administered the KRMS to 878 treatment and control students in the spring of 2017 as a measure of motivation to read. The measure includes 19 items probing students’ feelings about reading (e.g., “Do you like to read?” “Can you learn new things from books?” “Do you like to look at books by yourself?”).

We collected data from all classrooms in our sample to examine how treatment classrooms delivered the curriculum in terms of resource use and implementation fidelity and to identify the contrast between treatment instruction and business-as-usual instruction. Data sources included teacher surveys, daily activity logs, interviews, and school district documents outlining the scope and sequence for literacy and required instructional activities. We administered online surveys to teachers in the treatment and control conditions in the spring. Thirty-six of 37 treatment teachers and 32 of 34 control teachers completed the survey. Teachers responded on a range of topics, including their comfort teaching science, the materials and curricula they used for instruction, the quantity and perceived quality of the coaching they received, and any supports or interventions provided to students in addition to the regular curriculum. To understand differences in science dosage, we also asked control teachers about the quantity of science instruction their students received. Treatment teachers were asked how much, if any, science instruction was provided over and above Zoology One. We also collected data about teachers’ allocation of planning and instructional time across content areas using a daily activity log. The logs asked teachers to record their activities throughout the day in 30-min increments on three randomly selected school days from late fall to early spring. We used teacher logs to explore implementation fidelity and to examine contrasts between treatment and control in teacher time and resource use. In addition, we invited teachers to participate in interviews in order to expand on the same topics from the survey and logs. Forty-nine teachers participated in the interviews (28 treatment and 21 control). The interview protocol was based on the program’s theory of change, theory regarding instructional program implementation, and theory regarding teachers’ implementation decisions.

Analyses

Impact (Research Question 1)

The impact of the Zoology One program after one school year of the intervention is based on an ITT analysis, where treatment status is determined at the time of randomization. Students who enrolled in a study school after random assignment to conditions were excluded from ITT analysis and treated as joiners for exploratory analysis of TOT impacts. A multilevel analysis of covariance was performed to estimate single-year treatment effects for student i, in classroom j, and school k, for this multi-site cluster randomized trial. The modeling approach allows for variation in treatment effects across schools (Raudenbush & Bryk, 2002). We treat classrooms and schools as random effects with school as the site block, producing four sources of variability: within-classroom variance (i.e., Level-1 student residual); between-classroom, within-school variance (i.e., Level-2 classroom intercept); between-school variance (i.e., Level-3 school intercept); and between-school variance in program effects (i.e., Level-3 school treatment effects). To estimate the average treatment effect across sites, a mixed-effect model was used in which the outcome of interest is a function of a fixed student effect associated with fall pretest scores (Level 1), and a fixed classroom effect associated with treatment status (Level 2). The three-level model used for estimating all impacts is presented in Equations (1)–(3).

$$Y_{ijk} = \pi_{0jk} + \beta_{1jk} X_{ijk} + e_{ijk}, \qquad e_{ijk} \sim N(0, \sigma^2) \tag{1}$$

$$\pi_{0jk} = \beta_{00k} + \beta_{02k} T_{jk} + r_{0jk}, \qquad r_{0jk} \sim N(0, \tau_{\pi}) \tag{2}$$

$$\begin{aligned} \beta_{00k} &= \gamma_{000} + u_{00k}, & \operatorname{var}(u_{00k}) &= \tau_{\beta_{00}} \\ \beta_{02k} &= \gamma_{010} + u_{01k}, & \operatorname{var}(u_{01k}) &= \tau_{\beta_{01}} \end{aligned} \tag{3}$$

Equation (1) is the person-level model, where

  • $\pi_{0jk}$ is the mean for classroom j in school k;

  • $\beta_{1jk}$ is the pretest effect for student i in classroom j in school k;

  • $X_{ijk}$ is the individual Reading Readiness cluster score at baseline;

  • $e_{ijk}$ is the error associated with each student; and

  • $\sigma^2$ is the within-classroom variance (the Level-1 student residual).

Equation (2) is the classroom-level model, where

  • $\beta_{00k}$ is the mean for school k;

  • $\beta_{02k}$ is the treatment effect at school k;

  • $T_{jk}$ is a treatment contrast indicator;

  • $r_{0jk}$ is the random effect associated with each classroom; and

  • $\tau_{\pi}$ is the variance between classrooms within school.

Equation (3) is the school-level model, where

  • $\gamma_{000}$ is the grand mean;

  • $\gamma_{010}$ is the average treatment effect;

  • $u_{00k}$ is the random effect associated with each school;

  • $u_{01k}$ is the random effect associated with each school treatment effect;

  • $\tau_{\beta_{00}}$ is the variance between school means; and

  • $\tau_{\beta_{01}}$ is the variance between schools on treatment effects.

In this study, $\gamma_{000}$ indicates the average scores for students in the control group, and $\gamma_{010}$ is the main effect of treatment, both adjusted for student pretest scores. The error terms $u_{00k}$ and $u_{01k}$ allow these average effects to differ by school.
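To make the roles of the fixed and random terms easier to see, the three levels can be collapsed into a single equation by substituting the school-level model into the classroom-level model, and the result into the person-level model. This is only a restatement of the equations above, not an additional model; the fixed/random grouping labels are added here for illustration.

```latex
\[
\begin{aligned}
Y_{ijk} &= \pi_{0jk} + \beta_{1jk} X_{ijk} + e_{ijk} \\
        &= \left(\beta_{00k} + \beta_{02k} T_{jk} + r_{0jk}\right) + \beta_{1jk} X_{ijk} + e_{ijk} \\
        &= \underbrace{\gamma_{000} + \gamma_{010} T_{jk} + \beta_{1jk} X_{ijk}}_{\text{fixed part}}
         + \underbrace{u_{00k} + u_{01k} T_{jk} + r_{0jk} + e_{ijk}}_{\text{random part}}
\end{aligned}
\]
```

Assuming the treatment contrast indicator is coded 0 for control and 1 for treatment (the coding is not stated in the text), the expected score at a given pretest value is $\gamma_{000} + \beta_{1jk} X_{ijk}$ in a control classroom and $\gamma_{000} + \gamma_{010} + \beta_{1jk} X_{ijk}$ in a treatment classroom, so $\gamma_{010}$ is the pretest-adjusted mean difference between conditions.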

Heterogeneity and Fidelity (Research Question 2)

To explore heterogeneity of effects, we estimated treatment impacts separately for subgroups based on gender, English Language Learner (ELL) status, participation in district feeding program (FFT), IEP status, and home language (Research Question 2a). Finally, we computed fidelity scores for each teacher, grouped the teachers into quartiles by fidelity score, and tested for differences between the students of high-fidelity teachers and those of low-fidelity teachers. This test is exploratory in nature (Research Question 2b).
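The quartile grouping of teachers by fidelity score could be carried out along the following lines. The scores, the teacher labels, and the choice to contrast the top and bottom quartiles are assumptions for illustration; the text does not specify how ties or cut points were handled.

```python
import statistics


def assign_quartiles(scores):
    """Map each teacher to a fidelity quartile (1 = lowest, 4 = highest)."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, q2, q3 = statistics.quantiles(list(scores.values()), n=4)

    def quartile(score):
        if score <= q1:
            return 1
        if score <= q2:
            return 2
        if score <= q3:
            return 3
        return 4

    return {teacher: quartile(score) for teacher, score in scores.items()}


# Hypothetical fidelity scores (for example, number of core components used).
fidelity = {"t1": 4, "t2": 5, "t3": 7, "t4": 8, "t5": 9, "t6": 10, "t7": 12, "t8": 13}
quartiles = assign_quartiles(fidelity)

# Assumed contrast: students of top-quartile versus bottom-quartile teachers.
high_fidelity = sorted(t for t, q in quartiles.items() if q == 4)
low_fidelity = sorted(t for t, q in quartiles.items() if q == 1)
```

Because teachers were not randomly assigned to fidelity levels, any comparison built on these groups is descriptive rather than causal, which is consistent with the exploratory framing of Research Question 2b.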

Cost Analysis (Research Question 3)

We applied the ingredients method in a cost-effectiveness framework to estimate the cost of Zoology One relative to business-as-usual (Levin et al., 2018). We used the program’s design, theory of change, and theoretical treatment contrast to design the cost study so that the cost estimate would capture the achieved relative strength of the program in resource terms (Hulleman & Cordray, 2009; Weiss et al., 2014). We measured the resources (ingredients) allocated for literacy instruction in treatment and control classrooms to estimate the cost of all ingredients used to produce the impacts we observed. Following cost-effectiveness standards, we apply the economic definition of costs and estimate the cost of all resources used, regardless of who financed them. Accordingly, we describe the ingredients used, illustrate how the program changed literacy instruction for the treatment group in practice, estimate the cost per student to achieve any observed change, and describe the distribution of costs. We distinguish between the total cost to produce effects (which includes important inputs like parent/caregiver time for home reading and changes in other literacy curricula used in the treatment classrooms) and the purchase price to buy the curriculum.

First, we outlined, described, and quantified the ingredients related to literacy instruction in Zoology One and control classrooms, focusing data collection on those resources that were most likely to differ across conditions and drive a change in student learning (Levin, 1975; Levin et al., 2018). We collected data on personnel, training, materials, data management/software, and facilities components of each condition. Because we randomly assigned classrooms within schools, both conditions equally used facilities, transportation, food, and other resources related to schooling, so these inputs are not included in our analyses.

Second, we matched ingredients with standard average national prices to reflect market rates relevant for an efficacy RCT designed to inform the field. We obtained price data from the Department of Labor, ARC, and publicly available market prices for other literacy programs used in classrooms. Prices were adjusted for inflation to reflect 2018 US Dollars and amortized, when appropriate, to reflect the portion of an ingredient used during the year. For example, each classroom received a kit containing over 400 books. Most of the books will last longer than one year. We used teacher-reported data on the proportion of books that were lost or destroyed coupled with data from the Zoology One program records to estimate the frequency of book replacement. For the supplemental curricula, we assumed a life of 5 years and a classroom size of 22.5 students. We tested the sensitivity of our amortization assumptions by varying the years of available life of each resource to ensure that our findings are robust to these decisions.
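The amortization and sensitivity steps described above can be sketched as follows. This is a minimal illustration, assuming simple straight-line amortization; the text does not specify the exact annualization formula or any discount rate, so the study's actual method may differ. The 5-year life and the class size of 22.5 come from the text, while the purchase price and the alternative lifespans are hypothetical placeholders.

```python
def annual_cost_per_student(price, life_years, class_size):
    """Straight-line amortized cost per student per year (an assumed method)."""
    if life_years <= 0 or class_size <= 0:
        raise ValueError("life_years and class_size must be positive")
    return price / life_years / class_size


HYPOTHETICAL_PRICE = 2250.0  # placeholder classroom-level price, NOT from the study
CLASS_SIZE = 22.5            # classroom size assumed in the text
BASE_LIFE_YEARS = 5          # life of supplemental curricula assumed in the text

# Sensitivity check: vary the assumed useful life around the 5-year base case.
# The alternative lifespans (3 and 7 years) are illustrative choices.
sensitivity = {
    life: round(annual_cost_per_student(HYPOTHETICAL_PRICE, life, CLASS_SIZE), 2)
    for life in (3, BASE_LIFE_YEARS, 7)
}

for life, cost in sensitivity.items():
    print(f"life = {life} years -> ${cost:.2f} per student per year")
```

A shorter assumed life raises the annual cost per student and a longer one lowers it, which is the pattern a sensitivity test of this kind is meant to bound.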

Third, we calculated the total cost of Zoology One above and beyond business-as-usual. We examined the cost distribution focusing on costs borne by schools and parents/caregivers. Below, we present costs per student to correspond to the effectiveness estimates.

Implementation (Research Question 4)

Research Question 4 was addressed through a combination of mixed-methods analysis of implementation fidelity and qualitative analysis of teachers’ explanations for how and why their implementation of Zoology One varied. We used data from the teacher surveys and daily activity logs to assess fidelity to the Zoology One program. The fidelity framework was developed at the outset of the study in collaboration with Zoology One’s developer. In the analysis, we measured the extent to which teachers used 13 core components of the Zoology One curriculum. These components are identified in the program’s logic model and pertain to the domains of resources and materials, training and coaching, and instruction.

For five of the 13 core components, we measured teachers’ implementation fidelity via items on the teacher survey that asked about the consistency with which they implemented activities (e.g., “Students select and exchange books daily,” “Students engage with science content daily via reading, writing, and hands-on science-themed activities”). We assigned one point if the teacher met the specified fidelity metric for the component (e.g., reported using the component “always or almost always”), and 0 points if not. For nine components, we used data from daily activity logs to assess implementation fidelity. For each time period when they were implementing Zoology One, teachers were asked to indicate the components that they used. For each day, one point was assigned when the teacher indicated implementing the component at any time throughout the day. We averaged points for each component across the days for which logs were completed. Finally, the total fidelity score was computed by averaging the points assigned for each of the 13 components, with total possible scores ranging from 0 to 1.
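The scoring rule described above can be sketched in a few lines. This is an illustrative reading of the procedure, not the study's code: the component names and the data for the example teacher are hypothetical, and only three components are shown for brevity, whereas the study scored 13.

```python
from statistics import mean


def survey_component_score(met_metric):
    """Survey-based component: 1 point if the fidelity metric was met, else 0."""
    return 1.0 if met_metric else 0.0


def log_component_score(daily_flags):
    """Log-based component: share of logged days on which the component was used."""
    if not daily_flags:
        raise ValueError("no completed logs for this component")
    return mean([1.0 if used else 0.0 for used in daily_flags])


def total_fidelity(component_scores):
    """Total fidelity: unweighted mean of component scores, bounded by 0 and 1."""
    return mean(list(component_scores.values()))


# Hypothetical teacher; names and values are placeholders, not study data.
scores = {
    "books_exchanged_daily": survey_component_score(True),
    "read_aloud": log_component_score([True, True, False, True]),
    "home_reading_assigned": log_component_score([False, True, False, False]),
}
fidelity = total_fidelity(scores)
print(f"total fidelity = {fidelity:.2f}")
```

Because every component is scaled to the 0 to 1 range before averaging, survey-based and log-based components carry equal weight in the total.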

To understand the reasons for variation in teachers’ implementation of Zoology One, we analyzed data from interviews with 12 treatment teachers in Cohort 1 and 16 treatment teachers in Cohort 2, representing a total of 21 schools. We developed and used codes that emerged deductively from the Zoology One logic model and inductively from the interviews. We applied the codes to randomly selected transcripts, then discussed discrepancies to arrive at common understandings until 80% reliability was reached. To enhance the validity of our findings regarding factors contributing to variation in teachers’ implementation of Zoology One, the research team applied three analytic strategies to the data coded with the “implementation factors” main code. First, we counted how many transcripts included a particular sub-code of the “implementation factors” main code (e.g., “coach”) at least one time. This was a way to determine the prevalence of that sub-code within the overall sample. Next, we counted sub-codes within individual transcripts, and tallied the number of transcripts within which a given sub-code appeared most frequently. Last, we looked at which sub-codes emerged most often across all transcripts. After applying all three analytic methods to the data coded “implementation factors,” we ranked the sub-codes by frequency across all three methods to derive our key themes.
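The three counting strategies can be sketched as below. The transcript identifiers, sub-code labels, and code applications are hypothetical, and the sketch stops at the three tallies because the text does not specify how the rankings were combined or how ties were handled.

```python
from collections import Counter

# Hypothetical coded data: transcript id -> sub-codes applied within that transcript.
transcripts = {
    "T01": ["student_needs", "time", "student_needs", "coach"],
    "T02": ["time", "time", "support_staff"],
    "T03": ["student_needs", "beliefs", "student_needs", "time"],
}

# Strategy 1: number of transcripts in which each sub-code appears at least once.
prevalence = Counter(
    code for codes in transcripts.values() for code in set(codes)
)

# Strategy 2: number of transcripts in which each sub-code is the most frequent one.
# (Ties would need an explicit rule; the example data contain none.)
top_in_transcript = Counter(
    Counter(codes).most_common(1)[0][0] for codes in transcripts.values()
)

# Strategy 3: total applications of each sub-code across all transcripts.
overall = Counter(code for codes in transcripts.values() for code in codes)

for label, tally in [
    ("appears at least once", prevalence),
    ("most frequent within transcript", top_in_transcript),
    ("total applications", overall),
]:
    print(label, tally.most_common())
```

Comparing the three tallies side by side shows whether a sub-code is widespread, dominant within individual interviews, or simply mentioned often by a few teachers.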

Findings

Investigation of all differences between treatment and control conditions is important for interpreting the results of experimental research. Both treatment and control classrooms used a balanced literacy instructional approach designed to engage students with the Kindergarten Common Core Standards for Reading, Writing, and Speaking and Listening across a range of instructional settings. In both conditions, students experienced a combination of direct instruction, teacher read-alouds, shared reading, small-group instruction, and independent practice in both reading and writing.

In terms of science, our expectation at the outset of the study was that students in the treatment group would receive only the science instruction embedded in Zoology One, and that students in the control group would receive little science instruction at all. We observed, however, that students in both groups received more direct science instruction than we anticipated, and that total minutes of daily, direct instruction in science were similar across the two groups. In the control group, 41% of teachers reported that their classes received science as an enrichment class, taught by a specialized teacher. Fifty-three percent of control-group teachers reported that they taught science to their own students, either in addition to or instead of the enrichment class. In all, control teachers reported that their students received, on average, 28 min of science instruction per day.

Treatment teachers were asked not to teach science outside the Zoology One block, and most (89%) reported that they complied with this request. However, treatment teachers reported that their classes received enrichment science at a similar rate to control classes (36%). Combined with the explicit science instruction provided as part of Zoology One, we estimate that this resulted in comparable total minutes per day of explicit science instruction for both groups.

Furthermore, we found that the teachers themselves were similar, on average. We observed no differences between treatment and control teachers based on years teaching, years teaching kindergarten, and years teaching at current school. There were no significant differences between groups in the number of teachers who identified as certified reading specialists.

In terms of differences, we found that control teachers used more packaged curricular programs and interventions than Zoology One teachers. Most control teachers delivered direct instruction in phonics using a commercially available whole-class program, Saxon Phonics, while most treatment teachers delivered phonics instruction within Zoology One’s instructional components, and often in small-group or individual settings. Following the Zoology One curriculum, treatment teachers used a formative assessment framework to guide and target instruction during small-group work and individual conferences, and they used formative assessment data to select texts each day for read-alouds and shared reading activities. Teachers in the control group more often used basal readers and/or textbooks; instruction was paced in accordance with an established scope and sequence rather than students’ progress data. Two other key differences were noted: First, Zoology One emphasizes home reading and teachers are instructed to send books home with students each day to support this component. This was not a widely observed practice in the control classrooms. Second, while Zoology One embeds science instruction within the instructional components of the literacy block, the business-as-usual literacy program did not. Thus, in addition to explicit science instruction, Zoology One students also had up to 120 min per day of sustained immersion in science content through the program’s science-themed teacher read-alouds, student texts, and writing activities.

Research Question 1: Do Students in Kindergarten Classrooms Using an Integrated Science and Literacy Curriculum Outperform Students in Business-as-Usual Control Classrooms?

Table 3 presents results of baseline equivalency tests on the WRMT Reading Readiness Cluster (used as the pretest measure in impact models). The following four columns present impacts on decoding and comprehension via the Word Identification, Word Attack, Word Comprehension, and Passage Comprehension subtests of the WRMT. Students in the treatment group scored significantly higher on one outcome, Passage Comprehension (b = 1.90, t(771) = 2.11, p = 0.035). Passage Comprehension had an SD of 11.86 in the control group, producing an ITT Glass’s Delta effect size of 0.16. We find no differences between the treatment and control groups on the Word Attack, Word Identification, or Word Comprehension WRMT subtests.

Table 3. ITT and TOT impact analysis results with baseline equivalence findings.

We also observed significant differences on two other outcomes, letter-naming fluency and motivation to read. In the ITT sample, letter-naming fluency (b = 8.35, t(459) = 2.02, p = 0.044) had an SD of 29.37 in the control group, producing an ITT Glass’s Delta effect size of 0.28. Analysis of data from the KRMS revealed that Zoology One students scored statistically significantly higher than control students in reading motivation (b = 0.11, t(716) = 4.58, p < 0.0001), producing an ITT Glass’s Delta effect size of 0.32 SD. We find no differences between the treatment and control groups on science, as measured by our researcher-developed assessment; on the general reading outcome measured by the DRA; or on the writing outcome.
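The reported effect sizes can be reproduced from the coefficients and control-group standard deviations given in the text. The sketch below assumes the usual definition of Glass's Delta (the treatment-control difference divided by the control-group SD); the function name is illustrative.

```python
def glass_delta(b, sd_control):
    """Glass's Delta: treatment-control difference divided by the control-group SD."""
    if sd_control <= 0:
        raise ValueError("sd_control must be positive")
    return b / sd_control


# Coefficients and control-group SDs as reported in the text.
passage_comprehension = glass_delta(1.90, 11.86)
letter_naming_fluency = glass_delta(8.35, 29.37)

print(f"Passage Comprehension: {passage_comprehension:.2f}")
print(f"Letter-naming fluency: {letter_naming_fluency:.2f}")
```

The control-group SD for reading motivation is not reported in the text, so that effect size is not recomputed here.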

Research Question 2a: Do Literacy Treatment Effects Persist within Subgroups Based on Gender, Language and Home Language Status, IEP Status, or Lunch Assistance?

Exploratory analyses of treatment effects within salient student groups are presented as fully standardized effect sizes for ITT and TOT in Table 4. Results demonstrate that the overall treatment effects, both significant and non-significant, bore out in most subgroups.

Table 4. Standardized treatment effects for student samples and subsamples.

Standardized effects across all student subgroups suggest that the reported average treatment effect is robust and generalizable. Notably, estimated impacts for boys’ reading comprehension were somewhat larger than those for girls’; impacts for native English speakers were larger than those for English Language Learners; and impacts for students who do not qualify for free or reduced-price lunch were larger than those for students who do qualify.

Research Question 2b: Do Literacy Scores in Treatment Classrooms Vary Based on Teachers’ Fidelity of Implementation?

An exploratory analysis compared literacy impacts for students of high-fidelity implementers with those of low-fidelity implementers. This contrast revealed statistically significant differences between top-quartile and bottom-quartile implementers on WRMT Word Attack and Word Identification subtests and on the KTEA assessment of writing (Table 5).

Table 5. Effect of teacher fidelity on student mediators and outcomes.

These findings suggest that the intervention is effective on some outcomes even with lower fidelity, while other outcomes require faithful implementation. Because teachers were not randomly assigned to fidelity condition, teachers with high fidelity may differ from teachers with low fidelity in ways that relate to student achievement.

Research Question 3: What Are the Relative Costs Associated with the Curriculum?

We observed that Zoology One was largely implemented as designed, in terms of resources. We found that home reading was not fully achieved, with students averaging 75 min per week versus the recommended 150. Zoology One teachers, on average, reported using the curriculum for more minutes per day than required (170 vs. 120 min).

Zoology One is designed to be a comprehensive, balanced literacy curriculum, reducing the need for additional curricula to teach literacy. Thus, to estimate costs that correspond to effects, we also considered changes in other curricula that contributed to literacy development (Bowden et al., 2017). As noted earlier, we observed that teachers implementing Zoology One used fewer curricular programs in their classrooms overall. On average, the treatment classrooms used Zoology One and three additional curricular supports. Control classrooms used six curricula on average. Three programs prominently used in control classrooms but not in treatment classrooms were Lexia, Saxon Phonics, and ReadyGen. While estimating the total costs of these programs was outside the scope of this work, we use purchase prices as minimum values in our analysis to adjust for this reduction in other curricula. We find that the elimination of three programs in Zoology One classrooms creates important cost savings for schools, averaging around $40 in savings per student or $900 saved per classroom per year relative to the business-as-usual programming in control classrooms (Table 6).

Table 6. Differences in instructional program between treatment and control.

As shown in Table 7, the incremental cost of Zoology One is about $480 per student on average. This cost reflects the differences in resources received by students between experimental conditions, and thus the cost to produce effects. This means that the cost estimate can be combined with the effectiveness estimate in a cost-effectiveness ratio to compare the efficiency of this curriculum to other whole-class kindergarten literacy curricula. At the time of this study, there were no comparable curricula listed in the What Works Clearinghouse.

Table 7. Average incremental cost per student of Zoology One.

As stated above, the cost to produce effects is not equivalent to the purchase price. For example, when a school purchases coaching support, the school pays a fee for coaches from ARC to serve all teachers delivering the program in the school. In our study, the program was delivered to roughly half of the kindergarten students in each school in our sample. This means that the cost of coaching in our study is being divided by fewer students than a typical implementation. Also, we adjust the price of the coaching to reflect the investment in teaching capital, which lasts for longer than one year. For the purposes of our evaluation, the cost estimate must correspond to the effectiveness estimate to reflect the cost to produce the impact rather than an idiosyncratic purchase price, which does not reflect the true value of this investment.

When we examine the distribution of costs, we find that one-third of the total cost is driven by parent/caregiver time allocated through the home reading component. To estimate the portion of costs borne by the school, we sum the components of the curriculum that are typically purchased by schools (in the evaluation, these were funded by a research grant) and any resources purchased or reallocated by the school to deliver the curriculum. Because the treatment replaced existing curricula, there were very few additional or reallocated costs. We found that the cost to the school for implementing Zoology One is valued at approximately $320 per student. Based on this estimate, the school or district bears 67% of the total cost to deliver the program.
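The cost shares above follow directly from the two per-student figures in the text. The sketch below uses the approximate reported values ($480 total, $320 borne by the school); the variable names are illustrative, and the remainder is treated as the non-school portion, which the text attributes largely to parent/caregiver time.

```python
TOTAL_COST_PER_STUDENT = 480.0   # approximate incremental cost reported in the text
SCHOOL_COST_PER_STUDENT = 320.0  # approximate school-borne cost reported in the text

school_share = SCHOOL_COST_PER_STUDENT / TOTAL_COST_PER_STUDENT

# Remainder not borne by the school (largely parent/caregiver time, per the text).
non_school_cost = TOTAL_COST_PER_STUDENT - SCHOOL_COST_PER_STUDENT
non_school_share = non_school_cost / TOTAL_COST_PER_STUDENT

print(f"School share: {school_share:.0%}")
print(f"Non-school share: {non_school_share:.0%} (${non_school_cost:.0f} per student)")
```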

Research Question 4: How Was the Integrated Curriculum Implemented by Teachers in the Treatment Group and Why Did Teachers’ Implementation Vary?

Our analysis of fidelity of implementation revealed high levels of fidelity overall, with substantial variation. The total possible fidelity score ranged from 0 to 1, where 1 indicated that a teacher used all 13 components of Zoology One and 0 indicated that a teacher used 0 components. The mean teacher fidelity score was .74. There was substantial between-teacher variability in the total fidelity score (SD = .13; range [.42, 1]). The sample had a 25th percentile fidelity score of .62 and a 75th percentile fidelity score of .85.

There was also considerable variability in the mean fidelity score among the specific Zoology One components. Mean component scores ranged from .47 (students assigned independent reading at home; indicating that approximately half of the teachers in the sample implemented this component) to 1 (implementing the 4 units in succession; indicating that all teachers implemented this component).

Reasons Teachers Varied Implementation

Our analysis of qualitative data from interviews with 28 teachers in 21 schools revealed two primary factors impacting teachers’ implementation of Zoology One. These are (1) teachers’ assessments of their students’ needs and interests, and (2) constrained instructional time. Findings from our three parallel analyses indicate that teachers’ perceptions of their students’ interests and needs were the most influential factor driving variations in implementation. More specifically, 21 of the 28 teachers interviewed (75%) cited students’ needs and interests as a factor that influenced their implementation of Zoology One. Of these 21 teachers, 11 (52%) also identified students’ needs and interests more frequently than any other factor.

When teachers talked about how their students’ needs and interests influenced their use of Zoology One, it was generally in reference to decisions to a) vary the time allotted to particular activities, or b) supplement the program with other materials or instruction. For example, we found that teachers frequently varied the time allocated for independent reading depending on student engagement, or eliminated components on particular days in order to extend Zoology One activities in which the students were engaged. Teachers also talked about supplementing Zoology One based on their students’ needs. Most often, these teachers reported incorporating additional phonics instruction, perhaps instead of other program elements.

“Time” was the second most frequently applied code in the data set; 21 of the 28 teachers referenced time constraints or scheduling issues as a primary reason why they deviated from the Zoology One curriculum. Teachers who identified time as a factor spoke about the unpredictability of their daily schedules and changes to their schedules based on factors outside their control. These interruptions and schedule changes might limit their ability to do all they had planned for a particular day—for example, they might have planned for two read-alouds, in accordance with the Zoology One lesson plan, but ended up having time for only one.

Another factor we identified in the interview data was the availability of support staff to help with various aspects of Zoology One (e.g., guided reading, independent reading, writing). Without extra staff to manage centers, supervise activities, or meet with children individually or in small groups, some teachers reported being unable to implement the whole curriculum on a given day. This theme appeared in 20 of 28 interviews, and seven teachers (35%) cited it more often than any other factor. Finally, of the 28 teachers we interviewed, 15 (54%) mentioned their personal beliefs when discussing decisions about program implementation and five (33%) mentioned beliefs more than any other factor. Teacher beliefs influenced implementation in that teachers placed varying levels of priority on different literacy components (independent reading or direct phonics instruction, for example) and therefore implemented those components more or less consistently, or deviated from the Zoology One schedule in order to accommodate them.

Study Limitations

Several limitations to this research should be noted. First, we report impacts at the end of kindergarten only. Work is underway to examine participants’ literacy achievement in the years following this portion of the study. The findings of this longitudinal research will be critical to the interpretation and practical applicability of our findings. Similarly, our research into the impacts of Zoology One is limited by the fact that all teachers were in their first year of implementation of the program. Research indicates that teachers’ use of curricula improves over time (Ladd & Sorensen, 2017), suggesting that we might observe different results from teachers beyond their initial year using Zoology One.

Additionally, we were unable to identify a validated science assessment that could adequately capture Zoology One’s hypothesized impacts on science learning. As a result, our finding of no impacts on science comes with the caveat that our researcher-developed measure was not validated for purposes of measuring science achievement in kindergarten. The research team’s future work will assess impacts on this study’s participants’ 4th-grade state assessments in science.

We assessed writing using a standardized assessment, the KTEA, which we selected in large part for the feasibility of administration and scoring with our large sample of young children. However, this assessment focused on writing mechanics, and scores from our study indicate that kindergarten students in SDP—both treatment and control—significantly underperform the KTEA norming sample in this area, introducing the threat of floor effects. In our teacher interviews, many Zoology One teachers reported observing an increase in students’ ability to compose complete and focused sentences and paragraphs as compared with their prior classes of kindergarteners, as well as increased stamina for writing. The KTEA instrument was not designed to detect these changes. As a result, some impacts on student writing may have gone undetected.

Discussion

Despite these limitations, the findings presented here provide important insights and highlight promising directions for future research. Our causal impact findings in reading are mixed; we observed meaningful impacts from Zoology One on reading comprehension and letter-naming fluency, but treatment students performed no better or worse than control students in two other key areas of interest, writing and decoding. When juxtaposed with the resource and cost findings, however, the nuance of this statement is meaningful: Students in the treatment group performed no better and no worse in decoding than those in the control group, despite the latter group having participated in a daily, intensive program that teaches phonics in isolation in addition to their other literacy curriculum. Indeed, treatment students, whose teachers overall reported using three fewer programs and interventions on a daily basis, performed as well as or better in all areas of literacy achievement than peers in the control group.

Furthermore, our exploratory analysis revealed that students whose teachers implemented Zoology One with high fidelity performed statistically significantly better in writing and decoding, two areas where main impacts were not observed. Thus, we found either a significant treatment effect or significant group mean difference based on fidelity for every literacy outcome of interest. While our fidelity finding is not causal, it suggests that even a relatively low-fidelity implementation of Zoology One yields impacts in comprehension and letter-naming fluency and that high fidelity can yield additional impacts in decoding and writing.

Our exploratory analysis of impacts on science yielded null findings. This result surprised us given the immersive science focus of Zoology One. However, our conclusion that both groups received similar quantities of direct instruction in science most likely explains this finding. A question lingers as to why the daily immersion in science-themed reading and writing activities throughout the Zoology One literacy block did not appear to accelerate treatment students’ acquisition of science knowledge. This question warrants further exploration with a validated science assessment. Additionally, our ongoing investigation of longitudinal impacts on the science achievement of students in our study may yield further insights. Based on our current findings alone, however, it appears that Zoology One is best viewed as a literacy program, rather than a science intervention.

This study detected notable effects on students’ motivation to read. These effects speak to the overarching question of this study: What benefits are realized when early literacy instruction is combined with science? Here, our findings tell a story that assessment scores alone cannot. Particularly in a context like SDP’s, where many children enter kindergarten with limited pre-reading experience (Hindman et al., 2016), motivation is a promising lever for accelerating reading experience and proficiency. We observed that Zoology One improved students’ motivation to read, with an educationally meaningful effect size of .32 SD. This finding suggests that Zoology One may be an effective way to activate this known pathway.

Further, our analysis revealed no gender differences in motivation to read. This finding is interesting in multiple regards. First, research points to higher motivation to read among girls generally and suggests that boys’ motivation is more fragile, and more intertwined with reading skill, even in the early grades (Logan & Medford, 2011). The equally high levels of motivation to read among Zoology One girls and boys may indicate that the curriculum could bolster motivation among boys. This hypothesis is particularly intriguing given that we observed stronger effects from Zoology One for boys than for girls on passage comprehension. Viewed together, these two findings suggest that Zoology One bolsters both motivation and reading skill in an urban, largely minority population of boys. Given that this group has traditionally lagged in literacy, this is a potentially powerful finding. Our motivation-to-read findings are also interesting given that girls typically exhibit lower interest in science relative to boys; might early, immersive exposure that connects science with literacy counteract this phenomenon? Each of these avenues warrants further research, and additional work from this study on the impacts of Zoology One on girls’ reading preferences is forthcoming.

Acknowledgments

The authors wish to thank Katarina Suwak, Tesla DuBois, Wendy Castillo, Anushka Patel, Maurice Spillane, Viviana Rodriguez, Gwendolyn Lawson, Katie Pak, and Rebecca Davis for their contributions to this work. Key partners at the School District of Philadelphia include Tonya Wolford, Joy Lesnick, Kristyn Stewart, and Diane Castelbuono. Special thanks to Rebecca Maynard, Heather Lamke, and Jane Hileman at American Reading Company for their support of the project.

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant [R305A160109] to the University of Pennsylvania. The opinions expressed are those of the authors.

References