Computers in the Schools
Interdisciplinary Journal of Practice, Theory, and Applied Research
Volume 39, 2022 - Issue 1

Bolstering Middle School Students’ Component Reading Skills: An Evaluation of the Lexia® PowerUp Literacy® Blended Learning Program


Abstract

Unfortunately, far too many American adolescents are unable to read proficiently. The science of reading suggests explicit instruction in both word identification and language processing skills should bolster reading proficiency, but most commercial reading interventions for secondary students focus exclusively on the latter skill area. This study explored the effectiveness of the Lexia PowerUp Literacy program (PowerUp), a digital reading intervention that provides explicit instruction in word identification, grammar, and comprehension. A total of 122 sixth-grade students attending low-SES schools participated in this year-long study. Students who used PowerUp showed significantly greater gains on an assessment of word identification, syntactic processing, and basic reading comprehension skills compared to students using an alternative program that offered opportunities to apply comprehension strategies in the absence of explicit and skills-based instruction. Results demonstrate the value of instruction extending beyond comprehension strategies to incorporate the full complement of skills needed for reading proficiency.

Introduction

Achieving proficiency in reading should be an essential goal for middle school students. To understand and learn from secondary-level texts, adolescents must demonstrate efficient word identification and solid language processing skills (Torgesen et al., 2007). Students who are unable to read proficiently struggle in core subject areas (Schiefele et al., 2012) and ultimately experience lower high school graduation rates and weaker labor market outcomes (Fiester, 2013; Kutner et al., 2005). In the US, unfortunately, far too many middle school students are at risk for reading difficulties. According to standards set by the US Department of Education and Institute of Education Sciences, 67% of middle school students fail to meet proficient reading standards, with an alarming 81% of students from low socioeconomic status (SES) backgrounds unable to read proficiently (National Assessment of Educational Progress (NAEP), 2019). Clearly, this dire situation needs to be addressed. This study asks whether a commercially available digital reading program built on established reading and learning science can help strengthen low-SES middle school students’ component reading skills (i.e., skills underlying strong reading comprehension, including word identification, syntactic knowledge, and reading efficiency/fluency).

The program used in this study—Lexia PowerUp Literacy (PowerUp)—provides explicit instruction in word identification and language processing (Lexia Learning Systems, 2021). For this year-long study conducted in partnership with a low-SES school district, we ask whether PowerUp has a greater impact on component reading skills necessary for proficient reading in comparison to an alternative digital program that—like most instructional approaches used in middle school—focuses mainly on opportunities to practice reading and comprehension in the absence of explicit instruction.

Proficient reading in middle school

The theoretical framework behind PowerUp is the Simple View of Reading, introduced by Gough and his colleagues (Gough & Tunmer, 1986; Hoover & Gough, 1990). According to this theory, proficient reading is based on two skill areas: Proficient readers must draw on solid word identification skills coupled with language processing skills (used in understanding spoken language). Building strengths in these areas sets the foundation for proficient reading comprehension. Word identification develops through the ability to effectively use word attack strategies (i.e., breaking down unfamiliar words into segments and applying letter-sound mappings) to quickly and accurately identify words in print (Snow et al., 1998).

As word identification becomes increasingly efficient and fluent, more cognitive resources can be devoted to language processing skills tied to comprehension. The role of language processing in reading is multi-faceted. In addition to accessing vocabulary knowledge, the reader must utilize grammatical rules to process complex words and sentences (Carlisle, 2007; Nippold, 2017). Reading comprehension draws upon both word identification and grammatical skills, along with other elements of language processing including stored background knowledge, verbal reasoning, and an ability to employ comprehension strategies such as inference making (Perfetti et al., 2005).

In the context of the Simple View of Reading, non-proficient readers may show deficiencies in word identification, language processing, or both skill areas (Aaron et al., 2008). In a large-scale urban study, Hock et al. (2009) reported that 61% of at-risk students demonstrated low performance on tasks assessing both word-level and language processing skills. Other research suggests limitations in language processing skills may be the primary contributors to adolescent reading difficulties (Hogan et al., 2014). For instance, Foorman et al. (2012) found grammatical processing (which falls under the umbrella of language processing) accounts for over 70% of the variance in measures of reading performance for grades 4–10. In sum, this research suggests interventions providing instruction in both word identification and language processing may be necessary to promote proficient adolescent reading.

Reading interventions for middle school students

Despite an urgent need to build reading skills among middle school students, a recent review by Herrera et al. (2016) identified only 33 evaluations of reading programs for middle school students that met What Works Clearinghouse standards. Just 12 of these studies were reported to show positive or potentially positive effects. The dearth of positive outcomes is unsurprising given that most of the identified interventions focused primarily on comprehension strategies without explicitly addressing component reading skills like word identification or grammatical/syntactic processing.

Intensive interventions addressing multiple reading components might be particularly effective. In one of the reviewed studies reporting positive effects, Vaughn et al. (2011) provided intensive interventions to seventh- and eighth-grade students who failed to show adequate reading gains the previous school year. Interventions occurred in 50-minute daily sessions throughout the year. Unlike the narrow scope of most evaluations in Herrera et al.’s review, the interventions focused on passage comprehension, together with word identification, vocabulary, and fluency. The interventions were effective, producing higher reading comprehension scores than traditional instruction. In a follow-up study, Solís et al. (2015) provided intensive intervention to ninth-graders with very low reading scores. The researcher-led intervention occurred across eighty 90-minute sessions, targeting a suite of language processing skills. Solís et al. (2015) reported that their intervention led to significant benefits for treatment over control students, but only when comparisons were restricted to students who had adequate word identification skills. More recently, Solis et al. (2018) provided similarly intensive instruction to ninth-grade students with low reading comprehension scores. In this case, instruction focused on both word identification and language processing skills. They found significant improvements relative to control students on vocabulary measures and on one of two reading comprehension measures.

Overall, studies by Vaughn, Solís, and associates show that intensive multi-component interventions addressing both word identification and language processing skills can enhance reading skills among at-risk adolescent readers. It remains unclear, however, if such outcomes can be replicated in the absence of researcher-led intensive intervention—that is, whether classroom teachers could leverage commercial programs to achieve similarly positive results.

Current study

The current study explores whether a commercially available, adolescent-targeted digital program could effectively bolster students’ component reading skills. Our study focused on Lexia PowerUp Literacy (PowerUp), a blended learning program that aims to promote both word identification and language processing skills through a series of self-paced online activities, along with scripted lessons and resources for teachers (Lexia Learning Systems, 2021). Educational thought leaders laud the potential of blended learning and digital technology as a means of scaling effective instructional approaches (Staker & Horn, 2011).

Based on the Simple View of Reading, PowerUp delivers instruction across three complementary strands: Word Study, Grammar, and Comprehension. The Word Study strand is designed to enhance word identification skills through building word attack strategies used in processing words of varying morphological complexity. As Solís et al. (2015) noted, solid word identification skills are prerequisite for developing comprehension proficiency. The Grammar and Comprehension strands are more closely aligned with language processing skills. The Grammar strand addresses identifying parts of speech, recognizing various syntactic forms, and learning the roles of connective words in sentences. Success in applying rules of grammar is tied to reading performance in middle school students (e.g., Foorman et al., 2012). The Comprehension strand promotes additional language processing skills to help students better understand different types of passages, advance background knowledge, and employ verbal reasoning. These latter abilities are essential for students to achieve the ultimate goal in reading—to grasp the full meaning of text (Kintsch & Rawson, 2005).

PowerUp incorporates a host of design elements associated with effective instruction delivered via digital technology. These include individualized leveling of students, lessons presented with complementary visual and audio input, immediate feedback, scaffolding and hints when needed, and ample opportunities to practice skills (Hirsh-Pasek et al., 2015; Kim et al., 2017; Regan et al., 2014). The program also features elements to enhance student motivation such as timed tasks, polls, and winning “streaks” (Ryan et al., 2006; Ryan & Deci, 2000).

Two studies of PowerUp have been conducted so far, both of which found generally positive results, though neither fully addressed the impact of the program’s instructional approach. Hurwitz and colleagues (2019) conducted a correlational study using a beta version of the program, finding middle school students using PowerUp showed year-over-year growth on a state English Language Arts (ELA) assessment. More recently, Hurwitz and Macaruso (2021) reported that middle school students using PowerUp scored significantly higher on a test of general reading ability than controls receiving alternative paper-based instruction. Findings from both prior studies were based on broad measures of reading ability; thus, they do not provide insight into which specific reading skills are enhanced through PowerUp. Moreover, neither study compared PowerUp’s multi-component instructional approach with more typical comprehension-focused instruction. To further understand the effectiveness of PowerUp, we posed the following research question: How effective is PowerUp in promoting efficient and fluent word identification, syntactic processing, and basic reading comprehension skills for at-risk students compared to a more conventional digital program that primarily provides comprehension practice?

Materials and methods

Study design

Six supplemental literacy classes for sixth-grade students in one school district participated in a year-long evaluation. Two classes were in one middle school (School A) and four were in a second middle school (School B). Classes within each school were randomly assigned to either a treatment group that used PowerUp or a control group that used an alternative curriculum. There was one treatment and one control class in School A, and two treatment and two control classes in School B. Both classes in School A were instructed by the same teacher (Teacher A) and a paraprofessional aide, and all four classes in School B were instructed by the same teacher (Teacher B). During the study, the teachers were asked about the relative strengths of each curriculum; in general, they said they liked both programs equally. Students’ reading skills were assessed twice—as a fall pretest in September/October and a spring posttest in May/June.

Sample

This study took place in a mid-sized, low-SES school district located in the greater Boston metropolitan area. All schools in the district received school-wide Title I support. In the school year prior to the research study, only 42% of sixth-grade students across the district met or exceeded state English Language Arts (ELA) proficiency standards.

Supplemental literacy classes in this study were intended to provide students with extra time to work on reading skills. Because district-wide reading proficiency rates were so low, all non-advanced readers were eligible for these classes. The classes met two to three days per week. In addition, all students in the study were enrolled in general ELA classes with curricula aligned to state and Common Core State Standards.

At the beginning of the school year, the sample included 135 sixth-grade students. Nine students left their class or school before the end of the study. Another three either did not follow test directions, experienced extended absences, or did not assent to participate in testing and were excluded from analyses. Twelve students had raw scores or gain scores more than +3/−3 standard deviations away from the mean at one or both time points for a given reading measure; these data points were excluded from analyses. This resulted in one student being removed from the sample entirely (i.e., with outlier scores for all three reading measures). Analyses focused on the remaining 122 students with valid pre- and posttest scores for at least one measure.

The district classified 56% of students in the sample as “economically disadvantaged” based on whether the student’s family received food stamps, welfare, or Medicaid, or whether the student was in foster care. The majority of students were Hispanic (69%), followed by White (25%), Asian (4%), and Black (2%). Ten percent of the students were English Learners (ELs) and an additional 33% were former ELs. The most common native language for the full sample was Spanish (53%), followed by English (27%), Portuguese (8%), Arabic (6%), and Khmer (2%). The sample was fairly evenly split by gender (55% female).

Treatment classes

The curriculum used in the treatment classes was PowerUp (Lexia Learning Systems, 2021). As indicated earlier, PowerUp contains three complementary strands: Word Study, Grammar, and Comprehension. PowerUp’s student-facing online program contains 12 levels in Word Study, 7 levels in Grammar, and 16 levels in Comprehension. Each level becomes sequentially more challenging. Although the program is designed to allow students to choose which strand they will work on each session, in the present study both teachers typically assigned students a different strand to work on in each class session to ensure they were exposed to all three content areas.

When students begin PowerUp, they are given an online placement assessment to determine a starting level in each strand. Each strand delivers lessons via explicit instruction, which includes segmenting lessons into manageable units, modeling strategies, and providing individualized scaffolding support when needed (Hughes et al., 2017). If students struggle greatly in a unit, PowerUp recommends an offline Lexia Lesson® (Lesson) for teachers to administer to help students master challenging skills. After students complete each PowerUp level, the program offers paper-based Lexia Skill Builder® worksheets (Skill Builder) to provide additional practice and allow students to transfer skills to offline activities.

Teachers can access PowerUp online usage and progress data for each student through a dashboard called myLexia®. This tool provides information such as the amount of time spent working in each strand per week. Also provided are recommended Lessons and Skill Builders for each strand. Teachers were asked to check myLexia at least once per week.

To support program implementation, teachers received an Implementation Success Partnership (ISP) led by Lexia training staff. Teachers and administrators from the school buildings attended a kickoff event where they were introduced to PowerUp, and they received coaching on implementing the program and interpreting its data during the school year.

Control classes

Students in control classes used an alternate blended learning program. Like most reading interventions in middle school (Capodieci et al., 2020; Herrera et al., 2016), the program primarily served to provide practice employing comprehension strategies. It starts with a placement test to identify students’ reading ability. Each lesson requires students to read a version of an article leveled to be appropriately challenging to them. Students begin a lesson by taking a poll to express their opinions related to a topic in the article. As they read the article, the program defines select vocabulary words and offers study tools such as highlighting text and providing fields for students to identify key themes in the article. After finishing an article, students return to the initial poll to express their (revised) opinions. As a final assessment, students answer a set of vocabulary and reading comprehension questions. Based on their responses, the program determines if they are ready for a more complex article.

The teachers used different approaches to implement the program. Teacher B assigned the whole class the same article and utilized offline materials that accompanied the program, including teacher-led lessons and group discussions. In contrast, Teacher A allowed each student to select their own article to read, and students spent most of their time working independently in the online program without leveraging the program’s offline materials.

Measures

Reading skills

Students’ reading skills were assessed with three standardized tests. These tests assess word identification and language processing skills needed to support reading comprehension. Tests were chosen in part because they were brief and easy for teachers to administer. For each test, there was an initial training phase, followed by a 3-minute timed portion. Teachers gave the tests either to an entire class or to small groups of students over one or two days. Classes completed Form A of each test in the fall (pretest) and Form C in the spring (posttest).

Test of silent word reading fluency

The Test of Silent Word Reading Fluency, Second Edition [TOSWRF2] (Mather et al., 2014) is used to measure efficient and fluent word identification skills. This test was normed with 2,439 students and has alternative form reliability coefficients ranging from .83 to .92. The TOSWRF2 presents strings of unrelated words without spaces (e.g., strictdepthmuzzlefudgefickle), and students mark off as many distinct words as possible in 3 minutes (e.g., strict/depth/muzzle/fudge/fickle). Students earn one point for each word correctly identified for a maximum score of 220.

Test of silent reading efficiency and comprehension

The sixth-grade version of the Test of Silent Reading Efficiency and Comprehension [TOSREC] (Wagner et al., 2010) is used to assess efficient and fluent word identification skills coupled with basic reading comprehension skills. This test was normed with 3,523 students and has alternative form reliability coefficients ranging from .81 to .95. The TOSREC consists of a series of sentences to be marked true or false (e.g., “If you cannot hear you may need to wear goggles on your forehead.”) over the course of 3 minutes. Students earn one point for each correct answer for a maximum score of 60 points.

Test of silent contextual reading fluency

The Test of Silent Contextual Reading Fluency, Second Edition [TOSCRF2] (Hammill et al., 2014) is designed to assess multiple skills: efficient and fluent word identification, syntactic processing, and basic reading comprehension. This test was normed with 2,375 students and has alternative form reliability coefficients ranging from .82 to .89. The TOSCRF2 includes 17 passages, each containing a series of words presented without spaces. For example:

THESTRINGSONMUSICALINSTRUMENTSVIBRATETO
CREATESOUNDSFORMANYSTRINGEDINSTRUMENTS
THENNOTESAREFORMEDBYVARYINGTHELENGTHOFTHE
STRINGBEINGPLAYED

Students separate the string into distinct words that render a coherent reading of the passage. They use slashes to mark off as many words as possible in 3 minutes. For example:

THE/STRINGS/ON/MUSICAL/INSTRUMENTS/VIBRATE/TO
CREATE/SOUNDS/FOR/MANY/STRINGED/INSTRUMENTS
THEN/NOTES/ARE/FORMED/BY/VARYING/THE/LENGTH/OF/THE
STRING/BEING/PLAYED

Students earn one point for each word marked off correctly. They do not receive credit for marking words that do not fit the context of the passage (e.g., marking “form” and “any” instead of “for” and “many” in the above passage).
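To make this scoring rule concrete, the sketch below is a minimal toy illustration (ours, not the publisher’s scoring procedure; score_response is a hypothetical helper). A response earns one point per word that matches the keyed segmentation, so context-violating splits such as “form/any” earn no credit:

```python
# Toy scoring sketch for a segmentation task like the TOSCRF2 (hypothetical
# helper, not the publisher's algorithm). A response earns one point for
# each marked word that matches the keyed word sequence in order.
from difflib import SequenceMatcher

def score_response(response: str, key: str) -> int:
    """Count student-marked words that align with the keyed segmentation."""
    marked = response.split("/")
    keyed = key.split("/")
    # Align the two word sequences and total the words matched in order.
    matcher = SequenceMatcher(a=marked, b=keyed, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())

KEY = "FOR/MANY/STRINGED/INSTRUMENTS"
print(score_response("FOR/MANY/STRINGED/INSTRUMENTS", KEY))  # 4 points
print(score_response("FORM/ANY/STRINGED/INSTRUMENTS", KEY))  # 2 points
```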

Classroom observations

Observations of PowerUp use in the classroom took place during 8 sessions (3 in School A and 5 in School B) in the second half of the school year. Two observers were assigned to each session. Part of the observation was conducted in timed intervals and the rest was based on untimed observations. The timed observations occurred once every 15 minutes (2 or 3 times per session) and provided 1-minute snapshots of student and teacher behaviors. Observers scored the proportion of students during the focal minute who were (a) seated and facing the computer and (b) focused on and actively engaging with PowerUp using a 0 to 10 scale, where scores of 0 indicated “no students” and 10 “all students.” Observers also scored teachers’ degree of attentiveness to (a) students and (b) myLexia using a 0 to 10 scale, where 0 indicated “no attention” and 10 “very attentive.” An intraclass correlation coefficient (ICC) was used to assess inter-rater reliability (see Note 1) for these four items, equaling .69, .74, .77, and .77, respectively.

The untimed portion considered general features of implementation across the class period. Observers scored the presence or absence of the following teacher behaviors: (a) providing positive feedback to students; (b) referencing external incentives (e.g., earning “school money” for progress in the program); (c) circling the classroom; and (d) responding to content-related questions. Percent agreements (Cohen’s κ values) were 70% (.40), 90% (.78), 90% (.78), and 90% (.80) for these behaviors, respectively.

Observers also scored whether teachers used Lessons and Skill Builders (percent agreements were 80% and 100%, respectively; see Note 2). Observers noted time spent working on those materials and whether they were administered to individual students, small groups, or the whole class. There was 100% agreement for amount of time and 80% agreement for the group size.
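For readers unfamiliar with these reliability statistics, the sketch below shows how percent agreement, Cohen’s κ, and ICCs can be computed from two observers’ scores. The ratings are hypothetical, and this is an illustration rather than the study’s analysis code:

```python
# Hypothetical inter-rater reliability computations for two observers
# (illustrative only; the ratings below are made up).
import pandas as pd
import pingouin as pg  # provides intraclass_corr
from sklearn.metrics import cohen_kappa_score

# Presence/absence codes (1/0) for one teacher behavior across 10 sessions.
obs1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
obs2 = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]
agreement = sum(a == b for a, b in zip(obs1, obs2)) / len(obs1)
kappa = cohen_kappa_score(obs1, obs2)  # agreement corrected for chance
print(f"agreement = {agreement:.0%}, kappa = {kappa:.2f}")

# 0-10 attentiveness ratings for five timed snapshots, in long format.
ratings = pd.DataFrame({
    "snapshot": [1, 2, 3, 4, 5] * 2,
    "rater": ["obs1"] * 5 + ["obs2"] * 5,
    "score": [7, 8, 6, 9, 7, 8, 8, 5, 9, 6],
})
icc = pg.intraclass_corr(data=ratings, targets="snapshot",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC"]])  # ICC1/ICC2/ICC3 and their average-score forms
```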

PowerUp use

We obtained records of the number of weeks, number of days per week, and number of minutes per day students used the program. We also obtained records of instructional time per strand (Word Study, Grammar, Comprehension).

Analytic approach

The first section of the results provides descriptive statistics for fidelity of implementation. These include measures based on classroom observations and records of students’ use of PowerUp’s online component.

Gains on the three reading tests were analyzed using Analysis of Covariance (ANCOVA) models, with treatment group specified as a fixed factor. When checking that model assumptions were satisfied, we found that despite random assignment, the control group scored significantly higher than the treatment group on the TOSCRF2 at pretest, t(103) = 5.13, p < .001. We therefore used gain scores as the outcome measure in all models to produce unbiased estimates of the treatment effect (Jamieson, 2004). There were no other significant differences in baseline characteristics between the treatment and control groups (i.e., no differences on the other two pretests or differences in terms of gender, SES, EL status, or race distribution). We also discovered in preliminary analyses that assessment scores varied by gender. No other demographic variables were associated with assessment scores, and there were no significant interactions between condition assignment and any demographic variable. As such, we controlled for gender in our final models but did not include any other demographic variables or any interaction terms. Finally, we included school as a random factor in our models to account for the nested structure of the dataset and the possibilities that shared school environment or teacher influence may have contributed to results (despite random assignment), and that results may have varied had we sampled from different schools/teachers. Table 1 provides descriptive statistics—sample sizes as well as mean unadjusted pretest, posttest, and gain scores—for each test.

Table 1. Mean unadjusted scores (standard deviations) for pretest, posttest, and gains on each reading test for treatment and control students.
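A minimal sketch of this modeling approach, assuming a student-level data file with hypothetical column names ("condition", "gender", "school", "pretest", "posttest"); this is our reconstruction under the stated assumptions, not the authors’ analysis code:

```python
# Sketch of the analytic model described above: gain scores regressed on
# treatment condition and gender (fixed effects), with school as a random
# factor. File and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("reading_scores.csv")       # one row per student
df["gain"] = df["posttest"] - df["pretest"]  # gain score as the outcome

model = smf.mixedlm("gain ~ C(condition) + C(gender)",
                    data=df, groups=df["school"])  # random intercept: school
result = model.fit()
print(result.summary())
```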

Results

Implementation fidelity

Classroom observations

Student behaviors

For the item asking if students were seated and facing the computer, the average rating was 7.78 out of 10 (SD = 1.70). The average rating for the item asking if students were focused and actively engaging with PowerUp was 6.45 out of 10 (SD = 1.61).

Teacher behaviors

The average rating for how attentive teachers were to students was 6.70 out of 10 (SD = 2.58). This rating was higher for Teacher A (9.00, SD = 1.00) than Teacher B (5.69, SD = 2.41), t(21) = 3.47, p = .002. The average rating for how engaged teachers were with myLexia was 3.35 (SD = 3.88). In this case, Teacher B had a higher rating (5.00, SD = 3.89) than Teacher A (0.29, SD = .76), t(18) = −3.13, p = .006. These outcomes show that the teachers were either paying attention to the students, attending to data generated in the program, or sometimes both. Teachers provided positive feedback to students in 100% of the observed sessions. They also referenced external incentives, circled the classroom, and responded to content-related questions in 88%, 75%, and 63% of the sessions, respectively.

Offline materials

Teachers’ use of Lexia’s offline materials was limited. Teacher A used Lexia Skill Builders in all three sessions. Her method of delivery differed each time, ranging from whole class to small group to individual administration of these resources. Students worked on Skill Builders for a minimum of ten minutes. Teacher A was not observed using Lexia Lessons. In contrast, Teacher B delivered two Lessons in one session. These lessons were given in a one-on-one setting that lasted less than five minutes. She was not observed using Skill Builders.

PowerUp use

Students used PowerUp for an average of 24.38 weeks (SE = .47) over the school year. The mean number of days per week using the program was 1.71 (SE = .03) and the mean number of minutes per day was 28.03 (SE = .61). The number of hours spent working in each strand over the school year averaged 6.12 (SE = .41) for Word Study, 6.92 (SE = .30) for Grammar, and 6.42 (SE = .40) for Comprehension.

Reading gains

TOSWRF2

Results of the ANCOVA revealed no treatment effect, but there was a significant effect of gender (F(1,107) = 6.062, p = .015, partial η2 = .054). The adjusted mean gain score for females (15.08, SE = 2.84) was higher than for males (5.42, SE = 2.96). The fact that female students benefited more from instruction is consistent with the finding that persistent reading difficulties are more prevalent in male students (Rutter et al., 2004).

TOSREC

The ANCOVA showed no treatment effect, but again there was a significant effect of gender favoring female students (F(1,111) = 4.606, p = .034, partial η2 = .040). The adjusted mean gain score for females (2.76, SE = 1.30) was higher than for males (−1.15, SE = 1.45).

TOSCRF2

In this final analysis, the ANCOVA revealed a significant effect of treatment (F(1,101) = 12.311, p = .001, partial η2 = .109, Cohen’s d = 0.69). The treatment group showed an adjusted mean gain score of 26.66 (SE = 6.04) compared to a decline of −2.50 (SE = 6.07) for the control group. No other factors were significant in this model.
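As a quick arithmetic check (ours, not reported in the paper), the partial η2 and Cohen’s d values above are mutually consistent under the standard conversion for a two-group comparison, d = 2√(η²/(1 − η²)):

```python
# Consistency check (ours): convert partial eta squared to Cohen's d using
# the standard two-group relation d = 2 * sqrt(eta2 / (1 - eta2)).
import math

partial_eta_sq = 0.109
d = 2 * math.sqrt(partial_eta_sq / (1 - partial_eta_sq))
print(round(d, 2))  # ~0.70, in line with the reported d = 0.69
```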

Discussion

Given that so many middle school students are unable to read proficiently (National Assessment of Educational Progress (NAEP), 2019), this study asked whether the digital program PowerUp could be used to advance reading skills for middle school students in a low-SES school district. Based on the Simple View of Reading, PowerUp delivers explicit instruction in word identification and language processing skills. Notably, in contrast to the intensive, daily instruction employed in prior research (e.g., Vaughn et al., 2011), our study shows that this commercial program is feasible for teachers to implement with reasonable fidelity in the context of typical classroom instruction. Our main finding was that PowerUp led to significantly greater gains for the treatment group over the control group on the TOSCRF2—a test requiring word identification, syntactic processing, and basic reading comprehension skills. Because this test assesses three skill areas, whereas the other tests measure only one or two, we argue it was the most complex test utilized in this study, making students’ learning gains quite noteworthy. The obtained effect size on the TOSCRF2 (Cohen’s d = 0.69) is more than 2.5 times larger than estimates of effect sizes (.26) found on skill-based measures in prior middle school interventions (Lipsey et al., 2012). Because treatment and control classes were taught by the same teacher using digital programs with online and offline components, we believe the observed effect can be attributed to differences in the programs (explicit, multi-component instruction vs. comprehension practice) and not to potential confounding factors such as teaching style or novelty of technology.

The finding that the treatment group showed robust gains on the TOSCRF2 dovetails well with the skills addressed in PowerUp. Success on this task requires strong word identification, syntactic processing, and basic reading comprehension skills. These skills are essential to advance in reading proficiency (Nippold, 2017) and are, in fact, the areas addressed in PowerUp’s Word Study, Grammar, and Comprehension strands. This result aligns with other research showing that students demonstrate the strongest performance on tests that best reflect the contents of the intervention they complete (Hurwitz, 2019). Without explicit instruction in word identification or syntactic processing—skills utilized on the TOSCRF2—the control group did not demonstrate gains.

One could argue that the group discrepancy in gains on the TOSCRF2 may be due, in part, to the treatment group starting with lower pretest scores than the control group and therefore having more room to improve during the intervention. Though unusual, pretest differences of this sort can occur in the context of random assignment, especially when only a small number of schools or classes are involved (e.g., Hooper et al., 2013). Nonetheless, it should be noted that ceiling effects did not contribute to the absence of gains for control students. The pretest mean on the TOSCRF2 for the control group fell near the 75th percentile. There was certainly room for control students to show gains on this task.

Unlike the TOSCRF2, gains on the TOSWRF2 and TOSREC were similar across groups. These results are somewhat surprising given that PowerUp provided explicit instruction in skill areas covered by these tasks—word identification on the TOSWRF2 and word identification and basic reading comprehension on the TOSREC—while the control curriculum did not. The finding on the TOSREC is easier to interpret: The alternative curriculum allows students to practice comprehension strategies; thus, treatment and control students may have been equally prepared for the TOSREC. The finding on the TOSWRF2 is more surprising, as PowerUp offers explicit instruction in word identification skills, while the control program does not. PowerUp’s Word Study strand focuses on strategies for processing morphologically complex words. These advanced strategies, however, may have been underutilized on the TOSWRF2, which mainly uses mono-morphemic words. Thus, it appears that the wealth of reading opportunities offered in the alternative program was as beneficial on the TOSWRF2 as instruction in PowerUp. That is, control students may have gained some benefits in basic word identification implicitly through reading texts, even in the absence of explicit instruction.

Limitations and future research

One limitation of this study concerns the reading tasks we chose to use. Our aim was to extend the work of Hurwitz and associates (2019, 2021), who administered broad assessments of reading ability. In the present study, we decided to examine how use of PowerUp can impact component reading skills—word identification, syntactic processing, and basic reading comprehension—needed to become successful readers (e.g., Hogan et al., 2014; Nippold, 2017). We did not include a more global measure of reading ability. Thus, in the present study we lack data addressing how well use of PowerUp affected overall reading ability compared to the alternative program.

As another limitation, we cannot determine more precisely which elements of PowerUp made the greatest contributions to gains seen on the TOSCRF2. This test differs from the others in that it requires the greatest degree of syntactic processing. Thus, it is possible that instruction in the Grammar strand, which addresses syntactic processing, may have contributed to outcomes on the TOSCRF2 more than PowerUp’s other strands. Future research could use an assessment that solely focuses on grammar to disentangle whether treatment effects mainly were driven by instruction in the Grammar strand or whether the full complement of instruction made the difference.

In addition, we could have benefited from gathering more information about instruction in control classes. Without observational or system log file data, we lack key information about the control classes and thus cannot fully eliminate alternate explanations of our findings. It is possible, for instance, that teachers deviated from the control curriculum and provided explicit instruction in word identification (inspired by PowerUp or drawing from other resources), which could have led to lack of treatment effects on the TOSWRF2 and TOSREC.

As a final limitation, the district did not reserve the supplemental literacy classes for students with the greatest needs. This resulted in some relatively proficient readers being included in the sample, which may have contributed to the imbalance across conditions. Future research should apply stricter inclusion/exclusion criteria based on pretest scores to provide a more robust test of the program as an effective and feasible intervention for non-proficient readers.

Conclusions

To achieve academic success, middle school students must be able to effectively read complex materials, and deficiencies in word identification and/or language processing skills that impede proficient reading need to be addressed (Nippold, 2017). The current study shows PowerUp—which provides multi-component instruction—can be effective in supporting reading skill development in middle school students. It is encouraging that teachers can implement such a program with reasonable fidelity in their classrooms.

Acknowledgements

We extend our gratitude to the participating administrators, teachers and students. We thank Robert McCabe and Max Tuefferd for helping recruit the schools, Peggy Coyne, Doug Meyer and Sara Morin for providing ISP support, and Pamela E. Hook, Suzanne Carreker, and Melissa Feller for their contributions to study design and our theoretical understanding of PowerUp. We also acknowledge members of Lexia’s Research & Analytics and Curriculum & Assessment departments for conducting classroom observations, Ellen Macaruso, Sara Clark, and Christina Arlia for scoring tests and data entry, and Lisa Sullivan for editorial assistance.

Disclosure statement

In accordance with Taylor & Francis policy and our ethical obligation as researchers, we report that three authors are employed by, and one is a paid consultant to, a company (Lexia Learning Systems LLC, A Cambium Learning Group Company) that may benefit from the research reported in the enclosed paper. We have disclosed those interests fully to Taylor & Francis, and we have in place an appropriate plan for managing any potential conflicts arising from this involvement.

Notes

1 To have sufficient data to calculate reliability, the 8 observations conducted in the present study were combined with 2 additional observations that occurred at a nearby school participating in a different PowerUp evaluation.

2 Because teachers only used PowerUp’s offline materials in 4 of the observed classes, we are unable to meaningfully calculate Cohen’s κ values for these items.

References