
Replication is important for educational psychology: Recent developments and key issues


Abstract

Replication is a key activity in scientific endeavors. Yet explicit replications are rare in many fields, including education and psychology. In this article, we discuss the relevance and value of replication in educational psychology and analyze challenges regarding the role replications can and should play in research. These challenges include philosophical, methodological, professional, and utility concerns about replication in education and the social sciences more broadly. Finally, we discuss strategies that may address these concerns in educational psychology research.

Replication is the intentional repetition of previous research to confirm or disconfirm the previous results, serving as a de facto reliability check on previous research. Informing stakeholders about which results can be repeated—and in what circumstances—is the chief value that replications contribute to research and the public at large. Successful replication of research on positive outcomes associated with a reading intervention, for example, provides educators and policymakers with confidence that can justify the investment of scarce public resources in implementation of that intervention. Conversely, a sensational research result that cannot be replicated provides information to stakeholders that may prevent unnecessary resource and opportunity costs.

Replicability is therefore a cornerstone of the research endeavor in educational psychology. Replication tends to occur in one of two forms: direct or conceptual. In a direct replication, researchers follow the original study’s methods as closely as possible in an effort to arrive at similar results. The goal of a direct replication is not a thumbs-up/thumbs-down decision; rather, as Simons (2014) notes, “The end result is not a judgment of whether a single replication attempt succeeded or failed—it is a robust estimate of the size and reliability of the original finding” (p. 76). In contrast, the purpose of a conceptual replication is to examine the theoretical soundness of a particular finding or set of findings, with less focus on repeating exact methods from the original study. Conceptual replications purposefully alter factors such as participant demographics, operationalization of dependent variables, or study context (see Schmidt, 2009, Figure 1). Although there is considerable debate about the value of direct versus conceptual replication (see Simons, 2014; Stroebe & Strack, 2014)—a debate we explore in greater depth later in this article—these two categories are the most common and straightforward way of thinking about types of replication.

Regardless of form, replication is rare, perhaps far too rare. In a series of studies, we have found low replication rates in the published research bases of psychology, education, special education, gifted education, and criminology, ranging from 0.13% in education to 1.07% in psychology (Makel & Plucker, 2014b, 2015; Makel et al., 2012, 2016; Pridemore et al., 2018). Other researchers, using slightly different methods, have arrived at roughly similar rates (Lemons et al., 2016; McNeeley & Warner, 2015). Although we have not specifically examined the presence of replication studies within educational psychology, the field’s major journals were included in the education study (Makel & Plucker, 2014b), suggesting replication is uncommon in educational psychology. If replication is a foundational activity within the field yet rarely occurs, it is fair to question whether the field’s impact is being unnecessarily limited.

While exploring issues related to replication over the past decade, we often found ourselves in faculty meetings, conference sessions, and casual conversations with colleagues who asked questions about the idea of replication (philosophy of replication), strategies for conducting replication studies (methodology of replication), the professional feasibility of replication (professional implications of replication), and replication’s utility in the field of educational psychology (utility of replication). In this article, we attempt to summarize researchers’ current understandings in all four areas, with the caveat that researchers have not yet answered all of the concerns and questions about replication successfully. In each section, we provide summaries and analyses of the most recent thinking and research on replication, in addition to an examination of questions that have yet to be answered. Our hope is that this article equips readers with the perspective to form a deeper understanding of how replication should factor into their own work as well as the field more broadly.¹

Philosophy of replication

Replication has well-defined, epistemological purposes

Some educational psychologists have questioned the philosophical basis for replication. However, the rationale for replication research has strong epistemological foundations related to the nature of scientific knowledge. Indeed, Collins (1985) called replication the Supreme Court of science. Schmidt (2017) is more direct, noting that “… a single observation cannot be trusted,” and that “replication … is capable of transforming an observation into a fact, or piece of knowledge” (p. 236, emphasis in original). Several scholars have recently argued that replication plays an important role in theory building and theory assessment (Guest & Martin, 2021; Irvine, 2021; van Rooij & Baggio, 2021).

Perhaps the most cynical framing of the epistemological value of replication can be drawn from Planck’s (1949) observation that science advances one funeral at a time. In his 1949 autobiography, he wrote that “a new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die and a new generation grows up that is familiar with it” (p. 33). More to the point, he observed, “An important scientific innovation rarely makes its way by gradually winning over and converting its opponents” (p. 97), an idea reinforced by both philosophers of science (e.g. Kuhn, 2012) and empirical evidence (Azoulay et al., 2019).

But need this historical reality be destiny? Although Azoulay et al. also note the potential value of having eminent gatekeepers control the flow and dominance of ideas, especially within nascent fields, educational psychology is a mature field. Regular and planned replication is one tool that can help education science self-correct more quickly. Why relegate advances in educational psychology to the passage of time when we have open science tools at our disposal, replication first among them, that can help advance the field’s knowledge faster and more efficiently? Doing so will allow the field to be more transparent and democratic about what we know, what we do not know, and when the field is heading down theoretical rabbit holes that do not translate to practice.

The philosophical value of replications in educational psychology is underscored via a series of intentionally strawperson questions. First, to examine constraints on generality (Simons et al., 2017), one can ask, “If we conducted every study with undergrads, could we generalize those results to all students, of all ages, in all schools?” Second, to determine the need for some replication, one can ask, “Can we assume with a reasonable degree of confidence that we know when results would generalize across specific contexts, ages, and cultures, or do we need to collect data to be certain?” This establishes scenarios where replications add value. Third, to establish confidence in the reliability of existing research, one could ask, “Do you believe that the existing academic publication process is 100% error free?” This helps demonstrate the fallibility of published findings. A fourth question, to determine confidence in the field’s body of research on a specific topic, is: “Do you think having more confidence in how research on a given topic generalizes, in which settings and for which students, would help you make more effective decisions?”

Although we acknowledge that epistemological questions remain about replication, including how to interpret replications appropriately and decide when to use them (see, e.g., Gervais, 2021), these questions have evolved over the past generation from “Do replications have value?” to “When do we need replications, and how can we structure them to provide maximum value?” This is a non-trivial philosophical advance that helps the field provide greater value via real-world application.

Increased use of replication is needed

The field has provided value for many decades, so why is an enhanced focus on replication needed now? Calls for replication are not new, and problems in other, related fields suggest educational psychology could learn from errors in those fields rather than waiting to suffer from them itself. For example, concerns about replicability have been raised about the research bases of a broad range of fields, from economics to medicine to the life sciences (see Zwaan et al., 2018). Within the social sciences, psychology has long been considered to have a replication crisis (Pashler & Harris, 2012; Rosenthal, 1969; Schlosberg, 1951). This crisis has gained attention in recent years due to suspicions of research misconduct by several prominent researchers. Whether one believes this situation to be important or overblown, a good crisis should never be wasted. In this vein, Vazire (2018) has reframed the replicability crisis as a credibility revolution that embraces several methodological strategies, including conducting more replication research.

Concerns about credibility certainly apply to education research. The first author once testified before a state senate education committee and was surprised to hear senators sarcastically noting that researchers can make their studies say whatever they want. From an empirical perspective, Merk and Rosman (2019) found evidence that student-teachers held a “smart but evil” stereotype about education researchers, “as the authors of scientific studies … are perceived not only as less benevolent, with less integrity, but also as having more expertise in contrast to practitioners. This is an intriguing finding, as it suggests that student-teachers hold a kind of distrust in scientists” (p. 6). We do not believe such perceptions by relevant stakeholders help the scientific endeavor in educational psychology. An increase in credibility among stakeholders is one possible pathway toward gaining appreciation for the accuracy and usefulness of our efforts.

Most studies are not replications

Several scholars have argued that most psychological studies are de facto replication studies, given that they investigate similar theoretical constructs using different methods than earlier studies; in other words, most papers are conceptual replications (Smith et al., 2017; Stroebe & Strack, 2014). Chhin et al. (2018), in a study of IES-funded projects, found no direct replications but concluded that nearly half of funded applications involved conceptual replications.

We understand the temptation to consider most papers to be undeclared conceptual replications. But failing to label them as such (and to take advantage of open science strategies such as preregistration) disrupts the research process, making it harder for consumers of research to know what has and has not been replicated (Hodges, 2015; Reich, 2021/this issue).

For these and other reasons, Simons (2014) argued that direct replications are the only way to verify the reliability of results, a position that is attracting growing levels of support (e.g. Machery, 2019; Nosek & Errington, 2019). Hüffmeier et al. (2016) offer a more nuanced typology involving exact replications (direct replications conducted by the same researchers), close replications (also direct but conducted by different researchers), constructive replications (also direct but modified in a small number of ways to assess the robustness of the original effect), conceptual replications under lab conditions (conceptual attempts to test theory), and conceptual replications under field conditions (also conceptual, attempting to assess the robustness of a theoretical effect). An advantage of the Hüffmeier et al. approach² is that even if one assumes most empirical studies are conceptual replications, this typology stresses the importance of systematic and sequential replication approaches to the advancement of psychological knowledge, and that conceptual replication without earlier forms of supportive, direct replication does not add meaningfully to a field’s research base (see Zwaan et al., 2018, for a similar argument). Similarly, Irvine (2021) argued that even the best conceptual replications have a low capacity for theoretical payoff in most circumstances.

The importance of direct replication in no way implies that conceptual replications lack value, as they help assess generalizability and establish boundary conditions for empirical effects. Comparing the relative merit of each type of replication should not be about which is better or more important; rather, the question is which form of replication is most useful for a given effect at a given time. If a finding is wholly novel, direct replication may be more useful prior to conceptual replication. If a finding has been observed several times in one population, a conceptual replication may be more informative.

That said, conceptual replications may be more informative when they are conducted systematically, purposefully altering a single variable rather than changing many variables. If the population, independent variables, and outcome variables are all changed, making strong conclusions about the replicability of original studies becomes complicated if not impossible. In a related vein, haphazard conceptual replications may provide little value. For example, is there a theoretical rationale for assuming there may be a difference in how left-handed and right-handed students respond to a math tutoring intervention? If not, such a conceptual replication may not be worth pursuing.

Regardless of how educational psychologists feel about conceptual versus direct replication (or any other classification system for replication), they should explicitly state their intent; when they fail to state their intent to replicate a theoretical position or empirical finding, it becomes difficult for the field to move forward. Explicit intent can be easily systematized, with authors including the following language in their papers: “We are attempting to [directly/conceptually] replicate the methods used by [citation] in their study of [key concepts or interventions].” In addition, direct replications should state, “We kept all methods as similar to the original as possible, with the following exceptions.” Conceptual replications should state, “Our study methods differ from the study to be replicated in the following ways.” Such language would add substantial clarity with minimal length and should be included in both the abstract and body of research papers.

The contextual nature of education is not a barrier to replication

Children are unique and, therefore, have unique experiences, so concerns about generalizability (or lack thereof) are important given the variability of students and their contexts. A belief that no research finding can generalize may be the most extreme version of this concern. We have encountered such views, often sociocultural in nature (see Turner & Nolen, 2015), but do not find them compelling. Numerous educational findings have been replicated and generalized across contexts, including the results of sociocultural inquiry (e.g. Coalition for Psychology in Schools and Education, 2015). Anyone who lacked confidence in generalizability would likely have to believe that educational psychology should consist only of case studies or action research. Moreover, from this perspective, publishing these case studies and action research would have little value beyond descriptive biography because they would not inform practice in other contexts.

Methodology of replication

Both replication and meta-analysis are necessary

Meta-analysis and replication are complementary processes, but they have distinct purposes and therefore address research quality in different ways (Patall, 2021/this issue; Valentine, 2019; Williams et al., 2017). Meta-analyses synthesize previous research, whereas replications seek to verify whether previous research findings are reproducible and, therefore, accurate. Wide variance in construct definition, instrumentation, sampling, and data analysis, among other factors, can result in a diverse pool of studies within a meta-analysis, none of which may have been previously replicated. A carefully conducted meta-analysis of irreproducible studies is of no value (see Carter et al., 2019).

Moreover, meta-analyses and replications solve different problems. Meta-analyses help solve the problem of heterogeneous results (which may be driven by moderators such as using different samples or measures). However, replications help assess (and address) experimenter bias. Namely, if a researcher has a bias (e.g. wants to find a specific result, will be rewarded if certain results are obtained), meta-analyzing multiple studies from the same lab will amplify the bias. From this perspective, independent replication is necessary but insufficient for meaningful meta-analysis.
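To illustrate this point, consider a minimal simulation sketch (our own illustration with arbitrary parameters, not an analysis from the studies cited above). A lab that publishes only statistically significant positive results produces an inflated meta-analytic estimate, whereas a single independent replication that reports whatever it finds lands near the true effect.

```python
# A minimal simulation sketch (our illustration; parameters are arbitrary).
# A lab that publishes only significant positive results yields an inflated
# meta-analytic estimate; an independent replication does not.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
TRUE_D, N = 0.1, 30  # small true effect; small per-group sample size

def one_study(n):
    """Run a two-group study; return Cohen's d and the two-sided p value."""
    control = rng.normal(0.0, 1.0, n)
    treatment = rng.normal(TRUE_D, 1.0, n)
    _, p = stats.ttest_ind(treatment, control)
    pooled_sd = np.sqrt((control.var(ddof=1) + treatment.var(ddof=1)) / 2)
    return (treatment.mean() - control.mean()) / pooled_sd, p

# The biased lab runs studies until 10 reach p < .05 in the desired
# direction, and publishes only those 10.
published = []
while len(published) < 10:
    d, p = one_study(N)
    if p < 0.05 and d > 0:
        published.append(d)

# With equal sample sizes, simple averaging approximates fixed-effect pooling.
print(f"Meta-analysis of the lab's published studies: d = {np.mean(published):.2f}")

# An independent, preregistered replication reports whatever it finds.
d_rep, _ = one_study(500)
print(f"Independent large-N replication:              d = {d_rep:.2f}")
# Typical output: pooled d near 0.6, replication d near the true 0.1.
```

Pooling the lab’s published studies roughly quintuples the true effect in this sketch, which no amount of careful meta-analytic weighting can undo; only data collected without the selection filter, such as the independent replication, reveals the problem.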

Replication is relevant to all forms of research

Some may view replication as applying only to experimental research. From this perspective, replication may be viewed as “other people’s problem” by researchers who do not conduct experimental research. However, replicability and reliability of results are important across all empirical approaches to study design and data collection. Across a range of disciplines, plentiful examples are available of quantitative, non-experimental research being subject to replication attempts (e.g. Kanai, 2016; Piffer, 2019). For example, Ebersole et al. (2016), as part of a Many Labs collaboration, attempted to replicate both experimental and correlational effects in social psychology across undergraduate research pools to determine whether variations in participation in research across the course of a semester produced significantly different effects, finding little evidence that they did. Assessing whether the timing of an event influences its consequences is of great relevance to all of education and educational psychology. Ironically, we once had a descriptive study (on replication, no less!) replicated by a team that was studying the same issues at the same time for the same journal special issue (see Lemons et al., 2016; Makel et al., 2016).

Regarding qualitative methods, most readers will not be surprised that this is an area of considerable debate, with some strong, negative views of the importance of replication (and even the concept of replicability) to qualitative research (Pratt et al., 2020). However, perspectives are emerging among qualitative researchers that replication is important. Qualitative researchers appear to be focusing on the clear communication of methods to facilitate replicable qualitative work (Anczyk et al., 2019; Schindler et al., 2020; Steinhardt, 2020).

The application of replication to qualitative research is relatively recent, and therefore the unanswered questions in this area of scholarship are numerous and important. For example, does replication apply at all to studies using grounded theory or critical race theory? How can qualitative research approach independent replication when individual subjectivity or personal background plays a central role in many approaches to qualitative interpretation? How can qualitative replication help build or assess theory? Developing answers to these questions will help inform when and how replication can play a role in qualitative research.

In an examination of replication within qualitative research, Leppink (2017) concludes that “perhaps we should no longer think in terms of qualitative–quantitative divides but rather in terms of more-less replicable distinctions, and do all that is possible to document all choices and decisions made throughout a study to enable others to replicate our work” (p. 100). This recommendation is applicable to all forms of research within educational psychology and the learning sciences. If a research result (whether quantitative or qualitative) is so narrow and fragile that it can never be found again (even by the same research team), that result would be of little use to practitioners and policymakers. Conversely, for example, a series of case studies that all found evidence, hypothetically speaking, that teachers find differentiation difficult in heterogeneously-grouped classrooms should be viewed as successful conceptual replications—and would provide important information to stakeholders.

Research rigor involves more than replication

We often encounter colleagues who note that increased use of replication within educational psychology will not automatically lead to huge increases in the quality of the field’s research. Indeed, increased use of replication is a necessary but insufficient strategy for improving research quality (see also Nosek et al., 2021; Schmidt, 2017). In recent years, a range of strategies have been suggested and refined for improving how research is conducted and communicated, many falling under the banner of open science. These wide-ranging approaches include preregistration of hypotheses (i.e. publicly sharing a study’s methods before the study is conducted), open data, and meta-analysis, among many others (Chambers, 2017; Crüwell et al., 2019; Fleming et al., 2021/this issue; Gehlbach & Robinson, 2021/this issue; Makel & Plucker, 2017; Nosek et al., 2015; Reich, 2021/this issue; Spellman, 2015). To facilitate these practices, web services are available that allow for posting of preregistrations, data sharing, and posting of pre-prints (e.g. https://osf.io/, https://edarxiv.org/). Many of these approaches to improving the rigor of research have been embraced by educational psychologists.

For example, a recent issue of AERA Open was dedicated to publishing registered reports and included replications (Reich et al., 2020), with the vast majority being educational psychology research. The studies were primarily inferential and experimental in nature but included descriptive work (e.g. Peters et al., 2019). In the ensuing commentaries (see Reich et al., 2020), many of the authors noted the advantages of approaching research from an open science/heightened credibility perspective. For example, Merk and Rosman (2020) noted that preregistration requires more detailed methods sections in papers, and that the discussion section was “more honest and vivid as we could, for example, give sharper opinions” (p. 1). Another benefit is that the detailed method sections typically found in preregistered studies facilitate future replication attempts. Any progress toward heightened research credibility, regardless of whether one refers to it as “open science” or some other term, is a positive development for educational psychologists and the educators and families whom our work benefits.

Development and testing of theory may be an even more important component of improving research rigor, but that does not mean that replication is irrelevant (Wentzel, 2021/this issue). Some have proposed that replications are part of the process of strengthening the empirical work that informs the cycle of theory development and assessment (van Rooij & Baggio, 2021). Irvine (2021) argued that without sufficiently advanced theory, replications may have limited value but that they can serve as a tool to improve theory. van Rooij and Baggio (2021) share the vivid example of knowing apples fall from trees (a replicable effect) but needing the theory of gravity to explain the effect and provide true understanding. In addition, they argued that the development of theory is not only the ultimate priority but also the foundation of all future empirical efforts (cf. Eronen & Bringmann, 2021).

Professional implications of replication

Replications are good for one’s career

Regardless of whether researchers believe replications can benefit the field, they may wonder whether conducting replications will further their own careers.³ Sufficient incentives exist within education to allay this concern: Replications are cited, unsuccessful replications can be well received, and external funding for replications is growing.

As shown in Makel and Plucker’s (2014b) study of replication in the top 100 education journals as ranked by impact factor, the median number of citations for articles being replicated was 31 (range of 1 to 7,644), while the median citation count for the replicating papers was 5 (range of 0 to 135). Although the replicating papers were cited far less often, their median citation count was higher than the impact factor of every one of the 100 journals at that time.

Some researchers may be concerned that conducting unsuccessful replications will give them the reputation of being a critic within their field, which could cause problems when going up for promotion, being considered for fellowships and other honors, and in other aspects of a profession where peer assessment is valued and important. Although failed replication attempts tend to capture media attention and be noted on social media, replications in the social sciences tend to be confirmatory. Successful replication rates tend to be over 65% in education, psychology, economics, and special education, with some rates in excess of 80% (Camerer et al., 2016; Klein et al., 2014; Lemons et al., 2016; Makel & Plucker, 2014a; Makel et al., 2012, 2016; cf. Camerer et al., 2018). Although self-replications tend to be more successful than third-party replications (71% vs. 54%, respectively; Makel et al., 2012; Makel & Plucker, 2014b, Table 2), replications are still often successful even when conducted by third-party researchers. But we note that these estimates are built on replications that were generally not preregistered. Replicability estimates from preregistered replications are often lower (e.g. 35%; Open Science Collaboration, 2015), which may result from the many ways the success of a replication can be determined (see Gervais, 2021, for 11 such examples).

Replication research is also increasingly attractive to external funders. Howe and Perfors (2018) note that “grant agencies greatly value novelty, but they even more greatly value reliable science; a novel finding can have a long-term impact only if it is true” (p. 25). For this reason, it is not surprising that the National Science Foundation and the Institute of Education Sciences (IES) now have regular funding competitions for replication research. Replications have become such a prevalent part of funded projects that these agencies have jointly published companion guidelines on conducting replication research (National Science Foundation & Institute of Education Sciences, U.S. Department of Education, 2018). This type of action also suggests a cultural shift is occurring, making replications both a public good for the field and good for one’s career, too.

Finally, replication research may lower opportunity costs within the busy careers of educational psychologists. Consider some worst-case scenarios: A researcher spends years working on a concept that, over time, others cannot replicate. Or a group of graduate students devote their research time to pursuing topics that appear enticing but ultimately fail to replicate. Neither situation is good for one’s career, but contrast those scenarios with that of an early career researcher who conducts a series of replications of foundational studies on a particular construct, some successful and some not, that help improve theory on that construct and create more effective interventions. The latter scenario is of greater benefit to both the researcher and the field.

Replication is pro-innovation

Researchers work in communities that reward creativity and innovation. Some may be concerned that replications distract or detract from creative contributions. Under no circumstances is this accurate. This concern is often raised or implied by journal editors, who note that they are resistant to publishing replication papers because they want their journals to focus on creative additions to the research literature. The attraction to the shiny new object in research is well known (e.g. Fanelli, 2010; Hodges, 2015; Howe & Perfors, 2018; Makel, 2014), but this attitude confuses novelty with creativity. Although definitions of creativity almost always involve characteristics of novelty, originality, or uniqueness, they also require usefulness or utility (e.g. Plucker et al., 2004; Simonton, 2012).

If an idea cannot be replicated, arguing that the idea is useful, especially in an applied field such as educational psychology, is a difficult case to make (Makel & Plucker, 2014a). Furthermore, given that innovation can be conceptualized as creativity taken to scale, a finding that cannot be replicated—regardless of one’s chosen definition of replication—can never inform innovative practice. At best, it can only misinform practice and mislead educators and policymakers. An irreplicable research result within educational psychology is neither creative nor innovative; a replicable result is likely to be creative, and if taken to scale, innovative.

A related aspect involves the primacy effect as applied to research, in which the first study published on a topic is assumed to be the most valuable. Gelman (2017, 2018) has proposed a time-reversal heuristic—a thought experiment in which researchers consider how their evaluation of a theoretical effect would change if an unsupportive direct replication were published first and the original, exploratory, supportive study were published second. Most people would be skeptical of the second study’s results, when in fact both should be considered equally when evaluating the research on the effect. The time-reversal heuristic would have us consider usefulness before novelty, an admittedly creative approach to assessing creativity in research.

Replication need not be an adversarial act

Is conducting a replication an aggressive act? Not necessarily, given that replication attempts can be perceived as a form of flattery, in that a researcher’s colleagues are paying attention to their work. In a field with little to no replication, it is human nature to find any attempt to replicate one’s work suspicious if not adversarial. Until the research culture changes to embrace replication and other open science strategies, replication will always have at least a tinge of an adversarial feel. But researchers can influence the degree to which replication is viewed as constructive versus aggressive.

For example, compare the reactions of two distinct sets of authors whose work has recently been the subject of unsuccessful replication attempts. In the first, a study of the impact of human-like avatars on decision-making in a technology context was replicated with mixed and generally negative results (Simmons & Nelson, 2020). The authors of the original study responded politely, but their response included several paragraphs exploring why the replication was likely flawed.

However, in the other example, Stafford (2018) authored a study that did not find evidence of stereotype threat among chess players. A replication of the study found considerable evidence of stereotype threat (Smerdon et al., 2020). The author of the original study acknowledged the failed replication, noted ways in which the replication improved on his original study, provided even-handed analysis of the possible causes of the divergent results, and even defended the replication authors against subsequent criticisms of the replication (Stafford, 2020). This commitment to building an accurate research base rather than reflexively defending one’s personal investment in one’s research is commendable and serves as a model for educational psychologists.

Many researchers have proposed ways to make replication less adversarial, from systematic approaches involving changing research culture to specific replication strategies. Gernsbacher (2018) has pointed to reciprocal replications—in which teams of researchers attempt to replicate each other’s work—as one path forward. Tierney et al. (2020) proposed a creative destruction approach that conceptualizes replications as the act of replacing original results with revised results that are more powerful or more precise. Regardless of the approach, having more frequent replications in educational psychology will help make them less of an aberration, more difficult to interpret as a personal attack, more of a key aspect of the educational psychology enterprise, and more successful in improving interventions and the theories on which they are based.

Educational psychology is starting to value and support replication

As the preceding sections suggest, the conventional wisdom on the value of replication within educational psychology is changing, albeit slowly and unevenly. There are reasons to be optimistic about changing attitudes within the field toward replication and open science practices (see Fleming et al., 2021/this issue; Mellor, 2021/this issue). For example, editors from some specialty journals in and related to the field have published editorials endorsing and implementing open science practices (e.g. Adelson & Matthews, 2019; Hodges, 2015; Spector et al., 2015), and journals that feature work from the field, such as AERA Open, The British Journal of Educational Psychology, Exceptional Children, and Journal of Educational Psychology, accept registered reports. But to our knowledge, few other educational psychology journals have taken similar steps, and acceptance of replications is less clear. When all of the field’s journal editors state unequivocal support for replication and open science approaches to research, many of the professional concerns surrounding replication will dissipate.

Increasing calls for innovation in how we teach research methods at the graduate level will also hopefully result in lasting culture change (Gernsbacher, 2018; Spector et al., 2015). For example, Kochari and Ostarek (2018) have called for making direct replications a required research activity for doctoral students, and Yeo-Teh and Tang (2021) have suggested a research ethics and integrity course for doctoral students that emphasizes the responsible conduct of research, including issues such as preregistration and other open science practices. Using graduate education as a major intervention point allows educational psychologists to prepare the next generation of innovators as opposed to slowly adopting and adapting new research practices after everyone has already entered the field.

Utility of replication

It is possible to determine if a replication is “successful”

We once led a replication workshop at a federal agency where the participants questioned whether it was possible to determine if a replication attempt was successful. However, in the absence of replication, researchers routinely make judgments about the accuracy of collective bodies of research, dealing with variability in study quality and methodology but with little information about whether a given finding is replicable. If we can determine the success or failure of original studies, we can determine the success of replications.

Simple heuristics such as comparing whether p values are similar have their limitations, including ignoring whether the magnitude of the effect is the same (Valentine, 2019). But this limitation is not unique to replications; it holds true for original research as well. Several approaches to assessing replication success have been proposed, each with its strengths and limitations (Schauer & Hedges, 2021), including interpreting confidence intervals (Jacob et al., 2019; Zwaan et al., 2018), a combination of sample and effect sizes (Simonsohn, 2015), and a replication Bayes factor (Zwaan et al., 2018). Others have noted that more than one replication may often be needed for unambiguous interpretation of effects (Hedges & Schauer, 2019) or that reproducible results may not lead to a “convergence to scientific truth” (Devezer et al., 2019, p. 17) because of research and statistical weaknesses as well as the complex nature of the world.
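To make concrete why such criteria can disagree, the following sketch (a toy example with hypothetical summary statistics, not an implementation of any of the cited methods) applies two simple heuristics: agreement in statistical significance and whether the replication estimate falls within the original study’s 95% confidence interval.

```python
# A toy comparison of two replication-success heuristics using hypothetical
# summary statistics (not an implementation of the cited methods).
import math

def ci95(effect, se):
    """Normal-approximation 95% confidence interval."""
    return effect - 1.96 * se, effect + 1.96 * se

def two_sided_p(effect, se):
    """Two-sided p value for effect/se under a normal approximation."""
    z = abs(effect / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

orig_d, orig_se = 0.45, 0.15  # hypothetical original study
rep_d, rep_se = 0.10, 0.04    # hypothetical larger, more precise replication

# Heuristic 1: do both studies reach conventional significance?
# (Limited: it ignores whether the magnitudes agree; see Valentine, 2019.)
same_verdict = (two_sided_p(orig_d, orig_se) < .05) == (two_sided_p(rep_d, rep_se) < .05)

# Heuristic 2: does the replication estimate fall inside the original 95% CI?
lo, hi = ci95(orig_d, orig_se)
inside_ci = lo <= rep_d <= hi

print(f"Same significance verdict: {same_verdict}")                      # True
print(f"Replication d in original CI ({lo:.2f}, {hi:.2f}): {inside_ci}")  # False
```

In this example the two heuristics disagree: both studies are statistically significant, yet the replication estimate falls well outside the original confidence interval, illustrating why no single statistic should settle the question on its own.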

We have no issue with attempts to create objective criteria for evaluating replication results but offer a broad approach that is more direct: Do the results give the reader more or less confidence in the validity of the original findings? More objective and statistical approaches, such as those mentioned above, may inform the answer to this question, but in the end stakeholders will make subjective judgments (e.g. importance of topic, theoretical underpinnings of an intervention, quality of study design, magnitude of effects) about the results of any study, and replications are no different. Just as a specific p value should not automatically equate to action, nor should any other generic statistic. A more interpretive approach allows educators to consider local context when evaluating, for example, whether efficacy research on a specific intervention has been replicated. For example, the replications may be sufficiently supportive of efficacy to allow the intervention to be used in a small-scale, experimental program but not sufficiently supportive to expand the intervention to every school in a large district. Meehl (1978) similarly emphasized the importance of using relevant theory and other background knowledge as part of any assessment process rather than relying solely on formal statistical comparison. As with original findings, we believe contextual issues such as theory, measurement validity, and relevant previous findings all need to play a vital role in interpreting replication success.

Interestingly, researchers have spent considerable time debating the flip side of this coin: the definition of a “failed” replication. Again, the issue of context is especially relevant to this discussion, as one can usually explain away a failed or mixed replication attempt as being due to subtle differences in context or hidden moderators. That may be the case, but the authors of the original study then bear responsibility for not adequately describing key contextual factors in the success of the original intervention. Indeed, an intervention whose success is so highly dependent on subtle variations in context is unlikely to be of great use to practitioners who attempt to use the intervention. Gelman (2018) argues similarly, noting that “various concerns about the difficulty of replication should, in fact, be interpreted as arguments in favor of replication … if effects can vary by context, this provides more reason why replication is necessary for scientific progress” (p. 19, emphasis in original). In other words, context happens, and finding interventions that are useful across contextual differences (or, more to the point, are useful only within certain, definable contexts) is a goal of educational psychology.

In a related vein, some may wonder whether replications “fail” because the replicators simply are not as skilled as the researchers involved with the original study. But there is no evidence that replications fail because replicators lack experience or expertise (Nosek, 2020; Protzko & Schooler, 2020). Moreover, in an applied field like educational psychology, it is important to know whether special skill is required to elicit a particular effect. What if not all teachers or schools have this elusive trait? Should they expect effects or not? Those are the questions practitioners want and need answered. If researchers could develop even marginally informative practices, those practices would be immediately useful to educators who are experiencing initiative fatigue.

There is no magic number for “enough” replications

Assessing the need for replication is not about reaching a particular replication rate, although we believe 1% is too low. Rather than focusing on replication rates, other criteria must be considered. If one were to ask how many people should be on anti-cholesterol medication, the answer would not be a percentage of the population. Instead, the answer would be based on the proportion of individuals who met specified criteria (e.g. age, cholesterol levels, family history of cardiovascular disease) associated with problems and benefits relevant to anti-cholesterol medication. The percentage of individuals who meet those criteria may vary over time and place, but it is those contextual factors that determine appropriateness and value. Similarly, Irvine (2021) argued, “as the current state of knowledge informs what counts as a good replication, what counts as a good replication can change” (p. 8). For these reasons, we find a percentage target for educational psychology to be a less-than-ideal lens through which to assess the appropriate quantity of replications.

Rather, we favor the use of the following criteria for determining whether studies should be replicated. From an empirical perspective, highly cited studies should be prioritized for replication. If a study on a new construct or intervention is cited several dozen or several hundred times in its first year after publication, those citations can be interpreted as the field voting with its feet, so to speak, about the importance of the study. Regarding qualitative evidence, a set of research findings that are about to be scaled up for broad implementation or are being included in textbooks and course materials should be targeted for replication.

Viewed conceptually, a replication adds value if it helps build theory (Guest & Martin, 2021), narrow or assess a given theory (Irvine, 2021), or assess whether an explanation has particular boundaries on its effects (van Rooij & Baggio, 2021). In particular, high rates of replication may be needed for studies based on weak theory, for findings that vary considerably across contexts, or for theories that predict many complex factors affecting outcomes. In the end, the question about replication prevalence boils down to whether a field has replicated its most important studies or needs replications as part of the cycle of theory development, rather than whether researchers have replicated a specific percentage of the field’s empirical output.

Increased use of replication is feasible

Regarding more frequent use of replication within educational psychology, we need to “Make it so” (Picard, 2366). An ideal educational system is informed by research evidence that practitioners and policymakers are confident will lead to desired outcomes. We struggle to see any realistic path toward this goal that does not include greater use of replication in educational psychology research. The idea that science is self-correcting is well-established, but such correction does not occur magically. Science is self-correcting when scientists are self-correcting. Replication attempts help inform the practices in which we should have confidence and the practices that need correcting. Frequent replication attempts will help accomplish this more quickly than traditional, largely replication-free approaches to research.

To make replication more common within the field, we see two challenges of adaptation: people and systems. People challenges (like those mentioned above) often stem from individuals needing to admit there is a problem. Educational psychology is not alone in needing to change behaviors to live up to its norms (see Vazire, 2018). Psychology, even fMRI research, suffers from reliability and replicability problems (Elliott et al., 2020). Another people challenge is researchers giving undue trust and credit to original studies simply because they were published first (again, Gelman’s time-reversal heuristic).

System challenges that prevent greater replication prevalence include things as fundamental as the incentive structure within higher education and the broader domain of academic research (Mellor, 2021/this issue). What gets published and funded, as well as what helps get people hired, promoted, and honored, needs rethinking. Or perhaps more accurately, perceptions of what gets rewarded matter, and those perceptions are still largely misaligned with replication and other strategies for improving educational psychology research. Additionally, undergraduate and graduate methods courses—as well as courses for pre-service teachers and administrators—need to include replication and its importance. Another system-level step involves educational psychology journals and professional organizations adopting the Transparency and Openness Promotion (TOP) Guidelines (Nosek et al., 2015), which include standards on replication. And venerable groups within the field, such as Division 15 of the American Psychological Association, should offer awards for high-quality replication attempts (Gorgolewski et al., 2018).

The current discussion and debate about improved research methodology in educational psychology and other fields should in no way give the impression that adoption of open science methods is the sole solution for improving the field’s impact. The best studies, even with huge sample sizes, impeccable design and measurement, and cutting-edge analysis techniques, will not be useful if based on weak theoretical foundations. Theory matters (Gehlbach & Robinson, 2021/this issue; Smith et al., 2017; Wentzel, 2021/this issue), and carefully constructed theories that build on the theories and research of the past serve as the foundation for all we do as applied psychologists (Vartanian, 2017). The growing discussion of replication’s role in theory development and assessment (e.g. Guest & Martin, 2021; Irvine, 2021; van Rooij & Baggio, 2021) will likely have a major impact on replication use. For example, decisions about when replications are needed and what types of replication should be used may be informed by what they contribute to the development of a particular theory.

Making gains on both people and system challenges to increase the use of replication will likely occur in stages. Developing momentum behind changing cultural norms will require coalition-building and the sustained effort of many within their research communities and departments. To achieve this culture change, several universities in the United Kingdom have jointly agreed to appoint research quality officers (Munafò, 2019), sending a clear message across institutions and stakeholder groups about the need to act collectively to increase research quality and impact. We see no reason why educational psychologists, and education researchers in general, cannot act so boldly.

Conclusion

Replication is a necessary cornerstone for effective scientific endeavors, yet explicit replications are too rare within educational psychology. In concert with other open science strategies, increased use of strategically planned and timed replications would improve the overall quality and value of educational psychology research. Of course, the field needs to explore several aspects of replication to help maximize its potential benefits, such as how replication efforts can best inform theory development, the extent to which replication can be applied to various forms of qualitative research, and how replication can be most effectively incentivized, among other topics. But as Leppink (2017) noted in an examination of replication across all types of research methodologies, the goal of replication is to “allow us to work together toward stronger conclusions and implications for future research and practice” (p. 100). This point is well-taken, and we see it as a fitting goal for educational psychologists, too. By working together to improve the quality of our research, educational psychologists will create positive outcomes for both research and practice in education and learning. Replication—in conjunction with strong theoretical foundations and the widespread use of other open science practices—will help us achieve this goal. These collaborations will help us understand existing work better, conduct future work more efficiently and effectively, and provide greater value to practitioners and policymakers.

Acknowledgment

The authors acknowledge and appreciate the constructive criticisms and suggestions for improvement provided by the guest editors and reviewers.

Notes

1 After we had written most of this article, we noted the unintended similarity in structure to Zwaan et al. (2018), who used potential concerns about replication to organize their comments. Although our framing is unique, we acknowledge the overlap in general structure.

2 See, in particular, Table 1 and Figure 1 in Hüffmeier et al. (2016).

3 In economics, this is called the tragedy of the commons: Replication may be a public good (benefitting all), but individuals may act as free riders, benefitting when others perform replications but not acting to conduct replications themselves.

References

  • Adelson, J. L., & Matthews, M. S. (2019). Gifted Child Quarterly’s commitment to transparency, openness, and research improvement. Gifted Child Quarterly, 63(2), 83–85. https://doi.org/10.1177/0016986218824675
  • Anczyk, A., Grzymała-Moszczyńska, H., Krzysztof-Świderska, A., & Prusak, J. (2019). The replication crisis and qualitative research in the psychology of religion. The International Journal for the Psychology of Religion, 29(4), 278–291. https://doi.org/10.1080/10508619.2019.1687197
  • Azoulay, P., Fons-Rosen, C., & Graff Zivin, J. S. (2019). Does science advance one funeral at a time? The American Economic Review, 109(8), 2889–2920. https://doi.org/10.1257/aer.20161574
  • Camerer, C. F., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Almenberg, J., Altmejd, A., Chan, T., Heikensten, E., Holzmeister, F., Imai, T., Isaksson, S., Nave, G., Pfeiffer, T., Razen, M., & Wu, H. (2016). Evaluating replicability of laboratory experiments in economics. Science, 351(6280), 1433–1436. https://doi.org/10.1126/science.aaf0918
  • Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Nave, G., Nosek, B. A., Pfeiffer, T., Altmejd, A., Buttrick, N., Chan, T., Chen, Y., Forsell, E., Gampa, A., Heikensten, E., Hummer, L., Imai, T., … Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2(9), 637–644. https://doi.org/10.1038/s41562-018-0399-z
  • Carter, E. C., Schönbrodt, F. D., Gervais, W. M., & Hilgard, J. (2019). Correcting for bias in psychology: A comparison of meta-analytic methods. Advances in Methods and Practices in Psychological Science, 2(2), 115–144. https://doi.org/10.1177/2515245919847196
  • Chambers, C. (2017). The seven deadly sins of psychology: A manifesto for reforming the culture of scientific practice. Princeton University Press.
  • Chhin, C. S., Taylor, K. A., & Wei, W. S. (2018). Supporting a culture of replication: An examination of education and special education research grants funded by the Institute of Education Sciences. Educational Researcher, 47(9), 594–605. https://doi.org/10.3102/0013189X18788047
  • Coalition for Psychology in Schools and Education. (2015). Top 20 principles from psychology for preK-12 teaching and learning. American Psychological Association.
  • Collins, H. M. (1985). Changing order: Replication and induction in scientific practice. University of Chicago Press.
  • Crüwell, S., van Doorn, J., Etz, A., Makel, M. C., Moshontz, H., Niebaum, J., Orben, A., Parsons, S., & Schulte-Mecklenbeck, M. (2019). 7 easy steps to open science: An annotated reading list. Zeitschrift Für Psychologie, 227(4), 237–248. https://doi.org/10.1027/2151-2604/a000387
  • Devezer, B., Nardin, L. G., Baumgaertner, B., & Buzbas, E. O. (2019). Scientific discovery in a model-centric framework: Reproducibility, innovation, and epistemic diversity. PLoS ONE, 14(5), Article e0216125. https://doi.org/10.1371/journal.pone.0216125
  • Ebersole, C. R., Atherton, O. E., Belanger, A. L., Skulborstad, H. M., Allen, J. M., Banks, J. B., Baranski, E., Bernstein, M. J., Bonfiglio, D. B. V., Boucher, L., Brown, E. R., Budiman, N. I., Cairo, A. H., Capaldi, C. A., Chartier, C. R., Chung, J. M., Cicero, D. C., Coleman, J. A., Conway, J. G., … Nosek, B. A. (2016). Many Labs 3: Evaluating participant pool quality across the academic semester via replication. Journal of Experimental Social Psychology, 67, 68–82. https://doi.org/10.1016/j.jesp.2015.10.012
  • Elliott, M. L., Knodt, A. R., Ireland, D., Morris, M. L., Poulton, R., Ramrakha, S., Sison, M. L., Moffitt, T. E., Caspi, A., & Hariri, A. R. (2020). What is the test-retest reliability of common task-fMRI measures? New empirical evidence and a meta-analysis. Psychological Science, 31(7), 792–806. https://doi.org/10.1177/0956797620916786
  • Eronen, M. I., & Bringmann, L. F. (2021). The theory crisis in psychology: How to move forward. Perspectives on Psychological Science. Advance online publication. https://doi.org/10.1177/1745691620970586
  • Fanelli, D. (2010). Do pressures to publish increase scientists' bias? An empirical support from US States Data. PLoS ONE, 5(4), Article e10271. https://doi.org/10.1371/journal.pone.0010271
  • Fleming, J. I., Wilson, S. E., Hart, S. A., Therrien, W. J., & Cook, B. G. (2021/this issue). Open accessibility in education research: Enhancing the credibility, equity, impact, and efficiency of research. Educational Psychologist, 56(2), 110–121. https://doi.org/10.1080/00461520.2021.1897593
  • Gehlbach, H., & Robinson, C. D. (2021/this issue). From old school to open science: The implications of new research norms for educational psychology and beyond. Educational Psychologist, 56(2), 79–89. https://doi.org/10.1080/00461520.2021.1898961
  • Gelman, A. (2017). Beyond “power pose”: Using replication failures and a better understanding of data collection and analysis to do better science. https://statmodeling.stat.columbia.edu/2017/10/18/beyond-power-pose-using-replication-failures-better-understanding-data-collection-analysis-better-science/
  • Gelman, A. (2018). Don't characterize replications as successes or failures. Behavioral and Brain Sciences, 41, Article e128. https://doi.org/10.1017/S0140525X18000638
  • Gernsbacher, M. A. (2018). Three ways to make replication mainstream. Behavioral and Brain Sciences, 41, 20. https://doi.org/10.1017/S0140525X1800064X
  • Gervais, W. M. (2021). Practical methodological reform needs good theory. Perspectives on Psychological Science. Advance online publication. https://doi.org/10.1177/1745691620977471
  • Gorgolewski, K. J., Nichols, T., Kennedy, D. N., Poline, J. B., & Poldrack, R. A. (2018). Making replication prestigious. Behavioral and Brain Sciences, 41, Article e131. https://doi.org/10.1017/S0140525X18000663
  • Guest, O., & Martin, A. (2021). How computational modeling can force theory building in psychological science. Perspectives on Psychological Science. Advance online publication. https://doi.org/10.1177/1745691620970585
  • Hedges, L. V., & Schauer, J. M. (2019). More than one replication study is needed for unambiguous tests of replication. Journal of Educational and Behavioral Statistics, 44(5), 543–570. https://doi.org/10.3102/1076998619852953
  • Hodges, C. (2015). Replication studies in educational technology. TechTrends, 59(4), 3–4. https://doi.org/10.1007/s11528-015-0862-2
  • Howe, P. D., & Perfors, A. (2018). An argument for how (and why) to incentivise replication. Behavioral and Brain Sciences, 41, 25–26. https://doi.org/10.1017/S0140525X18000705
  • Hüffmeier, J., Mazei, J., & Schultze, T. (2016). Reconceptualizing replication as a sequence of different studies: A replication typology. Journal of Experimental Social Psychology, 66, 81–92. https://doi.org/10.1016/j.jesp.2015.09.009
  • Irvine, E. (2021). The role of replication studies in theory building. Perspectives on Psychological Science. Advance online publication. https://doi.org/10.1177/1745691620970558
  • Jacob, R. T., Doolittle, F., Kemple, J., & Somers, M.-A. (2019). A framework for learning from null results. Educational Researcher, 48(9), 580–589. https://doi.org/10.3102/0013189X19891955
  • Kanai, R. (2016). Open questions in conducting confirmatory replication studies: Commentary on Boekel et al., 2015. Cortex, 74, 343–347. https://doi.org/10.1016/j.cortex.2015.02.020
  • Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Bahník, Š., Bernstein, M. J., Bocian, K., Brandt, M. J., Brooks, B., Brumbaugh, C. C., Cemalcilar, Z., Chandler, J., Cheong, W., Davis, W. E., Devos, T., Eisner, M., Frankowska, N., Furrow, D., Galliani, E. M., … Nosek, B. A. (2014). Investigating variation in replicability: A “many labs” replication project. Social Psychology, 45(3), 142–152. https://doi.org/10.1027/1864-9335/a000178
  • Kochari, A. R., & Ostarek, M. (2018). Introducing a replication-first rule for PhD projects (commentary on Zwaan et al., ‘Making replication mainstream’). Behavioral and Brain Sciences, 41, 28. https://doi.org/10.1017/S0140525X18000730
  • Kuhn, T. S. (2012). The structure of scientific revolutions (4th ed.). University of Chicago Press.
  • Lemons, C. J., King, S. A., Davidson, K. A., Berryessa, T. L., Gajjar, S. A., & Sacks, L. H. (2016). An inadvertent concurrent replication: Same roadmap, different journey. Remedial and Special Education, 37(4), 213–222. https://doi.org/10.1177/0741932516631116
  • Leppink, J. (2017). Revisiting the quantitative-qualitative-mixed methods labels: Research questions, developments, and the need for replication. Journal of Taibah University Medical Sciences, 12(2), 97–101. https://doi.org/10.1016/j.jtumed.2016.11.008
  • Machery, E. (2019, October 10). What is a replication? PsyArXiv. https://doi.org/10.31234/osf.io/8x7yn
  • Makel, M. C. (2014). The empirical march: Making science better at self-correction. Psychology of Aesthetics, Creativity, and the Arts, 8(1), 2–7. https://doi.org/10.1037/a0035803
  • Makel, M. C., & Plucker, J. A. (2014a). Creativity is more than novelty: Reconsidering replication as a creative act. Psychology of Aesthetics, Creativity, and the Arts, 8(1), 27–29. https://doi.org/10.1037/a0035811
  • Makel, M. C., & Plucker, J. A. (2014b). Facts are more important than novelty: Replication in the education sciences. Educational Researcher, 43(6), 304–316. https://doi.org/10.3102/0013189X14545513
  • Makel, M. C., & Plucker, J. A. (2015). An introduction to replication research in gifted education: Shiny and new is not the same as useful. Gifted Child Quarterly, 59(3), 157–164. https://doi.org/10.1177/0016986215578747
  • Makel, M. C., & Plucker, J. A. (Eds.). (2017). Toward a more perfect psychology: Improving trust, accuracy, and transparency in research. American Psychological Association.
  • Makel, M. C., Plucker, J. A., Freeman, J., Lombardi, A., Simonsen, B., & Coyne, M. (2016). Replication of special education research: Necessary but far too rare. Remedial and Special Education, 37(4), 205–212. https://doi.org/10.1177/0741932516646083
  • Makel, M. C., Plucker, J. A., & Hegarty, C. B. (2012). Replications in psychology research: How often do they really occur? Perspectives on Psychological Science, 7(6), 537–542. https://doi.org/10.1177/1745691612460688
  • McNeeley, S., & Warner, J. J. (2015). Replication in criminology: Necessary practice. European Journal of Criminology, 12(5), 581–597. https://doi.org/10.1177/1477370815578197
  • Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46(4), 806–834. https://doi.org/10.1037/0022-006X.46.4.806
  • Mellor, D. (2021/this issue). Improving norms in research culture to incentivize transparency and rigor. Educational Psychologist, 56(2), 122–131. https://doi.org/10.1080/00461520.2021.1902329
  • Merk, S., & Rosman, T. (2019). Smart but evil? Student-teachers’ perception of educational researchers’ epistemic trustworthiness. AERA Open, 5(3). https://doi.org/10.1177/2332858419868158
  • Merk, S., & Rosman, T. (2020). Reflections on the registered report process for “Smart but evil? Student-teachers’ perception of educational researchers’ epistemic trustworthiness.” AERA Open, 6(2). https://doi.org/10.1177/2332858420918158
  • Munafò, M. (2019). Raising research quality will require collective action. Nature, 576(7786), 183. https://doi.org/10.1038/d41586-019-03750-7
  • National Science Foundation & Institute of Education Sciences, U.S. Department of Education. (2018). Companion guidelines on replication & reproducibility in education research: A supplement to the common guidelines for education research and development. https://ies.ed.gov/pdf/CompanionGuidelinesReplicationReproducibility.pdf
  • Nosek, B. A. [@BrianNosek]. (2020, November 13). Summary press release [Twitter thread]. https://twitter.com/BrianNosek/status/1327296776865525761
  • Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., Buck, S., Chambers, C. D., Chin, G., Christensen, G., Contestabile, M., Dafoe, A., Eich, E., Freese, J., Glennerster, R., Goroff, D., Green, D. P., Hesse, B., Humphreys, M., … Yarkoni, T. (2015). Promoting an open research culture. Science, 348(6242), 1422–1425. https://doi.org/10.1126/science.aab2374
  • Nosek, B. A., & Errington, T. M. (2019, September 10). What is replication? https://doi.org/10.1371/journal.pbio.3000691
  • Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Dreber, A., Fidler, F., Hilgard, J., Kline, M., Nuijten, M. B., Rohrer, J., Romero, F., Scheel, A., Scherer, L., Schönbrodt, F., & Vazire, S. (2021). Replicability, robustness, and reproducibility in psychological science [preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/ksfvq
  • Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), Article aac4716. https://doi.org/10.1126/science.aac4716
  • Pashler, H., & Harris, C. R. (2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7(6), 531–536. https://doi.org/10.1177/1745691612463401
  • Patall, E. A. (2021/this issue). Implications of the open science era for educational psychology research syntheses. Educational Psychologist, 56(2), 142–160. https://doi.org/10.1080/00461520.2021.1897009
  • Peters, S. J., Rambo-Hernandez, K., Makel, M. C., Matthews, M., & Plucker, J. A. (2019). The effect of local norms on racial and ethnic representation in gifted education. AERA Open, 5(2). https://doi.org/10.1177/2332858419848446
  • Piffer, D. (2019). Evidence for recent polygenic selection on educational attainment and intelligence inferred from GWAS hits: A replication of previous findings using recent data. Psych, 1(1), 55–75. https://doi.org/10.3390/psych1010005
  • Planck, M. K. (1949). Scientific autobiography and other papers. Philosophical Library.
  • Plucker, J. A., Beghetto, R. A., & Dow, G. T. (2004). Why isn't creativity more important to educational psychologists? Potentials, pitfalls, and future directions in creativity research. Educational Psychologist, 39(2), 83–96. https://doi.org/10.1207/s15326985ep3902_1
  • Pratt, M. G., Kaplan, S., & Whittington, R. (2020). Editorial essay: The tumult over transparency: Decoupling transparency from replication in establishing trustworthy qualitative research. Administrative Science Quarterly, 65(1), 1–19. https://doi.org/10.1177/0001839219887663
  • Pridemore, W. A., Makel, M. C., & Plucker, J. A. (2018). Replication in criminology and the social sciences. Annual Review of Criminology, 1(1), 19–38. https://doi.org/10.1146/annurev-criminol-032317-091849
  • Protzko, J., & Schooler, J. W. (2020). No relationship between researcher impact and replication effect: An analysis of five studies with 100 replications. PeerJ, 8, Article e8014. https://doi.org/10.7717/peerj.8014
  • Reich, J. (2021/this issue). Preregistration and registered reports. Educational Psychologist, 56(2), 101–109. https://doi.org/10.1080/00461520.2021.1900851
  • Reich, J., Gehlbach, H., & Albers, C. (2020, May). AERA Open special topic on preregistered reports. AERA Open. https://journals.sagepub.com/page/ero/collections/registered-reports
  • Rosenthal, R. (1969). On not so replicated experiments and not so null results. Journal of Consulting and Clinical Psychology, 33(1), 7–10. https://doi.org/10.1037/h0027231
  • Schauer, J. M., & Hedges, L. V. (2021). Reconsidering statistical methods for assessing replication. Psychological Methods, 26(1), 127–139. https://doi.org/10.1037/met0000302
  • Schindler, C., Veja, C., Hocker, J., Kminek, H., & Meier, M. (2020). Collaborative open analysis in a qualitative research environment. Education for Information, 36(3), 215–247. https://doi.org/10.3233/EFI-190261
  • Schlosberg, H. (1951). Repeating fundamental experiments. American Psychologist, 6(5), 177. https://doi.org/10.1037/h0056148
  • Schmidt, S. (2009). Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, 13(2), 90–100. https://doi.org/10.1037/a0015108
  • Schmidt, S. (2017). Replication. In M. C. Makel & J. A. Plucker (Eds.), Toward a more perfect psychology: Improving trust, accuracy, and transparency in research (pp. 215–232). American Psychological Association. https://doi.org/10.1037/0000033-015
  • Simmons, J., & Nelson, L. (2020, May 20). Do human-like products inspire more holistic judgments? Data Colada. http://datacolada.org/87
  • Simons, D. J. (2014). The value of direct replication. Perspectives on Psychological Science, 9(1), 76–80. https://doi.org/10.1177/1745691613514755
  • Simons, D. J., Shoda, Y., & Lindsay, D. S. (2017). Constraints on Generality (COG): A proposed addition to all empirical papers. Perspectives on Psychological Science, 12(6), 1123–1128. https://doi.org/10.1177/1745691617708630
  • Simonsohn, U. (2015). Small telescopes: Detectability and the evaluation of replication results. Psychological Science, 26(5), 559–569. https://doi.org/10.1177/0956797614567341
  • Simonton, D. K. (2012). Taking the U.S. Patent Office criteria seriously: A quantitative three-criterion creativity definition and its implications. Creativity Research Journal, 24(2–3), 97–106. https://doi.org/10.1080/10400419.2012.676974
  • Smerdon, D., Hu, H., McLennan, A., von Hippel, W., & Albrecht, S. (2020). Female chess players show typical stereotype-threat effects: Commentary on Stafford (2018). Psychological Science, 31(6), 756–759. https://doi.org/10.1177/0956797620924051
  • Smith, J. K., Smith, L. F., & Smith, B. K. (2017). The reproducibility crisis in psychology: Attack of the clones or phantom menace? In M. C. Makel & J. A. Plucker (Eds.), Toward a more perfect psychology (pp. 273–287). American Psychological Association.
  • Spector, J. M., Johnson, T. E., & Young, P. A. (2015). An editorial on replication studies and scaling up efforts. Educational Technology Research and Development, 63(1), 1–4. https://doi.org/10.1007/s11423-014-9364-3
  • Spellman, B. A. (2015). A short (personal) future history of Revolution 2.0. Perspectives on Psychological Science, 10(6), 886–899. https://doi.org/10.1177/1745691615609918
  • Stafford, T. (2018). Female chess players outperform expectations when playing men. Psychological Science, 29(3), 429–436. https://doi.org/10.1177/0956797617736887
  • Stafford, T. [@TomStafford]. (2020, May 20). Is stereotype threat in chess real after all? [Twitter thread]. https://twitter.com/tomstafford/status/1263013074556071936
  • Steinhardt, I. (2020). Learning open science by doing open science. A reflection of a qualitative research project-based seminar. Education for Information [preprint]. https://content.iospress.com/download/education-for-information/efi190308?id=education-for-information%2Fefi190308
  • Stroebe, W., & Strack, F. (2014). The alleged crisis and the illusion of exact replication. Perspectives on Psychological Science, 9(1), 59–71. https://doi.org/10.1177/1745691613514450
  • Tierney, W., Hardy, J. H., Ebersole, C. R., Leavitt, K., Viganola, D., Clemente, E. G., Gordon, M., Dreber, A., Johannesson, M., Pfeiffer, T., Hiring Decisions Forecasting Collaboration, & Uhlmann, E. L. (2020). Creative destruction in science. Organizational Behavior and Human Decision Processes, 161, 291–309. https://doi.org/10.1016/j.obhdp.2020.07.002
  • Turner, J. C., & Nolen, S. B. (2015). Introduction: The relevance of the situative perspective in educational psychology. Educational Psychologist, 50(3), 167–172. https://doi.org/10.1080/00461520.2015.1075404
  • Valentine, J. (2019). Expecting and learning from null results. Educational Researcher, 48(9), 611–613. https://doi.org/10.3102/0013189X19891440
  • van Rooij, I., & Baggio, G. (2021). Theory before the test: How to build high-verisimilitude explanatory theories in psychological science. Perspectives on Psychological Science. Advance online publication. https://doi.org/10.1177/1745691620970604
  • Vartanian, O. (2017). The contributions of theory choice, cumulative science, and problem finding to scientific innovation and research quality. In M. C. Makel & J. A. Plucker (Eds.), Toward a more perfect psychology: Improving trust, accuracy, and transparency in research (pp. 13–31). American Psychological Association. https://doi.org/10.1037/0000033-002
  • Vazire, S. (2018). Implications of the credibility revolution for productivity, creativity, and progress. Perspectives on Psychological Science, 13(4), 411–417. https://doi.org/10.1177/1745691617751884
  • Wentzel, K. R. (2021/this issue). Open science reforms: Strengths, challenges, and future directions. Educational Psychologist, 56(2), 161–173. https://doi.org/10.1080/00461520.2021.1901709
  • Williams, R. T., Polanin, J. R., & Pigott, T. D. (2017). Meta-analysis and reproducibility. In M. C. Makel & J. A. Plucker (Eds.), Toward a more perfect psychology (pp. 255–270). American Psychological Association.
  • Yeo-Teh, N. S. L., & Tang, B. L. (2021). Research ethics courses as a vaccination against a toxic research environment or culture. Research Ethics, 17(1), 55–65. https://doi.org/10.1177/1747016120926686
  • Zwaan, R. A., Etz, A., Lucas, R. E., & Donnellan, M. B. (2018). Making replication mainstream. Behavioral and Brain Sciences, 41, e120. https://doi.org/10.1017/S0140525X17001972