Review

A reanalysis of the Institute for Research and Evaluation report that challenges non-US, school-based comprehensive sexuality education evidence base


Abstract

Comprehensive sexuality education (CSE) prepares young people to make informed decisions about their sexuality. A review by the Institute for Research and Evaluation (IRE) that analysed 43 CSE studies in non-US settings found the majority to be ineffective and concluded that there was little evidence of the effectiveness of CSE. We reanalysed the review to investigate its validity. We found several weaknesses in the review’s methodology and analysis: (1) there was an absence of a clearly articulated search strategy and specific eligibility criteria; (2) the authors put forth criteria for programme effectiveness but included studies that did not collect the data needed to show programme effectiveness, and thus several studies were determined to be ineffective by default; (3) the analytical framework minimised positive intervention effects and privileged negative intervention effects; and (4) there were errors in the data extracted, with 74% of studies containing one or more discrepancies. Overall, our reanalysis reveals that the IRE review suffers from significant methodological flaws and contains many errors which compromise its conclusions about CSE. Our reanalysis is a tool for the international community to refute CSE opposition campaigns based on poor science.

Résumé

Comprehensive sexuality education (CSE) prepares young people to make informed decisions about their sexuality. A review by the Institute for Research and Evaluation (IRE), which analysed 43 CSE studies in countries other than the United States, found the majority of interventions to be ineffective and concluded that the effectiveness of CSE was scarcely established. We reanalysed the review to assess its validity. We found several weaknesses in the review’s methodology and analysis: (1) a clearly articulated search strategy and specific eligibility criteria were lacking; (2) the authors put forward criteria for programme effectiveness but included studies that had not collected the data needed to demonstrate programme effectiveness, such that several studies were judged ineffective by default; (3) the analytical framework minimised positive intervention effects and privileged negative ones; and (4) the extracted data contained errors, with 74% of studies containing one or more discrepancies. Overall, our analysis reveals that the IRE review suffers from major methodological flaws and contains many errors that compromise its conclusions about CSE. Our reanalysis is a tool that enables the international community to refute campaigns opposing CSE that are based on poor science.

Resumen

Comprehensive sexuality education (CSE) prepares young people to make informed decisions about their sexuality. A review by the Institute for Research and Evaluation analysed 43 CSE studies in non-US settings, found that the majority were not effective, and concluded that there was little evidence of the effectiveness of CSE. We reanalysed the review to investigate its validity. We found several weak points in the review’s methodology and analysis: (1) it lacked a clearly articulated search strategy and specific eligibility criteria; (2) the authors put forward criteria for programme effectiveness but included studies that did not collect the data needed to demonstrate programme effectiveness, so several studies were determined to be ineffective by default; (3) the analytical framework minimised positive intervention effects and privileged negative intervention effects; and (4) there were errors in the extracted data: 74% of studies contained one or more discrepancies. We provide a thorough reanalysis of the Institute’s CSE review, which had serious flaws in its methodology and was filled with misinformation. Our reanalysis is a tool the international community can use to refute anti-CSE opposition campaigns based on poor science.

Introduction

Comprehensive sexuality education (CSE) is defined as “a curriculum-based process of teaching and learning about the cognitive, emotional, physical, and social aspects of sexuality”.Citation1 CSE prepares children and young people to make informed and responsible decisions about their sexuality and lead healthy and safe lives. There is a robust evidence base and decades of programmatic learning that support CSE and its benefits for children and young people. A strong example of this is presented in UNESCO’s International Technical Guidance on Sexuality Education, Volume 1 (2009) where 87 studies were evaluated and nearly all were found to increase knowledge, while more than one fourth improved two or more sexual behaviours.Citation2 Goldfarb and Lieberman’s systematic review found support for CSE across a range of topics and ages including outcomes such as improved social/emotional learning and prevention of child sex abuse.Citation3 Despite the need for CSE and the growing body of evidence in support of its effectiveness, many young people do not receive CSE. This leaves them without proper understanding of their reproductive and sexual health and with dangerous knowledge gaps.

The failure to implement CSE, despite the large body of evidence supporting its effectiveness, is explained in part by the deep-seated opposition the issue has garnered over the years. While not new, opposition to CSE has become increasingly organised and visible.Citation4 Contrary to the prevailing evidence, opponents assert not only that CSE is ineffective, but that it can cause “long-term negative effects on health”.Citation5 In 2019, the Institute for Research and Evaluation (IRE), a group that identifies as “a non-profit research agency”, published a report titled “Re-Examining the Evidence for Comprehensive Sex Education in Schools: A Global Research Review” which reviewed the existing evidence on CSE and claimed to have found little evidence for the effectiveness of CSE.Citation6,Citation7 This report was published on the IRE website and in Issues in Law & Medicine; the former served as the source for our reanalysis.Citation7,Citation8 The IRE report reanalysed 43 international studies included in three authoritative reviews on CSE, including the UNESCO technical guidance, and concluded that the current evidence base does not support the effectiveness or acceptability of CSE.Citation2,Citation7 The authors reportedly found high rates of programme failure and harmful effects, and few cases of success.

Findings from the report are being used to advocate among UN member states against school-based CSE and have generated confusion and doubts about the state of the evidence on CSE. Recently, the report was used as grounds to oppose the renewal of the Eastern and Southern Africa Ministerial Commitment.Citation9,Citation10 The IRE report conclusions have been used as the basis for advocacy efforts against CSE programmes through many outlets, including expert testimonies to US legislative bodies.Citation6,Citation11

This paper reanalyses the evidence included in the IRE report to investigate the validity of its conclusions and the extent to which they reflect the body of evidence on CSE. Our manuscript is not a systematic review of CSE programmes in international settings, nor is it an evidence synthesis of CSE programme impacts. Instead, it is a reanalysis and assessment of the IRE report that seeks to investigate the rigour of the methods employed in that report and the accuracy of the conclusions reached, in an attempt to explain why the report’s findings and conclusions depart from those of well-established reviews that have undergone peer review.

We have also published a commentary in the Journal of Adolescent Health (JAH) that summarises overall findings presented in this paper and discusses the implications for CSE.Citation12 Unlike our JAH commentary, this manuscript provides a detailed critique of the IRE report with an in-depth explanation of our reanalysis methods and findings. Together, this manuscript and the JAH commentary complement one another in the reanalysis of the IRE report.

Our reanalysis had four objectives relating to the following four domains:

  1. Analytical framework: To evaluate whether the analytical framework was appropriate and consistently applied.

  2. Study selection and inclusion criteria: To determine whether the study selection process and inclusion criteria were based on standards typical of reviews.

  3. Accuracy of analysis: To assess if the analysis was done correctly and reflected the analytical framework used.

  4. IRE Report conclusions: To determine if the report’s conclusions flowed logically from the results.

Methods

We retrieved 42 (of a total of 43) non-US CSE studies evaluated in the IRE international report. The US and non-US studies were separate analyses performed by the IRE. We focused on the non-US CSE studies, given that the context of CSE in lower- and middle-income countries is quite different from that in the US. For our reanalysis, we used an adapted version of the AMSTAR criteria as an organising framework.Citation13 We, the authors of this manuscript, would like to acknowledge our affiliations with CSE; we work at institutions that support CSE and are advocates of evidence-based research.

To address the four objectives listed previously, the following steps were taken:

Objective 1 (Analytical framework)

Reviews rely on collecting outcome data from the included studies to assess the effect of interventions. An analytical framework helps define the nature of the data that will be collected and provides definitions of measures of intervention effect. The IRE report used an analytical framework that helped them define programme success. We extracted their analytical framework; the operational definitions of the indicators included in the framework; their methods for collecting, coding, and reporting the data; and the process they used to determine overall programme effect. We assessed whether the analytical framework was appropriate and coherent and used it to inform our study reanalysis.

Objective 2 (Study selection and inclusion criteria)

We attempted to reproduce the IRE search strategy by referring to the three existing research reviews the authors cited in the report as forming the basis of their study selection: (1) UNESCO’s International Technical Guidance on Sexuality Education, Volume 1;Citation2 (2) the CDC Community Preventive Services Task Force’s “The Effectiveness of Group-Based Comprehensive Risk Reduction and Abstinence Education Interventions to Prevent or Reduce the Risk of Adolescent Pregnancy, HIV, and STIs”;Citation14 and (3) US Department of Health and Human Services “Teen Pregnancy Prevention (TPP) Evidence Review” database.Citation15 If we did not successfully find a publication in one of the three reviews, we used other academic search engines.

We then extracted the study inclusion criteria and compared them against the PICOT (population, intervention, comparison, outcome, and time) elements of the research question. These are standard components of a research question that inform pre-specified eligibility criteria upon which studies are included or excluded.Citation16,Citation17

To assess whether the review used a comprehensive and reproducible search strategy that reduces risk of selection bias, we extracted and replicated their search strategy. We also explored whether the authors conducted a critical assessment of the included studies (such as quality or risk of bias assessmentsCitation18,Citation19) and whether these informed their analysis and presentation of results.

Objective 3 (Accuracy of analysis)

Data were extracted from each study and classified in accordance with the IRE analytical framework. To ensure accuracy, data extraction was conducted independently by the first and second authors, and their results were compared and compiled. Any discrepancies that arose in the data extraction were resolved by discussion.

We organised the results of the reanalysis into a data table similar to the one included in the IRE report to aid in comparison. We compared the data and recorded discrepancies between the IRE report and our study reanalysis. Then we performed a discrepancy analysis to quantify the scale of discrepancies. We calculated two measures: (1) percentage of studies that contained one or more discrepancies; and (2) percentage of indicators that were coded incorrectly. We also evaluated each study for overall programme effect based on the IRE analytical framework.

Objective 4 (IRE report conclusions)

We compared the conclusions in the IRE report against the data presented in their report to determine if their conclusions flowed logically from their results.

Results

Analytical framework

The IRE report relied on an analytical framework that defined intervention effect in terms of 10 indicators: (1) four “key protective” indicators, (2) four “less protective” indicators and (3) two “other” indicators. Table 1 presents these indicators (column 1) and the IRE report definitions of these indicators (column 2). The classification of these indicators was used to determine each study’s overall effect.

Table 1. Indicator definitions and measurements used.

The designation of indicators as key protective indicators vs less protective indicators was not justified in the report nor was it based on evidence from prevention research that privileges some indicators over others. The report additionally lacked definitions for the included indicators: out of ten indicators, we were able to locate definitions for only two within the report. The remaining indicators were not explicitly defined, which made it difficult to reanalyse the IRE report. For example, the “unprotected sex” indicator can be defined as protection from pregnancy, from sexually transmitted infections (including HIV), or both. There is clear overlap between these interpretations and the “any condom use” indicator. UNAIDS now uses the term “condomless sex” in place of “unprotected sex” to avoid confusion with the protection from pregnancy that is provided by other means of contraception.Citation20 Although we would define unprotected sex as sexual intercourse without any form of contraception, to align closely to the IRE report’s methodology we did not include condom use data for this indicator because: (1) the IRE included two other indicators that capture condom use data and (2) IRE condom use findings (continued or any condom use) did not align with their unprotected sex findings, which we took as a sign that they excluded condom use from this indicator’s definition. Column 3 in Table 1 summarises the study measurements we used for the undefined indicators in our study reanalysis. The absence of clear definitions and the ambiguity of some indicators made it difficult to interpret the results of the IRE report and to establish what data were collected for each indicator.

The authors used the 10 indicators to establish whether a CSE programme was deemed effective or ineffective. To be classified as producing a positive effect, the following criteria had to be met:Citation7

  • Produce a positive effect on entire study population (not just a subgroup); AND

  • Produce a positive effect at or beyond 12 months post-programme; AND

  • Produce a positive effect on one of the four key protective indicators

A CSE programme that produced a positive effect for one subgroup and not the entire population did not meet the criteria. Similarly, a programme that reduced “frequent/recent sex” and increased “any condom use” was still not classified as producing a positive effect because these were classified as less protective indicators.

In contrast, to be classified as producing a negative effect, the following criteria had to be met:Citation7

  • Negative effect on any substantial subgroup of study population; OR

  • Negative effect for any duration; OR

  • Negative effect on any indicator

Programmes with both positive indicator effects and negative indicator effects were classified as producing an overall negative programme effect. In addition, indicators that were outside those specified in the framework were only considered if they produced negative effects, whereas non-framework indicators for which programmes produced positive effects were not considered. For example, if a programme increased coerced sex, paid sex, or forced sex, the overall programme was classified as producing a negative effect.Citation21,Citation22 While this may be considered a reasonable classification, the same treatment was not afforded to programmes that showed positive effects for indicators that fell outside of the framework. For example, Díaz et al reported a significant increase in current use of modern contraceptive methods; however, the indicator was considered outside of the framework and so the whole study was not classified as positive.Citation23 This inconsistency in approach favoured negative effects and suppressed positive programme effects.

The IRE report included a third classification of studies which they called studies showing “evidence of programme potential”. These were programmes that failed to meet the three positive effect criteria mentioned above but that (1) produced a positive effect among a subgroup; or (2) produced a positive effect for less than 12 months; or (3) produced a positive effect for one of the less protective indicators. However, while the authors did acknowledge that such programmes showcased potential, they still classified them as “failures” in their conclusions about overall CSE programme effect. Further, six indicator measurements which were classified as evidence of programme potential actually demonstrated a positive effect for the entire study population, for more than 12 months post-programme.Citation22,Citation24–26 However, because these positive effects were for one of the less protective indicators, they were not classified as positive.
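The asymmetry in these decision rules can be made concrete in a short sketch (a schematic only, with hypothetical indicator labels and field names; it encodes the rules as stated in the IRE report, not any real study data):

```python
from dataclasses import dataclass

# Hypothetical labels standing in for the four "key protective" indicators.
KEY_PROTECTIVE = {"sexual_initiation", "condom_consistency", "pregnancy", "sti"}

@dataclass
class Finding:
    indicator: str          # e.g. "any_condom_use" (hypothetical label)
    direction: str          # "positive" or "negative"
    whole_population: bool  # effect observed for the full sample, not just a subgroup
    months_post: float      # follow-up time in months after the programme

def classify(findings):
    """Apply the IRE decision rules as described in the report."""
    # Negative verdict: ANY negative effect suffices (any subgroup,
    # any duration, any indicator) and overrides all positive effects.
    if any(f.direction == "negative" for f in findings):
        return "negative"
    positives = [f for f in findings if f.direction == "positive"]
    # Positive verdict: whole population AND >= 12 months AND a key protective indicator.
    if any(f.whole_population and f.months_post >= 12
           and f.indicator in KEY_PROTECTIVE for f in positives):
        return "positive"
    # Any other positive effect counts only as "evidence of programme potential".
    return "potential" if positives else "no effect"

# A sustained, population-wide gain on a less protective indicator is
# downgraded to "potential"; a single negative finding on any indicator
# flips an otherwise qualifying programme to "negative".
print(classify([Finding("any_condom_use", "positive", True, 24)]))   # -> potential
print(classify([Finding("pregnancy", "positive", True, 12),
                Finding("paid_sex", "negative", False, 6)]))         # -> negative
```

The sketch makes the imbalance visible: the positive branch is a conjunction of three conditions, while the negative branch is a disjunction that short-circuits everything else.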

Study selection and eligibility criteria

Instead of articulating a clear search strategy, as good practice would require, the IRE report authors stated that they relied on three existing reviews to inform their study selection. The CDC review included only US-based studies, which fell outside the scope of the IRE analysis of non-US studies. UNESCO’s technical guidance included a total of 45 articles on CSE in non-US school-based settings.Citation2 Although there was no clear indication of which review each study report was drawn from, we found that 19 of the 45 eligible articles in the UNESCO review were included in the IRE report. The US Department of Health’s Teen Pregnancy Prevention (TPP) database included 14 articles on CSE in non-US school-based settings,Citation15 of which only three appeared in the IRE report; two of those three were found in both the UNESCO technical guidance and the TPP database. This accounts for 20 unique studies, leaving 23 of the 43 studies in the IRE report that were not found in either review. Appendix 1 details which studies were found in the UNESCO review and TPP database. Of the 43 studies included in the report, we were able to retrieve only 42 study reports, owing to missing citation information.

It was unclear from the IRE report how study selection took place and why some articles were included while others were not. The authors did not document the search process in sufficient detail, which hampered our attempt to reproduce their search. For the studies that did not appear in either of the two reviews, the authors did not describe (1) their search strategy; (2) whether they followed established protocols for selecting studies (such as screening titles and/or abstracts and conducting full-text screenings of potentially eligible study reports); or (3) who determined study eligibility and how discrepancies were resolved if more than one person was involved in the process. Given the polarised environment of CSE research, we did not reach out to the authors for additional information on their search strategy.

We also found that there was an absence of clearly articulated eligibility criteria which should have reflected the review’s overarching research question. The authors defined a successful programme as one producing “effects sustained at least 12 months after the programme on a key protective indicator”.Citation7 The time element of their PICOT research question is thus defined by the 12 months metric and should accordingly form the basis of their inclusion criteria. Yet, many studies that were included did not take measurements at or beyond 12 months. We found that only 21 of the included study reports (50%) took measurements at or beyond 12 months. Of the remaining studies, 16 did not collect measurements at or beyond 12 months and five did not specify a follow-up period. Accordingly, 50% of the studies were ineligible, rendering them ineffective by default according to the authors’ definition of programme effectiveness. Although these studies should have been automatically disqualified from inclusion by the IRE’s analytical framework, we still evaluated these “disqualified” studies in our reanalysis to provide a thorough analysis across all objectives. Moreover, the outcomes that the authors pre-specified – one of the four key protective indicators listed in Table 1 – were missing from two of the included studies. Those two were again rendered ineffective by default.

Additionally, the authors included all eligible studies without evidence of carrying out any quality assessments or appraisals. There are multiple ways to conduct eligibility and quality assessments; different authors of CSE systematic reviews have made decisions to include or exclude studies of low quality.Citation27,Citation28 However, the IRE authors did not describe any sort of assessment. The inclusion of all eligible studies without consideration for quality or risk of bias may very well have introduced systematic bias. We found considerable heterogeneity in the quality of the included studies, and several had serious flaws. For example, some low-quality studies had smaller sample sizes or were purely descriptive without employing robust statistical tests. These low-quality studies were given the same weight in the IRE review as the higher-quality studies. There was no effort to either exclude them from the analysis or present a stratified analysis that accounted for variation in study quality. Nor was there a narrative discussion of risk of bias anywhere in the IRE report.

Accuracy of analysis

In the previous section, we pointed to limitations of the analytical framework put forth by the IRE. After applying the IRE framework to the body of articles included, we further identified several errors in the way indicators and studies were classified. There were gross discrepancies between the IRE report and our reanalysis in how studies were classified overall, with the IRE report identifying one additional study with negative effects and three fewer studies with positive effects. The IRE report classified indicators as producing (1) a negative effect; (2) a positive effect; (3) evidence of programme potential; (4) both positive and negative effects; (5) not significant (NS) effects; or (6) not measured (NM).Citation7 Their data tables suffer from a lack of important details around effect size and statistical significance, which are critical elements of any rigorous analysis.

Our reanalysis mirrored their approach, gathering data in relation to negative effects, the four key protective indicators, the four less protective indicators, and dual benefit. We compared our findings to those presented in the IRE data table and generated a detailed report of all discrepancies between our study reanalysis and the IRE report (Appendix 2). We found that a total of 31 of the 42 (73.8%) study reports contained one or more errors.
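Both discrepancy measures are simple proportions. As a minimal check (the 31/42 count is the figure reported above; the indicator-level counts are placeholders, since the per-indicator detail sits in Appendix 2):

```python
# Measure 1: share of study reports containing at least one discrepancy.
studies_with_errors, total_studies = 31, 42
print(f"{studies_with_errors / total_studies:.1%}")   # -> 73.8%

# Measure 2: share of indicator codings that disagree with the source studies.
# Counts below are illustrative only (42 studies x 10 indicators = 420 codings).
miscoded, total_codings = 60, 42 * 10
print(f"{miscoded / total_codings:.1%}")              # -> 14.3%
```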

In both the IRE report and our study reanalysis, the overall programme effect was determined by considering all 10 indicator findings and applying the IRE analytical framework. As noted above, programmes with both positive and negative indicator effects were classified by the IRE analytical framework as producing a negative overall programme effect. The IRE analytical framework only considered the overall outcome of a programme as positive or negative. To provide a more nuanced analysis, we also noted overall programme effects demonstrating evidence of overall programme potential in our study replication; the IRE did not assess this.

The overall programme effects found in our study reanalysis were compared to those found in the IRE report. This comparison is summarised in Table 2. The IRE report found that three programmes produced an overall positive effect, while we found six. Additionally, the IRE report found that eight programmes produced a negative effect, while our study reanalysis found seven. In 19 of the studies, we found evidence of programme potential (which they defined as a programme that produced positive effects either among a subgroup, for less than 12 months, or for one of the less protective indicators). These 19 programmes were entirely overlooked in the IRE analysis despite producing positive effects, as they failed to meet all the criteria for programme success outlined by the IRE.

Table 2. Overall programme effect comparison.

IRE report conclusions

The IRE report conclusions about international CSE programmes were compared to the IRE report data on programme effect to determine whether conclusions accurately reflected the data presented. Note, this comparison did not take into consideration the discrepancies we identified in the IRE report. Overall, we found that the report’s conclusions about the number of programmes that produced positive or negative impacts aligned with the data they presented. However, we found errors in the conclusions they made about individual indicators, including continued condom use (CCU) and any condom use. The IRE report conclusions claimed no programmes showed an increase in CCU “for any period of time or any subgroup”. However, the IRE data table presents programmes that have shown increases.Citation29 Additionally, the IRE report conclusions state only one programme showed an increase in any condom use. However, the IRE report found 10 study reports that demonstrated evidence of programme potential (the best classification for a less protective indicator within the IRE framework);Citation25,Citation26,Citation30–37 five of these measurements were taken for the entire study population.Citation25,Citation30,Citation31,Citation33,Citation36

In the overall conclusions of the report, the authors state that “of the 43 studies that evaluated 39 school-based CSE programmes outside the United States, three programmes produced positive impact 12 months after the program, on a key protective outcome, for the intended population”.Citation7 Later in the report, they acknowledge that of the 43 studies, only 27 actually took measurements for a key protective indicator, at least 12 months post-programme, and for the intended population, which is how they defined success in their analytical framework. According to our reanalysis, 27 is also incorrect. Only 19 study reports were on programmes that took measurements at or beyond 12 months and on key protective indicators, and three of these were for the same programme (Mathews et al), further reducing the total number of programmes satisfying these criteria to 17. These 17 studies should have been the only studies included in the analysis and should have formed the denominator for their success rates, rather than 43. According to our study reanalysis, five of these 17 programmes (29% success rate) produced a positive impact 12 months after the programme, on a key protective outcome, and for the intended population, whereas the authors concluded that they found a very high failure rate of 89%.Citation7 The 89% failure rate grossly misrepresents the overall evidence, not least because many of the studies were not true “failures” and actually showed evidence of programme potential according to the IRE report’s own classification. They just did not meet the stringent definition of success that the IRE used.
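The sensitivity of the headline rates to the choice of denominator can be checked directly from the counts above (this is arithmetic on the figures already stated, not new data):

```python
successes_reanalysis = 5   # programmes with a positive impact per the IRE's own success definition
eligible_programmes = 17   # programmes that measured a key protective indicator at >= 12 months
all_studies = 43           # the denominator implied by the IRE's headline claim

# Success rate over the programmes that could actually qualify:
print(f"{successes_reanalysis / eligible_programmes:.0%}")   # -> 29%

# The IRE's three reported successes spread over all 43 studies, including
# those that could never have met the success criteria:
print(f"{3 / all_studies:.0%}")                              # -> 7%
```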

Discussion

This manuscript set out to examine the validity of a recent review of CSE programmes undertaken by the Institute for Research and Evaluation.Citation7 We reanalysed the report and assessed four aspects of the review’s methodology and findings: (1) analytical framework; (2) study selection and inclusion; (3) accuracy of findings; and (4) overall conclusions. Our reanalysis revealed several inconsistencies and errors in each of the four elements examined that together cast doubt on the validity of the report’s conclusions about CSE programmes in non-US settings.

With respect to the analytical framework, we found several issues with the way data were interpreted and weighed. First, the framework lacked crucial operational definitions, which complicated our reanalysis attempts. Second, the framework privileged some indicators over others without justifying this with evidence from prevention research or behaviour change literature. This minimised positive effects and exaggerated negative effects of interventions. Moreover, no indicators that measured changes in knowledge and attitudes, which can serve as a proxy for behaviour change in some cases, were included.Citation38–41 In fact, 88% of studies that were included in the review measured knowledge and/or attitudes, yet none of this evidence was featured in the report.

Changes in knowledge and attitudes may or may not impact practices, depending on the broader environments in which individuals are nested – a reality that the authors of the IRE report failed to acknowledge when interpreting their results.Citation42 For example, while CSE programmes may advance young people’s knowledge of safe sex, they may fail to translate into behavioural gains if young people lack access to confidential services, if contraceptive methods are not readily available, or if supportive laws and policies that enable access to contraceptives are absent. In other words, it is important to see CSE as one element in a package of interventions that includes building the knowledge and skills of young people, investing in their social networks and assets, and providing a supportive environment that includes safe and confidential health services.Citation43 In reality, many adolescents and young people find that the systems and institutions around them are not geared towards meeting their needs. We would suggest that it is unrealistic to expect interventions that focus on improving adolescents’ knowledge and attitudes alone to be silver bullets if the broader environment is not conducive to behaviour change, but excluding improvements in knowledge and attitudes as positive outcomes of an intervention is equally limiting.

We found that the IRE report did not adhere to standards typical of scientific reviews. In terms of study selection, the report did not provide important bibliographic information for the 43 studies that were included, which prevented retrieval of the full list of studies. There was an absence of a clearly articulated search strategy and a lack of documentation of the exact criteria used to determine study eligibility for inclusion. Moreover, the authors included studies that did not gather data at 12 months, which was a requirement for programme success, as well as studies that did not collect data on the indicators needed to show programme effectiveness.

In terms of accuracy of findings, our reanalysis showed that the IRE’s analysis of studies contained many errors, as demonstrated by our finding that 74% of studies contained one or more discrepancies. Finally, the IRE report’s conclusions did not entirely align with the data presented and inaccurately portrayed the collective body of evidence examined.

Taken together, our findings indicate that the IRE analysis falls short of meeting the scientific standards necessary to inform recommendations on CSE programmes.

The findings of the IRE report depart markedly from those of other reviews on CSE effectiveness, which underwent peer-review processes that verified their validity and rigour. Several systematic reviews of CSE interventions produced over the past two and a half decades investigate and confirm the effectiveness of CSE. For example, in 1997, Grunseit et al found that of 47 studies evaluating CSE programmes, 17 reported programmes that delayed sexual activity, decreased the number of sexual partners, or reduced unplanned pregnancy and rates of sexually transmitted diseases.Citation44 In 2007, Kirby et al reviewed 83 studies assessing the impact of sexuality and HIV education on practices among young people and found that the majority of programmes significantly improved at least one sexual behaviour: they delayed or decreased sexual behaviours or increased condom or contraceptive use.Citation45 Another 2007 review, by Underhill et al, evaluated the impact of abstinence-plus interventions on HIV prevention and found that 23 of 39 trials had a positive effect on at least one sexual behaviour.Citation46 In 2014, Fonner et al meta-analysed 33 studies of school-based CSE interventions to evaluate their efficacy in changing HIV-related knowledge and risk behaviours and found that students who received CSE interventions reported significantly higher condom use, fewer sexual partners, and lower rates of sexual initiation.Citation47 Most recently, in 2021, Goldfarb and Lieberman’s systematic review of 80 studies evaluating school-based CSE programmes over the past three decades found that CSE programmes improved several outcomes, including reductions in reports of dating violence (51 studies) and increased effectiveness of child sex abuse prevention (16 studies).Citation3

Strengths and limitations

Our reanalysis has several strengths. First, to assess the review’s methodology and reporting, we used evidence-based guidance that outlines essential practices in the conduct of scientific reviews.Citation48,Citation49 Second, two researchers independently conducted data extraction and analysis, which minimised errors in reporting and reduced subjectivity. One limitation was our inability to retrieve all 43 studies included in the IRE report, because the authors did not provide complete and accurate bibliographic information. Nonetheless, we made serious efforts to analyse the report to the fullest extent possible and were able to locate 42 of the 43 study reports. A second limitation is that we did not conduct independent quality assessments of the studies. We made this choice to maintain focus on reanalysing the IRE report, which did not carry out quality assessments, but it remains a significant limitation of both analyses. A third limitation is that we did not use alternative methods or frameworks to synthesise our findings; again, this was done to stay faithful to the methods employed by the IRE in the spirit of repeating their analysis. It is critical that our reanalysis data are not taken as an independent systematic review reflecting the CSE evidence base; our manuscript should be considered solely an analysis and critique of the IRE report. We note as a group of authors that we support the use of CSE. The polarised nature of debate in the field meant that we were not able to contact the IRE authors for clarification where information in the report was uncertain.

Conclusion

Our re-review was undertaken to assess the methodology and findings of the IRE report on CSE, which concludes that there is little evidence of CSE’s effectiveness. As researchers in this field, we found this conclusion at odds with our own experience and knowledge of the evidence base. A clear body of evidence and extensive programmatic experience speak to the benefits of CSE for children and young people. Despite its lack of adherence to standards of scientific rigour, the IRE report has been shared through news outlets and used by other anti-CSE organisations as grounds to oppose the renewal of policies on CSE programmes.Citation9,Citation10,Citation50 The IRE is actively working with UN member states to influence policymaking and decisions around funding allocation to CSE. The IRE’s claim that the state of the evidence discredits CSE underscores the need for independent and rigorous reanalysis, a cornerstone of the scientific process. Our reanalysis sheds light on the extent of the errors in the IRE’s methods and the inaccuracies in its findings, which together compromise the validity of the report’s overall conclusions against the effectiveness of CSE.

Implications and contribution

This manuscript is the first to provide a thorough analysis of CSE misinformation research and will be a critical tool for the international community to refute CSE opposition campaigns that are becoming better organised and resourced.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the World Health Organization [Human Reproduction Programme Trust Fund].

Notes

* A MeaSurement Tool to Assess systematic Reviews

References

  • Herat J, Plesons M, Castle C, et al. The revised international technical guidance on sexuality education – a powerful tool at an important crossroads for sexuality education. Reprod Health. 2018;15(1):185, doi:10.1186/s12978-018-0629-x
  • UNESCO. International technical guidance on sexuality education: an evidence-informed approach for schools, teachers and health educators. Published online December 2009.
  • Goldfarb ES, Lieberman LD. Three decades of research: the case for comprehensive sex education. J Adolesc Health. 2021;68(1):13–27. doi:10.1016/j.jadohealth.2020.07.036
  • School-Based Sexuality Education: The Issues and Challenges. Guttmacher Institute. Published June 15, 2005. [cited 2021 September 5]. Available from: https://www.guttmacher.org/journals/psrh/1998/07/school-based-sexuality-education-issues-and-challenges.
  • Family Watch International. Stop CSE: 15 Harmful Elements of CSE. StopCSE.org. [cited 2023 June 1]. Available from: https://www.comprehensivesexualityeducation.org/15-harmful-elements-of-cse/.
  • About Us. The Institute for Research & Evaluation. [no date; cited 2021 October 6]. Available from: https://www.institute-research.com/aboutus.php?menu=m6.
  • Ericksen IH, Weed SE. Re-examining the evidence for comprehensive sex education in schools: a global research review. Inst Res Eval. 2019;23. Available from: https://www.institute-research.com/CSEReport/Global_CSE_Report_12-17-19.pdf.
  • Weed S, Ericksen I. Re-examining the evidence for school-based comprehensive sex education: a global research review. Issues Law Med. 2019;34(2):161–182. Available from: https://issuesinlawandmedicine.com/product/ericksen-re-examining-the-evidence-for-school-based-comprehensive-sex-education/ and https://pubmed.ncbi.nlm.nih.gov/33950605/.
  • Family Watch International. Devious ESA Commitment.; 2021 [cited 2021 December 5]. Available from: https://familywatch.org/2021/12/01/devious-esa-commitment/#.Ya7CEL5KiM8.
  • CSE Research. Devious ESA Commitment. [no date; cited 2021 December 6]. Available from: https://deviousesacommitment.org/cse-research/.
  • Is ‘Comprehensive Sex Education’ effective – as its proponents claim? Daily Citizen. Published September 30, 2021. [cited 2021 October 19]. Available from: https://dailycitizen.focusonthefamily.com/is-comprehensive-sex-education-effective-as-its-proponents-claim/
  • VanTreeck K, Elnakib S, Chandra-Mouli V. Flaws and errors identified in the Institute for Research and Evaluation report that challenges non-U.S., school-based comprehensive sexuality education evidence base. J Adolesc Health. 2023;72(3):332–333. doi:10.1016/j.jadohealth.2022.11.244
  • AMSTAR. AMSTAR Checklist. AMSTAR. [cited 2022 May 6]. Available from: amstar.ca/Amstar_Checklist.php.
  • Chin HB, Sipe TA, Elder R, et al. The effectiveness of group-based comprehensive risk-reduction and abstinence education interventions to prevent or reduce the risk of adolescent pregnancy, human immunodeficiency virus, and sexually transmitted infections. Am J Prev Med. 2012;42(3):272–294. doi:10.1016/j.amepre.2011.11.006
  • U.S. Department of Health and Human Services. Teen pregnancy prevention evidence review: study search. youth.gov. Published October 2016. [cited 2021 October 6]. Available from: https://tppevidencereview.youth.gov/StudyDatabase.aspx.
  • Thomas J, Kneale D, McKenzie JE, et al. Determining the scope of the review and the questions it will address. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane handbook for systematic reviews of interventions. Hoboken (NJ): John Wiley; 2019. p. 13–31. doi:10.1002/9781119536604.ch2
  • McKenzie JE, Brennan SE, Ryan RE, et al. Defining the criteria for including studies and how they will be grouped for the synthesis. Wiley Online Library. Published September 20, 2019. [cited 2021 Sep 12]. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/9781119536604.ch3.
  • Higgins JP, Savović J, Page MJ, et al. Assessing risk of bias in a randomized trial. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane handbook for systematic reviews of interventions. Hoboken (NJ): John Wiley; 2019. p. 205–228. doi:10.1002/9781119536604.ch8
  • Kennedy CE, Fonner VA, Armstrong KA, et al. The evidence project risk of bias tool: assessing study rigor for both randomized and non-randomized intervention studies. Syst Rev. 2019;8(1):3. doi:10.1186/s13643-018-0925-0
  • Joint United Nations Programme on HIV/AIDS. UNAIDS Terminology Guidelines. UNAIDS; 2015. Accessed July 21, 2023. https://www.unaids.org/sites/default/files/media_asset/2015_terminology_guidelines_en.pdf
  • Jewkes R, Nduna M, Levin J, et al. Impact of stepping stones on incidence of HIV and HSV-2 and sexual behaviour in rural South Africa: cluster randomised controlled trial. Br Med J. 2008;337:a506. doi:10.1136/bmj.a506
  • Visser M. HIV/AIDS prevention through peer education and support in secondary schools in South Africa. SAHARA-J J Soc Asp HIVAIDS. 2007;4(3):678–694. doi:10.1080/17290376.2007.9724891
  • Díaz M, Mello Md, Sousa Md, et al. Outcomes of three different models for sex education and citizenship programs concerning knowledge, attitudes, and behavior of Brazilian adolescents. Cad Saúde Pública. 2005;21(2):589–597. doi:10.1590/S0102-311X2005000200026
  • Jemmott JB, Jemmott LS, O’Leary A, et al. HIV/STI risk-reduction intervention efficacy with South African adolescents over 54 months. Health Psychol. 2015;34(6):610–621. doi:10.1037/hea0000140
  • Magnani R, MacIntyre K, Karim AM, et al. The impact of life skills education on adolescent sexual risk behaviors in KwaZulu-Natal, South Africa. J Adolesc Health. 2005;36(4):289–304. doi:10.1016/j.jadohealth.2004.02.025
  • Maticka-Tyndale E, Wildish J, Gichuru M. Thirty-month quasi-experimental evaluation follow-up of a national primary school HIV intervention in Kenya. Sex Educ. 2010;10(2):113–130. doi:10.1080/14681811003666481
  • Hindin MJ, Kalamar AM. Detailed methodology for systematic reviews of interventions to improve the sexual and reproductive health of young people in low- and middle-income countries. J Adolesc Health. 2016;59(3):S4–S7. doi:10.1016/j.jadohealth.2016.07.009
  • Levy JK, Darmstadt GL, Ashby C, et al. Characteristics of successful programmes targeting gender inequality and restrictive gender norms for the health and wellbeing of children, adolescents, and young adults: a systematic review. Lancet Glob Health. 2020;8(2):e225–e236. doi:10.1016/S2214-109X(19)30495-4
  • Cartagena RG, Veugelers PJ, Kipp W, et al. Effectiveness of an HIV prevention program for secondary school students in Mongolia. J Adolesc Health Off Publ Soc Adolesc Med. 2006;39(6):925.e9–16. doi:10.1016/j.jadohealth.2006.07.017
  • Ajuwon AJ, Brieger WR. Evaluation of a school-based reproductive health education program in rural south western, Nigeria. Afr J Reprod Health. 2007;11(2):47–59. doi:10.2307/25549715
  • Harvey B, Stuart J, Swan T. Evaluation of a drama-in-education programme to increase AIDS awareness in South African high schools: a randomized community intervention trial. Int J STD AIDS. 2000;11:105–111. doi:10.1177/095646240001100207
  • James S, Reddy P, Ruiter RAC, et al. The impact of an HIV and AIDS Life Skills program on secondary school students in KwaZulu–Natal, South Africa. AIDS Educ Prev. 2006;18(4):281–294. doi:10.1521/aeap.2006.18.4.281
  • Merakou K, Kourea-Kremastinou J. Peer education in HIV prevention: an evaluation in schools. Eur J Public Health. 2006;16(2):128–132. doi:10.1093/eurpub/cki162
  • Okonofua FE, Coplan P, Collins S, et al. Impact of an intervention to improve treatment-seeking behavior and prevent sexually transmitted diseases among Nigerian youths. Int J Infect Dis. 2003;7(1):61–73. doi:10.1016/s1201-9712(03)90044-0
  • Stanton BF, Li X, Kahihuata J, et al. Increased protected sex and abstinence among Namibian youth following a HIV risk-reduction intervention: a randomized, longitudinal study. AIDS. 1998;12(18):2473–2480. doi:10.1097/00002030-199818000-00017
  • Taylor M, Jinabhai C, Dlamini S, et al. Effects of a teenage pregnancy prevention program in KwaZulu-Natal, South Africa. Health Care Women Int. 2014;35(7–9):845–858. doi:10.1080/07399332.2014.910216
  • Duflo E, Dupas P, Kremer M, et al. Education and HIV/AIDS prevention: evidence from a randomized evaluation in Western Kenya. World Bank Policy Research Paper No. 4024. 2006: 1–33.
  • DiClemente R. Review: reducing adolescent sexual risk: a theoretical guide for developing and adapting curriculum-based programs. Published online October 18, 2011.
  • Bettinghaus EP. Health promotion and the knowledge-attitude-behavior continuum. Prev Med. 1986;15(5):475–491. doi:10.1016/0091-7435(86)90025-3
  • Haberland N. The case for addressing gender and power in sexuality and HIV education: a comprehensive review of evaluation studies. Int Perspect Sex Reprod Health. 2015;41(1):31–42. doi:10.1363/4103115
  • Haberland N, Rogow D. Sexuality education: emerging trends in evidence and practice. J Adolesc Health. 2015;56(1):S15–S21. doi:10.1016/j.jadohealth.2014.08.013
  • Gallant M, Maticka-Tyndale E. School-based HIV prevention programmes for African youth. Soc Sci Med. 2004;58(7):1337–1351. doi:10.1016/S0277-9536(03)00331-9
  • World Health Organization. WHO recommendations on adolescent sexual and reproductive health and rights. Geneva (CH): World Health Organization; 2018.
  • Grunseit A, Kippax S, Aggleton P, et al. Sexuality education and young people’s sexual behavior: a review of studies. J Adolesc Res. 1997;12(4):421–453. doi:10.1177/0743554897124002
  • Kirby DB, Laris BA, Rolleri LA. Sex and HIV education programs: their impact on sexual behaviors of young people throughout the world. J Adolesc Health Off Publ Soc Adolesc Med. 2007;40(3):206–217. doi:10.1016/j.jadohealth.2006.11.143
  • Underhill K, Operario D, Montgomery P. Systematic review of abstinence-plus HIV prevention programs in high-income countries. PLoS Med. 2007;4(9):e275. doi:10.1371/journal.pmed.0040275
  • Fonner VA, Armstrong KS, Kennedy CE, et al. School based sex education and HIV prevention in low- and middle-income countries: a systematic review and meta-analysis. Vermund SH, ed. PLoS ONE. 2014;9(3):e89692. doi:10.1371/journal.pone.0089692
  • Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Syst Rev. 2021;10(1):89. doi:10.1186/s13643-021-01626-4
  • Higgins JP, Thomas J, Chandler J, et al. Cochrane handbook for systematic reviews of interventions. Hoboken (NJ): John Wiley; 2019.
  • Johnston J. Is “Comprehensive Sex Education” effective – as its proponents claim? Focus on the Family’s The Daily Citizen. Published September 30, 2021. [cited 2021 October 27] Available from: https://dailycitizen.focusonthefamily.com/is-comprehensive-sex-education-effective-as-its-proponents-claim/.

Appendices

Appendix 1: Studies found in UNESCO and TPP review documents

Table A1: Study source list

Appendix 2: Detailed discrepancy report

Table B1: Discrepancy summary