Original Articles

Impaired validity of the new FIGO and Swedish CTG classification templates to identify fetal acidosis in the first stage of labor

Pages 4853-4860 | Received 21 Aug 2020, Accepted 23 Dec 2020, Published online: 06 Jan 2021

Abstract

Introduction

Cardiotocography (CTG) is the main method of intrapartum fetal surveillance. In 2015 a new guideline was introduced by the International Federation of Gynecology and Obstetrics (FIGO), FIGO-15. In Sweden it was adjusted to SWE-17, replacing the previous national template, SWE-09. This study, conducted at one university hospital and one regional hospital in southern Sweden, evaluated the diagnostic validity of these three templates to detect fetal acidosis during the first stage of labor.

Material and methods

A total of 73 neonates with pH <7.1 in umbilical cord artery or vein at cesarean delivery during the first stage of labor were identified retrospectively. For each acidotic neonate, three non-acidemic neonates, with a pH ≥7.2 in cord artery and vein, and Apgar scores ≥9 at five and ten minutes, in all 219 neonates, were selected. The CTG tracings before birth in acidemic neonates, and tracings at the same cervical dilatation in the non-acidemic neonates, were independently assessed by three professionals from the obstetric staff, blinded to group and clinical data. Based on their categorizations of the included variables (baseline, variability, accelerations, decelerations and contraction rate), each CTG tracing was systematically classified according to the three templates. The sensitivity and specificity to identify acidemia by the classification pathological were determined for each template. Interobserver agreement in the assessments of tracings as pathological or not was analyzed, using free-marginal Kappa index.

Results

The sensitivity for patterns classified as pathological to identify acidemia was similar for FIGO-15 (71%) and SWE-17 (77%, p = .13), and the specificity was 97% for both. SWE-09 had a significantly higher sensitivity (95%, p < .001) albeit with a lower specificity (90%, p < .001) than the other two templates. Among acidemic neonates, the fraction of tracings classified as normal was higher with SWE-17 (9.6%) than with SWE-09 (0%; p = .01) and FIGO-15 (1.4%; p = .06). For tracings from neonates with acidemia, agreement for three independent assessors was strong (κ 0.85) with SWE-09, and weak for FIGO-15 (κ 0.47) and SWE-17 (κ 0.51). For tracings from neonates without acidemia, the agreement was almost perfect for FIGO-15 (κ 0.91), strong with SWE-17 (κ 0.90) and moderate with SWE-09 (κ 0.78).

Conclusions

The ability of FIGO-15 and SWE-17 to identify fetal acidosis is considered insufficient. The combination of a high sensitivity and a high specificity makes SWE-09 the most discriminatory template during the first stage of labor.

Introduction

Cardiotocography (CTG) is the gold standard for intrapartum fetal surveillance even though randomized trials show modest benefits [Citation1–4]. The primary goal of reducing the burden of perinatal mortality and long-term sequelae while avoiding unnecessary obstetric intervention has proven difficult to achieve [Citation4–8]. Still, large population-based studies have indicated that the use of electronic fetal monitoring (EFM) is associated with decreased early neonatal mortality and morbidity [Citation9].

Many efforts have been made to improve the efficacy of EFM, resulting in national guidelines with some variations in interpretation of the different CTG variables.

In 2015, the International Federation of Gynecology and Obstetrics (FIGO) published a new guideline on how to use CTG, with a new interpretation template, FIGO-15 [Citation10]. It was the first update from FIGO since 1987 [Citation11]. The main objective behind this revision was to increase the effectiveness of EFM by creating objective definitions of the different features of CTG and by increasing intra- and interobserver agreement. The new template is a three-tier system, created by an international consensus panel [Citation10].

In 2015, a national template, SWE-09, had been in use in Sweden since 2009 [Citation12]. It was a version of the FIGO template from 1987 [Citation11] adjusted to accommodate the fetal ECG ST analysis (STAN) algorithm from 2007 [Citation13]. A modification of FIGO-15, SWE-17, was introduced in Sweden during 2017 [Citation14] and is now the national standard interpretation template. SWE-17 was introduced out of a wish to adhere to the new international guideline, and with the consideration that a low specificity for the SWE-09 template might cause unnecessary interventions. In Denmark, it was decided to await evidence before adopting the new FIGO guideline [Citation15].

All three templates include the parameters baseline fetal heart rate, variability and decelerations. SWE-09 also includes the parameters accelerations and contraction rate. The two new templates accept a wider range of normal baseline heart rate than SWE-09 and differ in their classification of decelerations in that they stipulate a minimum frequency and duration for most decelerations in order to be deemed pathological. The three templates are summarized in Figure 1.

Figure 1. Criteria of the three interpretation templates for the different classifications.


Since none of these templates had been evaluated before being taken into clinical use, and since CTG patterns are often markedly different in the first and second stage of labor, we planned two studies evaluating the sensitivity and specificity in the first and second stage of labor, respectively. The present study focuses on the first stage.

During the first stage of low-risk labor, Swedish guidelines recommend intermittent CTG monitoring every second hour and auscultation every 15–30 min in between [Citation14], which has been shown to be as safe as continuous fetal monitoring in low-risk labor [Citation16]. If the tracing is not classified as normal, extended monitoring is recommended. Hence, from low-risk labors with normal fetal heart rates, relatively short CTG tracings are recorded in the first stage of labor. Continuous CTG is recommended in high-risk labor [Citation14].

The primary objective of this study was to compare the templates, SWE-09, FIGO-15 and SWE-17, regarding sensitivity and specificity in identifying acidosis during the first stage of labor.

The null hypothesis was that sensitivity and specificity do not differ between the three classification systems.

Materials and methods

This study includes intrapartum CTG tracings recorded at Helsingborg Hospital between March 13th 2012 and December 31st 2016, and at Skåne University Hospital between April 23rd 2013 and October 31st 2017, both in Region Skåne, Sweden.

Neonates born by emergency cesarean section during the first stage of labor and having an umbilical cord arterial or venous pH <7.10 were identified. The first stage of labor was defined as regular contractions and a cervical dilation of three to nine centimeters. Further inclusion criteria were singleton pregnancy, gestational age ≥34 + 0 weeks and an available CTG tracing >15 min prior to delivery, with a maximum delay of 30 min between the end of the CTG tracing and delivery.

Normally, fetal pH is higher during the first stage of labor than at birth, since fetal pH declines and lactate increases during the second stage of labor [Citation17,Citation18]. A cord artery pH <7.10 has been associated with an increased risk of adverse neurological outcomes [Citation19] and was therefore the cutoff used in this study as an indicator of exposure to hypoxia.

For each neonate born with acidemia, the first three neonates born after the acidemic neonate at the same unit and fulfilling the inclusion criteria were included. Inclusion criteria for the non-acidemic group were singleton birth ≥34 + 0 weeks, pH ≥7.20 in cord artery and vein, and Apgar score ≥9 at five and ten minutes. Furthermore, the non-acidemic neonates had to have an available CTG tracing for at least 15 min at the same cervical dilation as its corresponding acidemic neonate. Thus, non-acidemic neonates were matched to the acidemic neonates for the evaluation of CTG at the same cervical dilation, i.e. only including tracings from the first stage.
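The control selection described above can be sketched in a few lines. This is a minimal illustration only, not the authors' actual procedure: the `Birth` record, its field names and the `has_matching_ctg` flag (standing in for "a tracing of at least 15 min at the same cervical dilation as the case") are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Birth:
    # Hypothetical record layout; field names are assumptions for illustration.
    birth_time: datetime
    unit: str
    singleton: bool
    gestational_days: int   # completed days of gestation
    ph_artery: float
    ph_vein: float
    apgar_5: int
    apgar_10: int
    has_matching_ctg: bool  # assumed flag: >= 15 min tracing at the case's dilation


def is_eligible_control(b: Birth) -> bool:
    """Inclusion criteria for the non-acidemic group as stated in the text."""
    return (
        b.singleton
        and b.gestational_days >= 34 * 7      # >= 34 + 0 weeks
        and b.ph_artery >= 7.20
        and b.ph_vein >= 7.20
        and b.apgar_5 >= 9
        and b.apgar_10 >= 9
        and b.has_matching_ctg
    )


def select_controls(case: Birth, births: List[Birth], n_controls: int = 3) -> List[Birth]:
    """Return the first n eligible neonates born after the case at the same unit."""
    later = sorted(
        (b for b in births if b.unit == case.unit and b.birth_time > case.birth_time),
        key=lambda b: b.birth_time,
    )
    return [b for b in later if is_eligible_control(b)][:n_controls]
```

In practice the CTG criterion depends on the individual case, so the boolean flag is a simplification of what would be a per-case lookup.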

A total of 57,582 neonates were born at the two hospitals during the study period. Of 1470 neonates born with an umbilical cord arterial or venous pH <7.10, 126 (8.6%) were delivered by cesarean in the first stage of labor. The study included 73 acidemic neonates and 219 non-acidemic neonates fulfilling the inclusion criteria.

Background data are shown in Table 1. The acidemic group included more high-risk pregnancies. There were three preterm neonates, born at 34 + 5, 36 + 1 and 36 + 5 weeks, all in the non-acidemic group.

Table 1. Summary of background data of acidemic neonates, pH <7.1 in umbilical cord artery or vein, after delivery with cesarean section, and in non-acidemic neonates with pH ≥7.2 in both cord vessels and Apgar scores ≥9 at five and ten minutes.

CTG tracings between 15 and 80 min were assessed and saved as anonymous PDF files. The short lower limit of 15 min was chosen because of the use of intermittent CTG during the first stage of labor. The graph scale used was one centimeter per minute. Clinical data were gathered in Excel®.

The median duration of the CTG tracings in the acidemic group was 70 min and six tracings (8%) were <30 min. In the non-acidemic group, the median duration was 40 min and 68 tracings (31%) were <30 min. The total range was 16 to 80 min in both groups.

In 20 (27%) of the acidemic neonates, scalp blood sampling had been performed within 30 min before the end of the CTG-tracing. Of those, 14 had lactate values above the cutoff for intervention (4.8 mmol/L) for the device Lactate Pro® used during the study period. The cutoff represented the 75th percentile in a study by Kruger et al. [Citation20].

The anonymized tracings were numbered and arranged by computer randomization. Each tracing was interpreted by three assessors (midwives and physicians) for whom the interpretation of CTG was part of their daily work at a labor ward. The only information available to the assessors was that the CTG tracings were from singleton pregnancies, from the first stage of labor and that all neonates were born with either a low or a normal cord blood pH.

The assessors completed forms with their interpretation of the different CTG parameters individually (Supplement 1 and 2).

The final classification according to each template, utilizing the interpretation of the included parameters made by the assessors, was then made by one of the authors (FE). The final classification was according to the majority. If the comprehensive CTG class differed between all three interpreters, a senior obstetrician made a fourth interpretation of the variables to receive a majority interpretation. Thus, this interpretation worked as a casting vote.

The assessors were blinded to group and other clinical variables, as were the authors until the analyses were complete. SWE-09 is a four-tier template. The fourth classification is preterminal, a loss of variability. In the study, pathological and preterminal were merged to allow statistical comparisons with the two three-tier templates. For the final analysis, a CTG classified as pathological was considered a positive test, whereas a normal or suspicious CTG was considered a negative test.
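The majority rule and the reduction to a positive or negative test can be illustrated as follows. This is a simplified sketch under stated assumptions: it votes directly on the final CTG class, whereas in the study the classes were derived from the assessors' ratings of the individual variables; the function names and class labels are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence

# Classes counted as a positive test; "preterminal" exists only in SWE-09
# and is merged with "pathological" as described in the text.
POSITIVE_CLASSES = {"pathological", "preterminal"}


def majority_class(assessor_classes: Sequence[str],
                   casting_vote: Optional[str] = None) -> str:
    """Return the class chosen by at least two of three assessors.

    If all three differ, fall back on a fourth interpretation
    (the casting vote of a senior obstetrician).
    """
    if len(assessor_classes) != 3:
        raise ValueError("expected exactly three assessor classifications")
    label, votes = Counter(assessor_classes).most_common(1)[0]
    if votes >= 2:
        return label
    if casting_vote is None:
        raise ValueError("no majority: a casting vote is required")
    return casting_vote


def is_positive_test(final_class: str) -> bool:
    """Pathological (or preterminal) is positive; normal or suspicious is negative."""
    return final_class in POSITIVE_CLASSES
```

For example, `majority_class(["normal", "suspicious", "suspicious"])` yields `"suspicious"`, which `is_positive_test` treats as a negative test.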

Statistical analyses

StatView® computer software (SAS Institute, version 5.0.1; Cary, NC) was used to gather and analyze the data. Sensitivity and specificity with 95% confidence intervals (CI) were calculated using www.sample-size.net/confidence-interval-proportion, provided by the University of California, San Francisco. A two-sided McNemar’s test was used to determine the statistical significance of differences in classification of the CTG tracings with the different templates. A p-value <.05 was considered statistically significant.
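As an illustration of the sensitivity and specificity calculation, the sketch below uses only the Python standard library. Two assumptions should be noted: the Wilson score interval is used here as one common choice, since the text does not state which interval method the online calculator applied, and the example counts (69 of 73 and 197 of 219) are back-calculated from the reported SWE-09 percentages rather than taken from the tables.

```python
import math
from typing import Dict, Tuple


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score confidence interval for a proportion (default: 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Dict[str, Tuple[float, float, float]]:
    """Return (estimate, lower, upper) for sensitivity and specificity."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, *wilson_interval(tp, tp + fn)),
        "specificity": (spec, *wilson_interval(tn, tn + fp)),
    }


# Assumed counts, back-calculated from the reported SWE-09 percentages.
result = sensitivity_specificity(tp=69, fn=4, tn=197, fp=22)
for name, (estimate, lower, upper) in result.items():
    print(f"{name}: {estimate:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

An exact (Clopper-Pearson) interval would give slightly wider limits; the choice matters most for small groups such as the 73 acidemic neonates.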

For agreement, a free-marginal kappa index for multiple raters was calculated according to J. Randolph, using an online calculator (www.justusrandolph.net/kappa). The kappa values were classified according to McHugh [Citation21].
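The free-marginal multirater kappa fixes chance agreement at 1/k for k categories instead of estimating it from the observed marginals. A minimal sketch of the calculation follows; it is not the code behind the online calculator, and the function name and input layout are assumptions for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def free_marginal_kappa(ratings: Sequence[Sequence[Hashable]], n_categories: int) -> float:
    """Free-marginal multirater kappa.

    ratings: one sequence per tracing, holding the category chosen by each rater.
    n_categories: number of categories available (2 for "pathological or not").
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")

    agreement_sum = 0.0
    for item in ratings:
        if len(item) != n_raters:
            raise ValueError("every tracing needs the same number of raters")
        counts = Counter(item)
        # Proportion of agreeing rater pairs for this tracing.
        agreement_sum += sum(c * (c - 1) for c in counts.values()) / (n_raters * (n_raters - 1))

    p_observed = agreement_sum / len(ratings)
    p_expected = 1.0 / n_categories  # chance agreement with free marginals
    return (p_observed - p_expected) / (1.0 - p_expected)
```

With three raters and two categories, a tracing on which one rater disagrees contributes an agreement of 1/3, so a handful of split tracings lowers kappa quickly.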

Ethical approval

The Regional Ethical Review Board in Lund gave ethical approval, Dnr 2016/371, 2016-05-24.

Results

The final classifications are summarized in Table 2, and the sensitivity and specificity for the three templates in Table 3. The classification pathological in SWE-09 had a significantly higher sensitivity for identifying acidotic neonates (94.5%) than both SWE-17 (76.7%; p = .0009) and FIGO-15 (71.2%; p = .0001). The specificity was significantly lower for SWE-09 (90.0%) than for the other two templates (96.9% for both; FIGO-15, p = .0007 and SWE-17, p = .0003).
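Because every tracing was classified with all three templates, comparisons between templates are paired, and McNemar's test uses only the discordant tracings. The sketch below shows a continuity-corrected chi-square form of the test; this is one common variant, and the text does not state which variant was used. The discordant counts in the example are hypothetical, since the paired cross-tabulations are not reproduced here.

```python
import math
from typing import Tuple


def mcnemar_corrected(b: int, c: int) -> Tuple[float, float]:
    """Two-sided McNemar's test with continuity correction.

    b: tracings positive with template A but negative with template B.
    c: tracings negative with template A but positive with template B.
    Returns (chi-square statistic, p-value) on 1 degree of freedom.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    n = b + c
    if n == 0:
        return 0.0, 1.0
    chi2 = max(0, abs(b - c) - 1) ** 2 / n
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical example: 13 discordant tracings, all in the same direction.
chi2, p = mcnemar_corrected(13, 0)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With few discordant pairs, an exact binomial version of the test is often preferred over the chi-square approximation.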

Table 2. Classification of 292 cardiotocograms (CTG) according to three different classification templates (see footnote for explanation).

Table 3. Comparison of sensitivity and specificity for the different templates to identify neonatal acidosis in the first stage of labor.

For tracings from acidemic neonates, agreement between the three assessors was strong for SWE-09 (κ 0.85), and weak with the new templates, FIGO-15 (κ 0.47) and SWE-17 (κ 0.51). For the non-acidemic group the agreement was almost perfect for FIGO-15 (κ 0.91), strong with SWE-17 (κ 0.90), and moderate for SWE-09 (κ 0.78) (Table 4).

Table 4. Comparison of inter-observer agreement in classifications of CTG patterns between three independent interpreters using the same classification chart.

The most common patterns among acidemic neonates that were classified as pathological with SWE-09 but not with the two other templates were combinations of: complicated variable decelerations (n = 13), combined decelerations (n = 9), uniform late decelerations (n = 5), tachycardia (n = 9), decreased variability (n = 6) and lack of accelerations (n = 10).

In the acidemic group, the fraction of tracings classified as normal was higher with SWE-17 (9.6%) than with SWE-09 (0%; p = .01) and FIGO-15 (1.4%; p = .06).

In the group of non-acidemic neonates a higher fraction of the traces was classified as pathological with SWE-09 (10.0%) than with the SWE-17 and FIGO-15 (3.2%; p < .01 for both). The most common pattern only classified as pathological with SWE-09 was complicated variable decelerations (n = 11).

Discussion

For safe intrapartum care, an interpretation template for fetal monitoring should identify a high rate of neonates with acidosis, especially during the first stage of labor. In our study, SWE-09 demonstrated the highest sensitivity (95%) for the classification pathological. FIGO-15 and SWE-17 showed lower sensitivity to detect acidotic neonates (71%, p = .0001 and 77%, p = .0009 respectively) and therefore, do not provide safe guidance if classification pathological is the only class used to indicate fetuses at risk for whom intervention should be considered.

The higher rate of patterns classified as normal in acidotic cases (9.6%) is also a matter of concern for SWE-17, since a normal classification can lead to termination of the CTG recording, leaving a fetus already exposed to oxygen deficit without continuous surveillance.

When CTG is used as a screening method, with an option of fetal scalp blood sampling as a secondary test if the pattern is pathological, the specificity for the template is not as important as the sensitivity, since the secondary test will increase specificity. Moreover, since the purpose of intrapartum fetal monitoring is to prevent rather than to predict fetal acidosis, a specificity for fetal acidosis close to 100% may not be achievable. We consider that the specificity of 90% for pathological patterns with SWE-09 may indicate a clinically useful level.

The new templates had higher interobserver agreement in tracings from non-acidemic neonates, but lower interobserver agreement for acidemic neonates. This is in line with the results for sensitivity and specificity, and we speculate that SWE-09 provides better tools to identify abnormality, whereas FIGO-15 and SWE-17 may include better tools to identify normality.

The probability of identifying an abnormal CTG pattern is likely lower in a short than in a long CTG tracing. The shorter median duration of CTG tracings for non-acidemic neonates than for acidemic neonates might have led to higher specificity than if the tracings had been of identical duration. This, however, would affect all templates and not the comparison between the templates.

In the group of acidemic neonates a time gap of up to 30 min between the end of the CTG tracing and delivery was allowed. It is possible that some acidemic neonates might have become acidotic during this gap, which would have decreased sensitivity for all templates.

Three previous studies comparing FIGO-15 with older templates have been published. Olofsson et al. compared FIGO-15, SWE-17 and the STAN interpretation algorithm, which is similar to SWE-09, and found discrepancies in the classification of CTG with the three templates [Citation22] and that FIGO-15 has a lower sensitivity than the STAN interpretation algorithm [Citation23]. Marti Gamboa et al. also found a low sensitivity for FIGO-15, similar to the five-tier system by Parer and Ikeda [Citation24]. Our study confirms the low sensitivity for FIGO-15, and is to our knowledge the first to compare the new FIGO template with other templates specifically during the first stage of labor.

The results of the present study and the three previous studies support retaining previous templates based on the FIGO guidelines from 1987 [Citation11] until solid evidence for new guidelines has been presented.

A national change in Sweden during 2017 to a template with a lower sensitivity might be reflected in national data. The annual report from the Swedish Pregnancy Register reported an increase in the rate of 5-min Apgar scores <7, from 1.0% in 2016 to 1.2% in 2019 [Citation25]. The Swedish Pregnancy Register covers over 90% of Swedish deliveries and has been validated [Citation26].

Due to the worrying increase in low Apgar scores, we retrieved data from the Swedish Pregnancy Register regarding 5-min Apgar scores in term live births, elective cesarean deliveries excluded, for 2014–2016 (n = 264,181), and 2018–2019 (n = 180,120). In these cohorts, the rate of 5-min Apgar scores <4 increased from 0.18% during 2014–2016 to 0.25% during 2018–2019 (p < .0001). Whether this has to do with the change of classification template or with other factors is unknown and must be further studied. The rate of emergency cesarean sections among live term births, elective cesareans excluded, was similar during 2014–2016 (9.1%), and 2018–2019 (8.9%; p = .06), whereas instrumental deliveries decreased from 6.4% to 5.5% (p < .0001).
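A comparison of two register rates such as the one above can be sketched as a pooled two-proportion z-test. This is an illustration only: the text does not say which test produced the reported p-value, and the event counts used below (476 and 450) are approximations back-calculated from the rounded percentages, not register data.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Two-sided pooled z-test for a difference between two proportions.

    Returns (z statistic, p-value); z is positive when the second rate is higher.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Approximate counts: 0.18% of 264,181 and 0.25% of 180,120 (assumed, rounded).
z, p = two_proportion_z_test(476, 264_181, 450, 180_120)
print(f"z = {z:.2f}, p = {p:.2e}")
```

With cohorts of this size, even a small absolute difference in rates gives a very small p-value, which is why the text cautions that the cause of the increase remains unknown.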

A recent study showed a potential benefit of including clinical risk factors into a CTG interpretation template [Citation27]. In our study the midwives and physicians classifying CTG were blinded to outcome to eliminate the risk of ascertainment bias [Citation28,Citation29]. Many clinical factors other than the CTG pattern are important for decision making, as well as a physiological understanding of the fetal heart rate changes [Citation30,Citation31], but this was not the focus of our study.

Of the three studied templates, SWE-09 is the most demanding for the classification normal, with an upper normal baseline heart rate limit of 150 bpm, a requirement of accelerations, and limited acceptance of late or complicated variable decelerations. This may explain the superior sensitivity for acidemia. SWE-17 and FIGO-15 have higher demands for the classification pathological, requiring a minimum duration of repetitive late and complicated variable decelerations, which may explain their higher specificity. Further studies of the separate variables included in the templates are planned. We will continue our research with the goal of optimizing a template to achieve the highest possible validity in diagnosing fetal hypoxia with CTG.

Conclusion

For FIGO-15 and SWE-17 the sensitivity for pathological patterns to identify neonates with fetal acidosis is considered insufficient. SWE-09 is considered the most discriminatory template during the first stage of labor, combining a high sensitivity and a high specificity for pathological patterns. With the current knowledge, we would recommend adhering to previous guidelines emanating from FIGO guidelines from 1987, until any adjustments have been scientifically evaluated.

Supplemental material

Suplement_SWE09_gs.JPG


Suplement_1_FIGO_gs.JPG


Acknowledgements

We gratefully acknowledge all midwives and physicians who classified all these CTG tracings.

Disclosure statement

The authors declare no competing financial interests that have influenced the work reported in this paper. Andreas Herbst has contributed to the development of the Swedish classification systems from 2009 and 2017.

Additional information

Funding

This work was supported by research grants from Region Skåne and funding from LÖF, the national Swedish patient insurance company. The funders played no role in planning or conducting the research or writing of the paper.

References