Human Fertility
an international, multidisciplinary journal dedicated to furthering research and promoting good practice
Volume 26, 2023 - Issue 6
Review Articles

Predictive models for starting dose of gonadotropin in controlled ovarian hyperstimulation: review and progress update

Pages 1609-1616 | Received 03 Aug 2023, Accepted 13 Nov 2023, Published online: 01 Dec 2023

Abstract

Controlled ovarian hyperstimulation (COH) is an essential step in in vitro fertilization-embryo transfer (IVF-ET) and an important aspect of assisted reproductive technology (ART). The individualized starting dose of gonadotropin (Gn) is a critical decision in COH. It has a crucial impact on the number of retrieved oocytes, the cancellation rate of ART cycles, complications such as ovarian hyperstimulation syndrome (OHSS), and pregnancy outcomes. How to help clinical teams determine the starting dose of Gn in a more standardized and accurate way is an important issue in reproductive medicine. Over the past 20 years, research teams worldwide have explored prediction models for the Gn starting dose. With the integration of artificial intelligence (AI) and deep learning, it is hoped that more suitable predictive models for the Gn starting dose will become available in the future.

Introduction

With the development of assisted reproductive technology (ART), an increasing number of infertile couples benefit from it. Previous studies have shown that the clinical pregnancy rate (CPR) increases with the number of retrieved oocytes (Fanton et al., 2023; Jia et al., 2022; Kim, 2023), but retrieving too many oocytes also increases the risk of ovarian hyperstimulation syndrome (OHSS) and the cost of treatment. To obtain an ideal number of oocytes, personalized ovarian stimulation protocols are needed, which require an appropriate starting dose and duration to extend the effect window of gonadotropin (Gn) and to promote the development of more follicles during controlled ovarian hyperstimulation (COH). However, ovarian response is often unpredictable. Until now, individualized stimulation protocols, including drug combinations, dosages, and adjuvant drugs, have relied mainly on the experience of doctors rather than on scientific evidence. Several studies have formulated predictive models for the starting dose of Gn using diverse indicators. We aim to review this literature comprehensively and to assess the current predictive models.

Development of predictive models for the starting dose of Gn

Discrimination is a measure of the diagnostic ability of a model. The receiver operating characteristic (ROC) curve is created by plotting the true positive rate (sensitivity) on the y-axis against the false positive rate (1-specificity) on the x-axis. In logistic regression models, the area under the ROC curve (AUC) is equivalent to the concordance index (C-index), which is the most commonly used indicator for evaluating discrimination. To assess a model’s predictive ability, the C-index is used to measure the association between observed and predicted doses according to the pre-defined predictive factors. For example, a C-index of 59.5% means that the model orders a randomly chosen pair of patients correctly about 60% of the time, where 50% corresponds to chance. Calibration is typically assessed with a calibration curve, which reflects the degree of agreement between the predicted probability of an individual experiencing a specific outcome event and the actual probability. In essence, the observed and predicted occurrence rates are plotted against each other on a scatter plot. The closer the calibration curve lies to the reference line, the better the calibration and the more accurate the predictive model. External validation further substantiates the stability and credibility of the predictive model. However, if the model is overfitted to the original data, it may perform poorly in other populations and require adjustment to improve its performance.
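As a rough illustration of what the C-index measures, the sketch below counts concordant pairs for a continuous outcome. The function name, the tie handling (pairs tied on the observed value are skipped, tied predictions count as half), and the data are all illustrative assumptions, not taken from any of the reviewed studies.

```python
from itertools import combinations


def concordance_index(observed, predicted):
    """Share of comparable pairs that the predictions order the same way
    as the observations. Pairs tied on the observed value are skipped;
    pairs tied on the predicted value count as 0.5 (an assumed convention)."""
    concordant = 0.0
    comparable = 0
    for (o1, p1), (o2, p2) in combinations(zip(observed, predicted), 2):
        if o1 == o2:
            continue  # not comparable: the observed outcomes are tied
        comparable += 1
        if p1 == p2:
            concordant += 0.5
        elif (o1 < o2) == (p1 < p2):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical observed and predicted oocyte counts for five patients.
observed_oocytes = [6, 9, 12, 15, 8]
predicted_oocytes = [7, 8, 13, 11, 10]
c_index = concordance_index(observed_oocytes, predicted_oocytes)
print(f"C-index: {c_index:.2f}")
```

A value of 0.5 would mean the predictions rank pairs no better than chance, and 1.0 would mean every comparable pair is ranked correctly.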

CONSORT formula

Many studies have attempted to develop scoring systems or mathematical models using parameters to predict ovarian response, thereby calculating the recommended initial Gn dose.

Howles et al. (2006) first proposed a predictive model to determine the starting dose of Gn. It included age, body mass index (BMI), basal follicle stimulating hormone (FSH), and antral follicle count (AFC) as explanatory variables and used the number of oocytes retrieved as the primary outcome, with variables selected by backward stepwise multiple regression. The study drew on a large multicentre sample of 1,378 infertile patients under 35 years old who obtained 8-14 oocytes. Using four predefined factors (serum FSH level, BMI, age, and number of follicles <11 mm), the model achieved a C-index of 59.5%. The authors developed an r-hFSH starting dose calculator based on this final treatment model, which was subsequently used in a prospective trial to predict the optimal Gn starting dose for achieving desirable ART outcomes. This model is also known as the classic CONSORT (consistency in r-FSH starting doses for individualized treatment) formula. However, because of its limited goodness-of-fit, its restriction to a single (long agonist) protocol, and the lack of guidance on adjusting the Gn dose during ovarian stimulation, the CONSORT formula has not been widely applied (Howles et al., 2006). In 2011, a prospective clinical study enrolled 161 infertile women aged 18-34 whose Gn dose was individualized using the CONSORT formula. The recommended initial Gn dose was 75-300 IU, 6-24 oocytes were obtained, and the dose was reduced when a risk of OHSS was predicted during ovarian stimulation. The incidence of OHSS was 1.24% (Olivennes et al., 2011).

Table 1. Characteristics of predictive models for starting dose of gonadotropin.

Yan and Shen (2017) introduced anti-Müllerian hormone (AMH) into the formula (the modified CONSORT formula). They then replaced the initial FSH dose with the average FSH dose (the corrected CONSORT formula). Finally, they combined both changes, adding the AMH variable and replacing the initial FSH dose with the average FSH dose (the corrected & modified CONSORT formula). The classical, modified, corrected, and corrected & modified CONSORT formulas were each obtained by multiple linear regression. For each formula, the theoretical number of oocytes retrieved was calculated by substituting the variables, and the relationship between the theoretical and actual numbers of oocytes retrieved was analyzed to assess and verify the merits and weaknesses of each formula. The modified CONSORT formula (R²=0.694, P < 0.0001, ∑(S-Y)²=15.42) outperformed the classical CONSORT formula (R²=0.693, P < 0.0001, ∑(S-Y)²=17.05). The larger the R² (and the smaller the residual variance), the better the fit of the predictive formula. The corrected CONSORT formula (R²=0.696, P < 0.0001, ∑(S-Y)²=13.91) was also more accurate than the classical CONSORT formula. Notably, the corrected & modified CONSORT formula (R²=0.698, P ≈ 0, ∑(S-Y)²=10.87) performed best of the four. The authors concluded that this algorithm can predict ovarian response and patient treatment outcomes accurately. However, the study is limited by its small sample size of 130 cases. In addition, the average dose has less clinical reference value than the starting dose of FSH (Yan & Shen, 2017).
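The two fit statistics quoted above can be sketched as follows. This is a minimal illustration that assumes ∑(S-Y)² denotes the sum of squared differences between theoretical (S) and actual (Y) oocyte counts and that R² is computed as 1 minus the ratio of residual to total sum of squares; the original study may have computed them differently, and the data here are invented.

```python
def fit_metrics(actual, theoretical):
    """Return (R^2, sum of squared differences) for paired values.
    R^2 is taken as 1 - SS_res / SS_tot, an assumed definition."""
    if len(actual) != len(theoretical):
        raise ValueError("inputs must have the same length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((s - y) ** 2 for s, y in zip(theoretical, actual))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot, ss_res


# Hypothetical actual and formula-predicted oocyte counts.
actual_oocytes = [8, 10, 12, 14]
theoretical_oocytes = [9, 10, 11, 14]
r_squared, sum_sq_diff = fit_metrics(actual_oocytes, theoretical_oocytes)
print(r_squared, sum_sq_diff)
```

Comparing formulas then amounts to preferring the one with the larger R² and the smaller sum of squared differences on the same patients.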

Linear regression-based computational model

To translate the predictive model into clinical practice, La Marca et al. (2013) enrolled 505 infertile women under 40 years old with regular menstruation and built a nomogram that uses age, AFC, and basal FSH to determine the recommended initial Gn dose, with variables selected by backward stepwise multiple regression. Age, serum FSH level, and AFC were found to be significant predictors of ovarian response. The nomogram was constructed by fixing the number of oocytes retrieved at 9, which narrowed the distribution of ovarian response around this value. Although this reduced the model R², it meant that fewer women would show an inappropriate response (i.e. a poor or excessive response). Because anti-Müllerian hormone (AMH) and AFC are strongly correlated, a second, simplified model based solely on age and AFC was developed; it does not depend on serum markers and may be of practical value to clinicians. However, the two-variable model was not as accurate as the model that combined age, FSH level, and AFC (La Marca et al., 2013).

Moon et al. (2016) included 141 infertile women with regular menstrual cycles who received antagonist or long protocols. They used a generalized linear model with a Poisson distribution and a logarithmic link function to predict the number of oocytes retrieved. Univariate and multivariate analyses showed that age, basal serum FSH level, AMH level, and AFC were significantly related to the number of oocytes retrieved. By combining these indicators of ovarian reserve, four models for predicting the number of oocytes retrieved were generated: model (1) 3.21 − 0.036 × (age) + 0.089 × (AMH); model (2) 3.422 − 0.03 × (age) − 0.049 × (FSH) + 0.08 × (AMH); model (3) 2.32 − 0.017 × (age) + 0.039 × (AMH) + 0.03 × (AFC); and model (4) 2.584 − 0.015 × (age) − 0.035 × (FSH) + 0.038 × (AMH) + 0.026 × (AFC). The four models were evaluated in three respects: goodness-of-fit, discrimination, and cross-validation. Model (4) had the best fit, followed by model (3), and the Akaike information criterion (AIC) and Bayesian information criterion (BIC) values of the two models were very close. In terms of discrimination, model (3) had the highest Spearman coefficient, followed by model (4), and the cross-validation results followed the same order. On the calibration plot, the observed number of oocytes in models (3) and (4) matched the predicted number well, which makes these models clinically useful. Model (4) showed no advantage over model (3) in model evaluation, which is consistent with clinical experience: because FSH fluctuates widely between IVF cycles in the same patient, its clinical role is limited. However, this study has some limitations, including a small sample size of only 141 cases and, more importantly, a heterogeneous sample, with a mean age of 36 years (range 26-49 years), a mean AMH of 3.30 ± 3.28 (range 0.08-17.25), and a mean FSH of 8.8 ± 4.0 (range 2.9-26.6) (Moon et al., 2016).
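Because the four expressions above come from a Poisson model with a logarithmic link, one plausible reading (an assumption here, since the text does not state it explicitly) is that each expression gives the natural logarithm of the expected oocyte count. Under that assumption, model (4) could be evaluated as in the sketch below. The function name is hypothetical, the inputs are taken to be in the units used in the original study, and the AFC value in the example is invented.

```python
import math


def predict_oocytes_model4(age, fsh, amh, afc):
    """Expected oocyte count from model (4), assuming the printed
    expression is the linear predictor on the log scale."""
    linear_predictor = (
        2.584 - 0.015 * age - 0.035 * fsh + 0.038 * amh + 0.026 * afc
    )
    return math.exp(linear_predictor)  # invert the log link


# Cohort means quoted in the text for age, FSH and AMH; AFC = 10 is hypothetical.
expected_oocytes = predict_oocytes_model4(age=36, fsh=8.8, amh=3.30, afc=10)
print(f"Expected oocytes: {expected_oocytes:.1f}")
```

On this reading the coefficients act multiplicatively: each additional year of age scales the expected count by exp(−0.015), and each additional antral follicle scales it by exp(0.026).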

Zhu et al. (2019) established a multivariate linear regression model using age, AFC, and AMH as variables, based on the characteristics of the Chinese population. The study included 673 patients with normal ovarian response who obtained 4-14 oocytes and derived a formula for calculating the starting dose. The model was reported to predict ovarian sensitivity accurately: the ovarian response of 57.2% of patients matched the model's expectations. In this population, 46.88% of patients in the high FSH starting dose group used the step-down protocol, while 57.92% of patients in the low FSH starting dose group used the step-up protocol during treatment (Zhu et al., 2019). The predictive model was then tested in an independent population of 750 patients at another centre, and a comprehensive evaluation of the model was conducted. Among the three groups, the proportion of patients with ≥15 retrieved oocytes was highest in the high FSH starting dose group, at 64.40%. The proportion of patients with ≤7 oocytes was only 6.2%, the lowest of the three groups (Zhu et al., 2019).

Predictive model based on different algorithms

Letterie and MacDonald (2020) included 2,603 ART cycles using the long agonist and antagonist protocols. They built a computer algorithm to determine the starting dose of Gn based on age, diagnosis, AFC, and AMH. The study evaluated five predictive methods: classification and regression tree, random forest, support vector machine, logistic regression, and neural network. The researchers identified the tunable parameters of each method and adjusted them to obtain the best-performing predictive model. The tuned models were incorporated into the final algorithm, which was then trained on the complete training dataset. Finally, the trained model was validated on a held-out dataset to assess its decision-making, which was compared with the evidence-based decisions made by clinicians. These evidence-based decisions combined the clinical expertise of a committee of ten certified reproductive endocrinologists, each with ten to twenty years of experience, with the best available evidence from previous systematic research. The algorithm's accuracy for the four decision types was 0.92 for continuing or stopping treatment, 0.96 for triggering or cancelling the cycle, 0.82 for medication dosage, and 0.87 for days to follow-up. The algorithm was stable overall, although its initial Gn dose decisions were not sufficiently precise. Furthermore, its predominant approach to dosage decisions was to maintain a stable daily dose within a cycle, thereby minimizing the number of dose adjustments. This study marks the beginning of integrating AI, predictive analysis models, and clinical evidence-based conclusions to support clinical decision-making (Letterie & MacDonald, 2020).

Another study, published in 2022, applied the K-nearest neighbours (KNN) machine learning algorithm to 18,591 ovarian stimulation cycles from three centres. After five cross-validations, the value of K was set to 100. That is, before the starting dose is predicted for a new patient, the 100 patients whose baseline indicators (such as age, AMH, BMI, and AFC) are closest to those of the new patient are selected. The relationship between their starting dose of Gn and MII oocyte count is then plotted to establish the dose-response model for the new patient. Patients were classified as curvilinear or rectilinear according to the shape of this curve. For the curvilinear type, the goal was to obtain more MII oocytes without increasing the starting dose of Gn, that is, to find the optimal dose. For the rectilinear type, the goal was to obtain a comparable number of MII oocytes with the lowest starting dose of Gn. The model is relatively intuitive and can maximize the benefit for patients of both types, but every prediction requires a computation over the entire dataset, which is computationally intensive, and the study population has certain limitations (Fanton et al., 2022).
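The neighbour-selection step described above can be sketched as follows. This is an illustration only: the z-score scaling, the Euclidean distance, the dictionary field names, and the four-patient cohort are assumptions made for the example, and K is set to 2 here rather than the 100 used in the study.

```python
import math


def nearest_patients(new_patient, cohort, features, k):
    """Return the k cohort patients closest to new_patient on the given
    baseline features, using Euclidean distance on z-scored values."""
    scale = {}
    for f in features:
        values = [p[f] for p in cohort]
        mean = sum(values) / len(values)
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        scale[f] = sd or 1.0  # avoid dividing by zero for constant features

    def distance(patient):
        return math.sqrt(
            sum(((patient[f] - new_patient[f]) / scale[f]) ** 2 for f in features)
        )

    return sorted(cohort, key=distance)[:k]


# Hypothetical historical cycles: baseline indicators, starting dose (IU), MII count.
cohort = [
    {"age": 30, "amh": 3.0, "bmi": 22, "afc": 12, "dose": 150, "mii": 10},
    {"age": 31, "amh": 2.8, "bmi": 23, "afc": 11, "dose": 225, "mii": 11},
    {"age": 40, "amh": 0.8, "bmi": 27, "afc": 4, "dose": 300, "mii": 3},
    {"age": 38, "amh": 1.0, "bmi": 26, "afc": 5, "dose": 300, "mii": 4},
]
new_patient = {"age": 30, "amh": 2.9, "bmi": 22, "afc": 12}

neighbours = nearest_patients(new_patient, cohort, ["age", "amh", "bmi", "afc"], k=2)
# The (dose, MII) pairs of the neighbours are the raw material for the
# patient-specific dose-response curve.
dose_response_points = sorted((p["dose"], p["mii"]) for p in neighbours)
print(dose_response_points)
```

The sketch also shows why the approach is computationally heavy: every new prediction computes a distance to every patient in the cohort.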

Poisson model

Ebid et al. (2021) included 636 IVF cycles using long, short, and antagonist protocols. Modified Poisson, negative binomial, mixed Poisson-Emax, and linear models were used to analyze the relationship between the starting dose of Gn and ovarian stimulation outcomes based on age, AFC, FSH, and the type of ovarian stimulation protocol. The Poisson model with a log-link function showed excellent predictive performance and accuracy. The outcome measures were the number of retrieved oocytes and the ovarian response prediction index (ORPI), calculated as (AMH level × AFC)/age. In the proposed model, the Emax function, a nonlinear term, replaced the traditional exponential-linear function to account for the relatively flat dose-response relationship at higher FSH doses while still allowing discrete prediction of oocyte count. The modified Poisson model thus captured the nonlinear relationship between FSH dose and oocyte count and had the best fit and highest accuracy among the models compared. The study also found that Gn dose significantly affected oocyte count, and various dose-response models were obtained and analyzed across heterogeneous groups with different response patterns. However, the study was limited to IVF cases from a single centre (Ebid et al., 2021).
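The two quantities mentioned above can be illustrated with a short sketch. The ORPI formula is the one given in the text. The Emax curve uses the standard pharmacological form E0 + Emax × dose / (ED50 + dose), which is an assumption about the study's parameterization, and all parameter values are invented to show the plateau rather than taken from the paper.

```python
def orpi(amh, afc, age):
    """Ovarian response prediction index: (AMH level x AFC) / age."""
    return amh * afc / age


def emax_response(dose, e0, emax, ed50):
    """Standard Emax curve: the response rises with dose and flattens
    towards e0 + emax; ed50 is the dose giving half of emax."""
    return e0 + emax * dose / (ed50 + dose)


# Hypothetical parameters: no baseline response, plateau of 20 oocytes, ED50 of 150 IU.
doses = [150, 300, 450]
expected_counts = [emax_response(d, e0=0.0, emax=20.0, ed50=150.0) for d in doses]
gains = [b - a for a, b in zip(expected_counts, expected_counts[1:])]
print(expected_counts, gains)
print(orpi(amh=3.0, afc=12, age=30))
```

Each further 150 IU adds fewer expected oocytes than the previous step, which is the flattening at higher FSH doses that a purely exponential-linear term cannot reproduce.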

Oocyte sensitivity index model

Kobanawa (2023) recently proposed using the oocyte sensitivity index (OSI) as a dynamic indicator of ovarian response, defined as the number of oocytes that can be obtained for a given starting dose of Gn. OSI is positively correlated with the number of retrieved oocytes and negatively correlated with the starting dose of Gn and the duration of COH. The study included 100 IVF cycles from 100 individual patients, and age, AMH, and basal FSH were entered as candidate variables in a stepwise multiple regression on OSI. Age and AMH were retained as predictive variables, and an algorithm was built on the resulting multiple regression equation. Given the target number of retrieved oocytes, the number of days of Gn stimulation, AMH, and age, the algorithm returns a recommended starting dose. The accuracy of the model was verified to be above 70%, achieving personalized prediction of the starting dose for the Japanese population. However, the study has two limitations: some patients were given HMG, whose FSH activity may vary between drug lots, and the impact of different triggering methods, such as hCG and GnRH agonists, on oocyte maturation was not considered (Kobanawa, 2023).

Limitations in the current models

The regression models in current research have insufficient goodness-of-fit, with a C-index ≤70% in most studies. Studies have used different patient enrolment criteria and different primary outcomes: most used the number of retrieved oocytes, while some used the MII oocyte rate, OSI, or other measures. No study has considered the impact of different triggering methods on oocyte quality, and none has performed a subgroup analysis of FSH preparations from different sources, including urine-derived, recombinant, and high-purity FSH, even though different stimulating drugs and triggering medications may significantly affect the primary outcomes. Most studies did not consider the effect of Gn dose adjustment during COH on treatment outcomes. In addition, the explanatory variables included in current research, such as age, AMH, AFC, BMI, and basal FSH, are relatively limited, and there may be other variables affecting the Gn starting dose that have not yet been analyzed or discovered.

Suggestions for future research

About analysis methods

From a statistical perspective, regression analysis places strict requirements on the data: the residuals should follow a normal distribution, there should be no multicollinearity among the independent variables, and the samples should be independent of one another. Only when these conditions are met can the results of a regression analysis be considered valid and stable. Extending linear regression with curve equations, segmented regression, or spline regression may increase the goodness-of-fit. R² is the most convenient and straightforward indicator of goodness-of-fit in a regression model: the larger the R², the better the fit. However, R² has an obvious disadvantage. Even if an independent variable has no significant correlation with the dependent variable, R² will not decrease, and will usually increase, when more independent variables are added to the regression model. To overcome this defect, the adjusted R² was introduced, which accounts for the sample size and the number of independent variables and prevents R² from approaching 1 simply because more independent variables are added. Outliers in the sample can also seriously affect the goodness-of-fit of the regression model.
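The adjustment described above follows the usual formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the sample size and p the number of independent variables. The sketch below applies it to the R² of 0.698 and the sample size of 130 quoted earlier for the corrected & modified CONSORT formula; the predictor count of 5 is a hypothetical value chosen for illustration, since the number of predictors is not stated here.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# R^2 and n from the text; n_predictors=5 is an assumed, illustrative value.
adj = adjusted_r2(r2=0.698, n_samples=130, n_predictors=5)
print(f"adjusted R^2: {adj:.3f}")
```

The penalty grows as predictors are added relative to the sample size, so a small gain in raw R² from an extra variable can leave the adjusted value unchanged or lower.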

Cross-validation is a commonly used method in machine learning to check how well a model fits. The sample data are randomly divided into a training set and a validation set; the training set is used to estimate the model, and the validation set is used to check its goodness-of-fit. If the sample size is small, splitting it leaves few samples in each set, which increases the model error. In this case, the ‘leave-one-out’ method can be used: one observation at a time is removed from the sample, and the remaining observations are used to estimate the model. The estimated model is then used to calculate the predicted value for the removed observation, which is compared with its true value to obtain a difference. Each observation is treated in turn in this way, so the number of regression analyses equals the number of observations. Compared with a random split into training and validation sets, the ‘leave-one-out’ method makes maximal use of the sample, yields a nearly unbiased estimate of the prediction error, and gives a more reliable assessment of the goodness-of-fit of the regression model.
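The leave-one-out procedure described above can be sketched with a single-predictor least-squares line. The choice of a one-variable model, the function names, and the AMH and oocyte values are illustrative assumptions; the reviewed studies used multivariable models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor values must not all be equal")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_errors(xs, ys):
    """Refit the line once per observation, each time predicting the
    observation that was left out; return observed minus predicted."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(ys[i] - (slope * xs[i] + intercept))
    return errors


# Hypothetical AMH values and retrieved oocyte counts for six patients.
amh = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
oocytes = [5.0, 8.0, 9.0, 12.0, 12.0, 15.0]
errors = leave_one_out_errors(amh, oocytes)
mean_squared_error = sum(e ** 2 for e in errors) / len(errors)
print(mean_squared_error)
```

The number of fits equals the number of observations, as the paragraph notes, and the mean squared left-out error serves as the estimate of out-of-sample prediction error.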

About the research contents

In future studies, the research content can be optimized in three respects. Firstly, studies differ in the range of retrieved oocytes used to define a normal response: some use 8-14, while others use 4-14. An expert consensus from China suggests that the most suitable number of oocytes for Chinese patients with normal ovarian response in COH during ART cycles is 5-15. In this range, the oocyte maturation rate is high, oocyte quality is good, and a better IVF outcome can be achieved. This consensus is mainly based on a comprehensive evaluation of patients' ovarian responsiveness according to their age, ovarian response, and any ovarian hypo-response or hyper-response in previous ART cycles. Secondly, focusing only on the number of retrieved oocytes cannot fully reflect the clinical efficacy of ovulation induction drugs and their dosage. If the treatment goal is to obtain good embryos and a clinical pregnancy, MII oocyte number, MII oocyte rate, or follicle output rate may be more meaningful outcome indicators. Thirdly, an adjustment of the Gn dose during COH, especially within the first 3-5 days after initiation, may indicate that follicle sensitivity to the starting dose has not reached the expected level. Including such cases may affect the outcome indicators and the reliability of the predictive model.

About the hidden variables

Among the variables that affect the starting dose of Gn, there may be some that have not yet been included in the analysis or discovered, known as ‘hidden variables’. Examples include the endometriosis fertility index, a history of pelvic surgery or chronic pelvic inflammatory disease, and polymorphisms in genes related to abnormal follicle development. All of these may affect the starting dose of Gn but have not been included in the predictive models.

Summary and prospect

Accurate and individualized starting doses of Gn can reduce fluctuation in the number of retrieved oocytes, ensure an appropriate ovarian response, increase the number of oocytes obtained, reduce the incidence of OHSS in patients with ovarian hyper-response, reduce cycle cancellation rates, and improve clinical pregnancy outcomes. However, determining the starting dose of Gn is complex. So far, individualized doses have depended mainly on the personal clinical experience of the physician. An independent and objective predictive tool for the starting dose of Gn would therefore be of great reference value to clinicians. Since 2006, numerous research teams have explored predictive models for the starting dose of Gn and have obtained positive results in subsequent validation studies. In recent years, deep learning, an important branch of AI in the medical field, has been widely used in automated diagnosis based on clinical imaging and in large-scale screening. AI predictive models have begun to be applied to a variety of diseases and are gradually being translated or integrated into real-world clinical practice. In the field of ART, how to apply and deploy AI in real clinical scenarios is becoming the next research hotspot. AI-related predictive analytics and machine learning may surpass traditional clinical decision-making methods and may play an important role in IVF patient management.

Authors’ contributions

LH, YW and XG determined the topic; LH and XG wrote the manuscript; HZ, KL, YP and XZ retrieved the literature; HZ, HX, BZ and PD analyzed the data; YW and LH revised the manuscript. All authors contributed to the article and approved the submitted version.

Disclosure statement

The authors declared no conflicts of interest with respect to the research, authorship and publication of this article.

Additional information

Funding

This work was supported by the Natural Science Foundation of Shandong Province (No. ZR2023MH222); the Scientific Research Fund of Binzhou Medical University (BY2021KYQD34); and the Scientific Research Initiation Fund of Binzhou Medical University Hospital (2021-03).

References

  • Ebid, A., Motaleb, S. M. A., Mostafa, M. I., & Soliman, M. M. A. (2021). Novel nomogram-based integrated gonadotropin therapy individualization in in vitro fertilization/intracytoplasmic sperm injection: A modeling approach. Clinical and Experimental Reproductive Medicine, 48(2), 163–173. https://doi.org/10.5653/cerm.2020.03909
  • Fanton, M., Cho, J. H., Baker, V. L., & Loewke, K. (2023). A higher number of oocytes retrieved is associated with an increase in fertilized oocytes, blastocysts, and cumulative live birth rates. Fertility and Sterility, 119(5), 762–769. https://doi.org/10.1016/j.fertnstert.2023.01.001
  • Fanton, M., Nutting, V., Rothman, A., Maeder-York, P., Hariton, E., Barash, O., Weckstein, L., Sakkas, D., Copperman, A. B., & Loewke, K. (2022). An interpretable machine learning model for individualized gonadotrophin starting dose selection during ovarian stimulation. Reproductive Biomedicine Online, 45(6), 1152–1159. https://doi.org/10.1016/j.rbmo.2022.07.010
  • Howles, C. M., Saunders, H., Alam, V., & Engrand, P.; FSH Treatment Guidelines Clinical Panel. (2006). Predictive factors and a corresponding treatment algorithm for controlled ovarian stimulation in patients treated with recombinant human follicle stimulating hormone (follitropin alfa) during assisted reproduction technology (ART) procedures. An analysis of 1378 patients. Current Medical Research and Opinion, 22(5), 907–918. https://doi.org/10.1185/030079906X104678
  • Jia, R., Liu, Y., Jiang, R., Zhu, X., Zhou, L., Chen, P., Cao, M., & Zhao, Z. (2022). The optimal number of oocytes retrieved from PCOS patients receiving IVF to obtain associated with maximum cumulative live birth rate and live birth after fresh embryo transfer. Frontiers in Endocrinology, 13, 878214. https://doi.org/10.3389/fendo.2022.878214
  • Kim, H. H. (2023). More is better: oocyte number and cumulative live birth rate. Fertility and Sterility, 119(5), 770–771. https://doi.org/10.1016/j.fertnstert.2023.03.027
  • Kobanawa, M. (2023). The gonadotropins starting dose calculator, which can be adjusted the target number of oocytes and stimulation duration days to achieve individualized controlled ovarian stimulation in Japanese patients. Reproductive Medicine and Biology, 22(1), e12499. https://doi.org/10.1002/rmb2.12499
  • La Marca, A., Grisendi, V., Giulini, S., Argento, C., Tirelli, A., Dondi, G., Papaleo, E., & Volpe, A. (2013). Individualization of the FSH starting dose in IVF/ICSI cycles using the antral follicle count. Journal of Ovarian Research, 6(1), 11. https://doi.org/10.1186/1757-2215-6-11
  • Letterie, G., & MacDonald, A. (2020). Artificial intelligence in in vitro fertilization: a computer decision support system for day-to-day management of ovarian stimulation during in vitro fertilization. Fertility and Sterility, 114(5), 1026–1031. https://doi.org/10.1016/j.fertnstert.2020.06.006
  • Moon, K. Y., Kim, H., Lee, J. Y., Lee, J. R., Jee, B. C., Suh, C. S., Kim, K. C., Lee, W. D., Lim, J. H., & Kim, S. H. (2016). Nomogram to predict the number of oocytes retrieved in controlled ovarian stimulation. Clinical and Experimental Reproductive Medicine, 43(2), 112–118. https://doi.org/10.5653/cerm.2016.43.2.112
  • Olivennes, F., Howles, C. M., Borini, A., Germond, M., Trew, G., Wikland, M., Zegers-Hochschild, F., Saunders, H., & Alam, V. (2011). Individualizing FSH dose for assisted reproduction using a novel algorithm: the CONSORT study. Reproductive BioMedicine Online, 22(Suppl 1), S73–S82. https://doi.org/10.1016/S1472-6483(11)60012-6
  • Yan, L., & Shen, H. (2017). Clinical value of determining serum AMH in GnRH agonist long protocol– a modified CONSORT algorithm for individualized dosing of FSH. Journal of Reproductive Medicine (China), 26(7), 634–639. https://d.wanfangdata.com.cn/periodical/szyxzz201707004
  • Zhu, M., Wang, S., Yi, S., Huang, X., Meng, J., Chen, L., Sun, H., & Zhou, J. (2019). A predictive formula for selecting individual FSH starting dose based on ovarian reserve markers in IVF/ICSI cycles. Archives of Gynecology and Obstetrics, 300(2), 441–446. https://doi.org/10.1007/s00404-019-05156-2