Sir,

Thomsen et al. (2018) recently published a prospective, observational, multicentre study evaluating the prognostic value of early and mid-luteal serum progesterone concentration for the achievement of live birth after fresh embryo transfer. Based on data from 602 IVF cycles, the authors conclude that ‘low as well as high serum P4 levels in the early and mid-luteal phase reduce the chance of a live birth following IVF treatment with fresh embryo transfer’. This is an important finding that could potentially change patient management. However, after carefully reading the article, we have strong reservations about the validity of this conclusion.

Firstly, the authors have tried to identify specific cut-offs or optimal ranges of serum progesterone for the prediction of live birth. They have done so in a data-dependent way, since they state that ‘progesterone groups were defined based on raw data of reproductive outcomes and luteal P4 levels during the early and mid-luteal phase separately’. However, the authors do not state whether a formal statistical routine was used to identify these cut-offs or whether they resulted from pure data-dredging. Data-dependent approaches are generally known to carry an inflated risk of a type I error, i.e. finding an effect in the sample while no real difference exists in the population (Altman, 1991; Altman et al., 1994). For that reason, such approaches are considered highly problematic and can, at most, be used to generate hypotheses that need to be validated in further studies in order to gain any clinical value.
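The inflation of the type I error described by Altman et al. (1994) can be illustrated with a minimal simulation sketch. All settings below are assumptions chosen for illustration and are not taken from Thomsen et al.: 600 simulated patients, a live birth rate of 30% that is independent of the marker, a lognormal marker distribution, a grid of 17 candidate cut-offs (10th to 90th percentile in steps of 5) and a simple two-group chi-square comparison at each cut-off. The function names are hypothetical.

```python
import math
import random


def chi2_p_value_2x2(a, b, c, d):
    """P-value of Pearson's chi-square test (1 df, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        return 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def simulate(n_sims=300, n_patients=600, birth_rate=0.3, seed=1):
    """Return (false-positive rate with a pre-specified median cut-off,
    false-positive rate when the 'best' of several cut-offs is chosen)."""
    rng = random.Random(seed)
    # Number of patients below each candidate cut-off (10th..90th percentile).
    cut_sizes = [n_patients * pct // 100 for pct in range(10, 95, 5)]
    median_size = n_patients // 2
    hits_fixed = hits_best = 0
    for _ in range(n_sims):
        # The outcome is generated independently of the marker (null is true).
        patients = sorted(
            (rng.lognormvariate(0.0, 1.0), rng.random() < birth_rate)
            for _ in range(n_patients)
        )
        # cum_births[k] = live births among the k patients with lowest marker.
        cum_births = [0]
        for _, live_birth in patients:
            cum_births.append(cum_births[-1] + live_birth)
        total = cum_births[-1]

        def p_value(k):
            low = cum_births[k]
            high = total - low
            return chi2_p_value_2x2(low, k - low, high, n_patients - k - high)

        hits_fixed += p_value(median_size) < 0.05
        hits_best += min(p_value(k) for k in cut_sizes) < 0.05
    return hits_fixed / n_sims, hits_best / n_sims


if __name__ == "__main__":
    fixed, best = simulate()
    print(f"pre-specified median cut-off: {fixed:.2f}")
    print(f"best of 17 candidate cut-offs: {best:.2f}")
```

Because the median is one of the candidate cut-offs, the data-driven rate can never be lower than the pre-specified one. Under these assumptions it would be expected to lie well above the nominal 5% level, although no true association exists in the simulated population.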

Furthermore, when the data of the present study are analysed in an objective way, i.e. in 10/50/90 or 25/50/75 percentiles, the finding of a non-linear effect with a negative association for low and high serum progesterone values no longer exists. As can be seen in Supplementary Fig. S6, when the data are grouped according to the 25/50/75 percentiles, then in the early luteal phase group, low progesterone seems to be associated with the highest probability of clinical pregnancy. Similarly, in the mid-luteal serum progesterone group, when data are grouped and analysed according to the 10/50/90 percentiles, high progesterone seems to be associated with the highest probability of clinical pregnancy. Such inconsistencies, from both a statistical and a physiological point of view, create strong doubts regarding the validity of the authors’ claim that ‘low as well as high serum P4 levels in the early and mid-luteal phase reduce the chance of a live birth following IVF treatment with fresh embryo transfer’.

Instead of reaching any conclusion by just looking at diagrams (i.e. using visual statistics), the actual effect sizes should be examined and, most importantly, the 95% CIs of these effect sizes. Surprisingly, although Figs 2C and 2F seem to justify the conclusion of the authors, the results presented in Table III do not support it, since in both the early and the mid-luteal group, the 95% CIs of all adjusted ORs for live birth when compared with the reference group (the ‘optimal’ group as defined by the authors) include 1.0, thus indicating the lack of statistically significant differences. Therefore, despite the use of a data-dependent approach, the authors’ claimed association cannot be supported on the basis of the data presented, something which is neither emphasized in the Results section nor discussed as a limitation in the Discussion section.
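The reasoning about CIs can be made concrete with a short sketch. It computes an unadjusted OR with a Wald 95% CI from a 2x2 table, which is a simplification of the adjusted ORs reported in Table III. The counts used (30 live births in 100 cycles versus 40 in 120 cycles in a reference group) are hypothetical and are not data from Thomsen et al.; the function name is likewise an assumption.

```python
import math


def odds_ratio_with_ci(events, n, events_ref, n_ref, z=1.96):
    """Unadjusted odds ratio versus a reference group with a Wald CI.

    z=1.96 gives an approximate 95% CI. All four cells of the 2x2 table
    must be non-zero for the standard error to be defined.
    """
    a, b = events, n - events              # group of interest: events, non-events
    c, d = events_ref, n_ref - events_ref  # reference group: events, non-events
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    or_, lower, upper = odds_ratio_with_ci(30, 100, 40, 120)
    print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
    print("CI includes 1.0:", lower <= 1.0 <= upper)
```

In this hypothetical example the point estimate lies below 1, but the interval spans 1.0, so the data are compatible with no difference between the groups. A figure showing only the point estimates would not reveal this.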

Lastly, it should be noted that the registered trial protocol (NCT02129998) reports a target sample size of ~900 patients and declares ongoing pregnancy as the primary endpoint. The sample size in the published study is 602 patients and ongoing pregnancy is not reported.

In conclusion, this is a prospective observational study without a priori defined cut-offs and with discrepancies between the registered protocol and the eventual publication, which arrives at a conclusion that is not statistically supported even by the data-dependent analysis performed. These significant flaws call the conclusion of the study by Thomsen et al. (2018) seriously into question and need to be brought to the attention of the readership of Human Reproduction.

Conflict of interest

C.A.V. is supported by a National Health and Medical Research Council (NHMRC) Early Career Fellowship (GNT1147154). C.A.V. also reports grants, personal fees and non-financial support from Merck; personal fees and non-financial support from Merck Sharp & Dohme; and non-financial support from Ferring, outside the submitted work. B.W.M. is supported by an NHMRC Practitioner Fellowship (GNT1082548). B.W.M. reports consultancy for ObsEva, Merck, Merck KGaA and Guerbet, outside the submitted work. E.M.K. reports no conflicts of interest.

References

Altman DG. Categorising continuous variables. Br J Cancer 1991;64:975.

Altman DG, Lausen B, Sauerbrei W, Schumacher M. Dangers of using ‘optimal’ cutpoints in the evaluation of prognostic factors. J Natl Cancer Inst 1994;86:829–835.

Thomsen LH, Kesmodel US, Erb K, Bungum L, Pedersen D, Hauge B, Elbaek HO, Povlsen BB, Andersen CY, Humaidan P. The impact of luteal serum progesterone levels on live birth rates-a prospective study of 602 IVF/ICSI cycles. Hum Reprod 2018;33:1506–1516.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)