I read with great interest the recent article by Holmegard et al (1), who described observational associations of sex hormones with ischemic stroke in a community-based study. They used a large cohort study with 1087 incident cases of stroke. Their study suggests an inverse association of testosterone with the risk of stroke. The authors reported adjusted hazard ratios with mediation analyses for body mass index and hypertension (1). Moreover, it is unknown whether changes in sex hormones over time affect the long-term risk of stroke. In keeping with the fundamental goals of epidemiology, one should ask the following: 1) do observational associations provide evidence that a biomarker has additional predictive utility over and above relevant clinical information; and 2) to what extent does additional information on a biomarker contribute to the etiological understanding of the disease and potentially to the identification of novel treatment targets?

Many biomarkers, eg, those related to genome-proteome-metabolite signals, and dozens of prediction models have been produced to better predict the risk of developing multifactorial health outcomes in population-based or patient-based cohorts (2–4). As recommended, the performance of a prediction model should be rigorously assessed using classical prediction measures, including calibration by goodness-of-fit statistics and discrimination on the basis of the c statistic (a rank classifier) (2, 5, 6). Discrimination is the ability of the model to distinguish those at high risk of developing the outcome of interest (those who will have a stroke, in the study by Holmegard et al [1]) from those at low risk (those who will not develop the outcome) (5). It is assessed using the receiver-operating characteristic curve, which plots the sensitivity (true-positive rate) of the model against 1-specificity (false-positive rate). The c statistic, or the area under the receiver-operating characteristic curve, ranges from 0.5 (threshold of random chance) to 1.0 (perfect discrimination) (5). Calibration addresses whether the actual predicted risk is in agreement with the proportion of individuals who developed the outcome (5).
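As a minimal sketch of these two measures, assuming a binary outcome without censoring, the c statistic can be computed as the proportion of case/noncase pairs in which the case has the higher predicted risk, and calibration can be summarized crudely as the observed event proportion minus the mean predicted risk. The function names and the data below are hypothetical and are not taken from Holmegard et al (1).

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Concordance for a binary outcome: share of case/noncase pairs in
    which the case has the higher predicted risk (ties count as half)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    noncases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one noncase")
    concordant = 0.0
    for risk_case, risk_noncase in product(cases, noncases):
        if risk_case > risk_noncase:
            concordant += 1.0
        elif risk_case == risk_noncase:
            concordant += 0.5
    return concordant / (len(cases) * len(noncases))


def calibration_in_the_large(risks, outcomes):
    """Observed event proportion minus mean predicted risk (0 is ideal)."""
    return sum(outcomes) / len(outcomes) - sum(risks) / len(risks)


# Hypothetical predicted risks and observed outcomes (1 = event).
risks = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]
outcomes = [0, 0, 1, 0, 1, 1]

# Halving every predicted risk keeps the ranking, so the c statistic
# is unchanged, whereas the calibration summary moves.
halved = [r / 2 for r in risks]

print("c statistic:", c_statistic(risks, outcomes))
print("c statistic, halved risks:", c_statistic(halved, outcomes))
print("calibration:", calibration_in_the_large(risks, outcomes))
print("calibration, halved risks:", calibration_in_the_large(halved, outcomes))
```

Because the c statistic depends only on the ranking of predicted risks, it cannot by itself show whether the predicted risks are correct in absolute terms; that is the role of calibration.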

In clinical prediction of future outcomes, the actual predicted risk matters, because the c statistic depends only on the ranking of predicted risks: differences in risk between people at low, intermediate, or high risk may have the same impact on the c statistic (5). The c statistic is therefore somewhat insensitive to changes in actual predicted risk when new risk predictors or biomarkers are considered for risk prediction (5, 7). For example, sex hormones as predictors of stroke, even with a multivariable-adjusted hazard ratio of 1.46, may lead to little improvement in prediction when added to relevant clinical information. Hence, it is not possible to judge the predictive utility of sex hormones for stroke in the general population from the results of the study by Holmegard et al (1). Of note, the potential for risk stratification is paramount, that is, whether a new model with biomarkers can accurately reclassify patients into risk strata and change health care decision making (5, 7). Even if clinically relevant risk categories (eg, the cut points of 5% and 20% for predicting the 10-year risk of cardiovascular disease [CVD]) are not available, one could use reclassification measures, eg, the category-free net reclassification improvement, to assess the predictive utility of biomarkers (5, 7). Furthermore, whether statistical evidence of predictive value for a biomarker is clinically relevant depends on other considerations, such as the reliability of the biomarker measurement, the impact of sex and age on the risk estimates, the shape of associations, the financial costs or related burden of such measurement, and potential risks to the patients (6, 7).
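To illustrate, the category-free net reclassification improvement can be sketched as the net proportion of events whose predicted risk moves up under the new model minus the net proportion of nonevents whose predicted risk moves up. This is a minimal sketch for a binary outcome; the function name and all values are hypothetical.

```python
def category_free_nri(old_risks, new_risks, outcomes):
    """Category-free NRI for a binary outcome:
    [P(up | event) - P(down | event)] + [P(down | nonevent) - P(up | nonevent)].
    """
    events = [(o, n) for o, n, y in zip(old_risks, new_risks, outcomes) if y == 1]
    nonevents = [(o, n) for o, n, y in zip(old_risks, new_risks, outcomes) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one nonevent")

    def net_upward(pairs):
        # Net proportion of individuals whose risk increased under the new model.
        up = sum(1 for old, new in pairs if new > old)
        down = sum(1 for old, new in pairs if new < old)
        return (up - down) / len(pairs)

    return net_upward(events) - net_upward(nonevents)


# Hypothetical predicted risks without and with the biomarker.
old_risks = [0.10, 0.20, 0.30, 0.40, 0.15, 0.25, 0.35, 0.05]
new_risks = [0.15, 0.25, 0.25, 0.50, 0.10, 0.20, 0.40, 0.05]
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]

nri = category_free_nri(old_risks, new_risks, outcomes)
print("category-free NRI:", nri)
```

In this toy example the event component is (3 - 1)/4 = 0.50 and the nonevent component is (2 - 1)/4 = 0.25, giving 0.75. The measure ranges from -2 to 2 and counts any change in predicted risk, however small, so it complements rather than replaces the c statistic and calibration.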

Another aspect of biomarker-disease associations is to explore whether biomarkers are causally associated with the outcome of interest. Observational studies cannot provide such evidence because of the issues of confounding and reverse causality (8). To investigate the nature of the relationship between biomarkers and outcome, an integration of epidemiological and genetic data (called Mendelian randomization) has been successfully introduced, because genetic variants can act as unconfounded proxies of certain traits (8, 9). A number of common genetic factors have been identified for cardiometabolic outcomes using genome-wide association studies (GWAS). However, these studies have shown that each of the common genetic factors confers only minimal risk of disease (9), and that common outcomes, such as diabetes and CVD, share common genetic and environmental components. A key question is how to translate the associated genetic loci into biochemical variation in phenotypic function. The action of the proteome and metabolome is integral to the function of genes, and the genome interacts with the environment. In fact, GWAS provide a second source of information, namely genotype-biomarker associations (9). The underlying concept is that if a given biomarker is causally related to CVD, the genetic variants that are associated with this biomarker should also be associated with the outcome (8, 9). The search for well-designed GWAS, or the conduct of new analyses where no such study exists, is important to address causal inference for biomarkers (from multi-omics data) and CVD.
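The underlying concept can be sketched with the ratio (Wald) estimator for a single genetic variant, in which the variant-outcome association is divided by the variant-biomarker association. This is a minimal sketch under the usual instrumental variable assumptions, with a first-order approximation of the standard error; the function name and all values are hypothetical and are not estimates from any of the cited studies.

```python
import math


def wald_ratio(beta_gx, beta_gy, se_gy):
    """Ratio estimate of the effect of a biomarker on an outcome.

    beta_gx: per-allele association of the variant with the biomarker
    beta_gy: per-allele association of the variant with the outcome
             (eg, a log odds ratio)
    se_gy:   standard error of beta_gy
    The standard error is a first-order approximation that ignores
    uncertainty in beta_gx.
    """
    if beta_gx == 0:
        raise ValueError("variant must be associated with the biomarker")
    estimate = beta_gy / beta_gx
    se = se_gy / abs(beta_gx)
    return estimate, se


# Hypothetical summary statistics for one variant.
beta_gx = 0.20  # SD change in biomarker per allele
beta_gy = 0.05  # log odds ratio of outcome per allele
se_gy = 0.02

estimate, se = wald_ratio(beta_gx, beta_gy, se_gy)
low, high = estimate - 1.96 * se, estimate + 1.96 * se

print(f"log odds ratio per SD of biomarker: {estimate:.2f} ({low:.2f} to {high:.2f})")
print(f"odds ratio per SD of biomarker: {math.exp(estimate):.2f}")
```

If the variant were not associated with the outcome (beta_gy close to 0), the ratio would be close to 0, which is consistent with the absence of a causal effect of the biomarker, provided the variant is a valid instrument.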

Finally, rigorous pathway analyses and experimental studies of causal biomarkers, as complementary strands of research, can provide better insights into the underlying mechanisms and perhaps introduce novel drug targets for the treatment and prevention of cardiometabolic outcomes.

Acknowledgments

This work was supported by The Netherlands Organization for Scientific Research (NWO) and the Medical Research Council United Kingdom (Grant MC_UU_12015/1). A.A. is supported by a Rubicon grant from the NWO (Project 825.13.004).

Disclosure Summary: The author has nothing to disclose.

Abbreviations

     
  • CVD: cardiovascular disease
  • GWAS: genome-wide association studies

References

1. Holmegard HN, Nordestgaard BG, Jensen GB, Tybjaerg-Hansen A, Benn M. Sex hormones and ischemic stroke: a prospective cohort study and meta-analyses. J Clin Endocrinol Metab. In press.

2. Abbasi A, Peelen LM, Corpeleijn E, et al. Prediction models for risk of developing type 2 diabetes: systematic literature search and independent external validation study. BMJ (Clin Research Ed.). 2012;345:e5900.

3. Buijsse B, Simmons RK, Griffin SJ, Schulze MB. Risk assessment tools for identifying individuals at risk of developing type 2 diabetes. Epidemiol Rev. 2011;33:46–62.

4. Echouffo-Tcheugui JB, Kengne AP. Risk models to predict chronic kidney disease and its progression: a systematic review. PLoS Med. 2012;9:e1001344.

5. Cook NR. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007;115:928–935.

6. Moons KG, Altman DG, Reitsma JB, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162:W1–W73.

7. Steyerberg EW, Vedder MM, Leening MJ, et al. Graphical assessment of incremental value of novel markers in prediction models: from statistical to decision analytical perspectives. Biometrische Zeitschrift. 2015;57:556–570.

8. Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27:1133–1163.

9. Burgess S, Timpson NJ, Ebrahim S, Davey Smith G. Mendelian randomization: where are we now and where are we going? Int J Epidemiol. 2015;44:379–388.