David Gómez de Antonio, Silvana Crowley Carrasco, Alejandra Romero Román, Ana Royuela, Mariana Gil Barturen, Carme Obiols, Sergi Call, Ínigo Royo, José Luis Recuero, Alberto Cabanero, Nicolás Moreno, Raúl Embún, Spanish Group of Video Assisted Thoracic Surgery (GEVATS) , External validation of the European Society of Thoracic Surgeons morbidity and mortality risk models, European Journal of Cardio-Thoracic Surgery, Volume 62, Issue 3, September 2022, ezac170, https://doi.org/10.1093/ejcts/ezac170
Abstract
A wide variety of predictive models of postoperative risk exists; although some are specific to thoracic surgery, none is widely used. The European Society for Thoracic Surgery has recently updated its models of cardiopulmonary morbidity (Eurolung 1) and 30-day mortality (Eurolung 2) after anatomic lung resection. The aim of our work is to carry out the external validation of both models in a multicentre national database.
External validation of Eurolung 1 and Eurolung 2 was evaluated through calibration (calibration plot, Brier score and Hosmer–Lemeshow test) and discrimination [area under receiver operating characteristic curves (AUC ROC)], on a national multicentre database of 2858 patients undergoing anatomic lung resection between 2016 and 2018.
For Eurolung 1, calibration plot showed suboptimal overlapping (slope = 0.921) and a Hosmer–Lemeshow test and Brier score of P = 0.353 and 0.104, respectively. In terms of discrimination, AUC ROC for Eurolung 1 was 0.653 (95% confidence interval, 0.623–0.684). In contrast, Eurolung 2 showed a good calibration (slope = 1.038) and a Hosmer–Lemeshow test and Brier score of P = 0.234 and 0.020, respectively. AUC ROC for Eurolung 2 was 0.760 (95% confidence interval, 0.701–0.819).
Thirty-day mortality score (Eurolung 2) seems to be transportable to other anatomic lung-resected patients. On the other hand, postoperative cardiopulmonary morbidity score (Eurolung 1) seems not to have sufficient generalizability for new patients.
INTRODUCTION
Surgical risk prediction models are an invaluable tool for assessing perioperative results, counselling patients and benchmarking [1–4].
Although there are a number of risk models for thoracic surgery, their use in clinical practice is far from widespread [5–9].
In 2017, the European Society of Thoracic Surgeons (ESTS) Database Committee published the first risk prediction models (Eurolung 1 for cardiopulmonary morbidity and Eurolung 2 for 30-day mortality) after anatomical lung resection, based on data from around 40 000 patients [5].
A more parsimonious update has recently been published, retaining 5 variables for Eurolung 1 [age > 70, male sex, predicted postoperative forced expiratory volume in 1 s (ppoFEV1) < 70%, extended resection and open approach] and 6 for Eurolung 2 (age > 70, male sex, ppoFEV1 < 70%, open approach, body mass index < 18.5 and pneumonectomy) [6].
If these scores are to be extensively used by the community of thoracic surgeons, they should be externally validated in a different dataset, which is the scope of our work.
MATERIALS AND METHODS
Ethics statement
The project has been approved by the respective research ethics committees of all participating centres. All patients signed a specific informed consent for the use of their clinical data for scientific purposes.
Modelling cohort
The ESTS dataset (Table 1) contained information from 82 383 anatomic lung resections performed from 2007 to 2018 in over 200 contributing European units [6], using standardized definitions of variables and outcomes agreed upon in a joint Society of Thoracic Surgeons–ESTS publication in 2015 [4].
Table 1: European Society for Thoracic Surgery and Spanish Group of Video Assisted Thoracic Surgery databases comparison
Variable | ESTS | GEVATS |
---|---|---|
Time period | 2007–2018 | 2016–2018 |
N | 82 383 | 2858 |
Age (years) | 64.6 (57.6–71.2) | 66 (59–72) |
Male gender | 53 780 (65) | 2015 (70) |
BMI (kg/m2) | 25.1 (22.4–28.3) | 26.5 (23.7–29.6) |
CVD | 2434 (3) | 148 (5) |
CAD | 6725 (8.2) | 253 (7.9) |
Thoracotomy | 61 252 (74) | 1313 (46) |
ppoFEV1 (%) | 73 (59–87) | 69 (57–82) |
Extended resections | 4722 (5.7) | 144 (5) |
CKD | 4579 (5.6) | 72 (2.5) |
Cardiopulmonary morbidity (%) | 15.7 | 12.4 |
30-Day mortality (%) | 2.2 | 2.3 |
Results are expressed as median and interquartile range for numeric variables and as count and percentage of the total for categorical variables.
BMI: body mass index; CAD: coronary artery disease; CKD: chronic kidney disease; CVD: cerebrovascular disease; ESTS: European Society for Thoracic Surgery; GEVATS: Spanish Group of Video Assisted Thoracic Surgery; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.
The cardiopulmonary complications collected in the ESTS database were respiratory failure, need for reintubation, prolonged mechanical ventilation, pneumonia, atelectasis requiring bronchoscopy, pulmonary oedema, pulmonary embolism, acute respiratory distress syndrome/acute lung injury, arrhythmia requiring treatment, acute myocardial ischaemia, acute heart failure, transient ischaemic stroke/stroke and acute kidney injury [5].
Mortality was defined as death in the hospital or within 30 days of surgery if the patient was discharged [5].
Non-anatomical lung resections or cases without information regarding mortality or perioperative complications were excluded.
Random samples from ESTS Institutional Accreditation Programme eligible units were audited.
For updating Eurolung 1 and 2 risk models, only predictors associated with the outcomes in the previous version of the model [5] were included (Table 2).
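Table 2: Predictors considered for the updated Eurolung 1 and Eurolung 2 models
Model | Predictors |
---|---|
Eurolung 1 | Age, sex, ppoFEV1, CAD, CVD, CKD, thoracotomy and extended resection |
Eurolung 2 | Age, sex, ppoFEV1, CAD, CVD, pneumonectomy, thoracotomy, extended resection and BMI |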
BMI: body mass index; CAD: coronary artery disease; CKD: chronic kidney disease; CVD: cerebrovascular disease; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.
Missing values were imputed with the mean of the non-missing values for numerical variables and with the most frequent category for categorical variables. If >10% of data were missing for a certain variable, it was not used for the analysis.
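The imputation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the ESTS dataset: the function name `impute_column`, the use of `None` to mark a missing value and the `kind` argument are all hypothetical.

```python
from collections import Counter
from statistics import mean

# Threshold taken from the text: variables with >10% missing data are dropped.
MAX_MISSING_FRACTION = 0.10


def impute_column(values, kind):
    """Fill missing entries (None) in one variable.

    kind is "numeric" (mean imputation) or "categorical" (most frequent
    category). Returns None when the variable exceeds the missing-data
    threshold and should be left out of the analysis.
    """
    if not values:
        return []
    n_missing = sum(v is None for v in values)
    if n_missing / len(values) > MAX_MISSING_FRACTION:
        return None  # variable not used for the analysis
    present = [v for v in values if v is not None]
    if kind == "numeric":
        fill = mean(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


# Hypothetical example: 1 missing value out of 20 (5%), so imputation applies.
bmi = [24.0] * 10 + [26.0] * 9 + [None]
approach = ["open"] * 12 + ["vats"] * 7 + [None]
print(impute_column(bmi, "numeric")[-1])
print(impute_column(approach, "categorical")[-1])
```

Single-value imputation of this kind keeps every case in the dataset but ignores the uncertainty of the filled-in values, which is one reason the 10% cut-off matters.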
To develop the score, multiple regression with a stepwise backward selection algorithm was used, with Akaike's information criterion to select the model that optimized measures of goodness of fit. Predictors from the full model were sequentially eliminated based on the Wald statistical test.
Internal validation was performed by ten-fold cross-validation, calculating discrimination and calibration parameters. Odds ratios were weighted among variables to assign a score to each predictor, and risk subgroups were defined according to their scores and a similar incidence of 30-day mortality or morbidity within the group.
Validation cohort
As a validation cohort, we used a Spanish multicentre database from the Spanish Group of Video Assisted Thoracic Surgery (GEVATS), designed to analyse the effect of videothoracoscopic anatomical lung resections compared to an open approach on patient outcomes [10].
The GEVATS database is a prospective multicentre dataset of all anatomical lung resections performed from 20 December 2016 to 20 March 2018 in 33 participating thoracic surgery units (Table 1). Exclusion criteria were age younger than 18 years and simultaneous bilateral procedures.
We excluded those cases simultaneously submitted to the ESTS database during that time period.
The research project was approved by the local ethics committees of all the participating centres, and informed consent was obtained from the recruited patients to use their clinical data for scientific purposes.
Variable definitions were based on the standardization document of the Society of Thoracic Surgeons–ESTS [9]. Where there is still no consensus, we adopted the definition proposed by the ESTS Database Committee. All complications were categorized into respiratory, cardiovascular and other complications, in accordance with the Clavien–Dindo severity classification [11].
Postoperative morbidity cases were those occurring within the first 30 days or before discharge, and mortality was recorded at 90 days.
Data were randomly audited. Anonymized discharge reports from 20% of cases from every unit were compared with the database records by the scientific board of the GEVATS group. In addition, the database had filters to exclude implausible values or incompatibility between 2 or more variables.
Cases with missing values in key variables (type of resection, surgical approach and clinical status at discharge) were excluded. Data from centres with <15 reported cases or <10% of eligible patients included during the recruitment period were also excluded.
The audit process showed an 83% overall recruitment rate and a 98% data accuracy across all departments, representing ∼50% of all anatomic lung resections performed in Spain during the recruitment period [10].
Statistical analysis
Briefly, calibration represents the agreement between predicted and observed risks. To assess calibration, we used calibration plots, Brier score and Hosmer–Lemeshow test.
The calibration plot represents the observed frequencies versus predicted probabilities. The ideal situation is the diagonal line, where predicted probabilities are equal to observed probabilities [12–14]. The slope of the curve estimates the extremeness of predicted probabilities.
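To make the construction of a calibration plot concrete, the following minimal Python sketch computes the points that such a plot displays: patients are sorted by predicted risk, split into equally sized groups, and the mean predicted probability of each group is paired with its observed event rate. The function name `calibration_points` and the toy data are hypothetical; this is not the Stata routine used in the study.

```python
def calibration_points(predicted, observed, n_groups=10):
    """Return (mean predicted risk, observed event rate, group size) per risk group.

    predicted: predicted probabilities, one per patient
    observed:  outcomes coded 1 (event) or 0 (no event)
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    pairs = sorted(zip(predicted, observed))  # order patients by predicted risk
    n = len(pairs)
    points = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # fewer patients than groups
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_rate = sum(y for _, y in group) / len(group)
        points.append((mean_pred, obs_rate, len(group)))
    return points


# Toy example with two risk groups; a well-calibrated model gives points
# close to the diagonal (mean predicted risk close to observed rate).
for mean_pred, obs_rate, size in calibration_points(
        [0.05, 0.10, 0.15, 0.20, 0.60, 0.70, 0.80, 0.90],
        [0, 0, 0, 1, 1, 0, 1, 1], n_groups=2):
    print(f"predicted {mean_pred:.3f}  observed {obs_rate:.3f}  n={size}")
```

Points below the diagonal correspond to groups in which the model overestimates risk, and points above it to groups in which risk is underestimated.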
The Brier score is the mean squared difference between the forecast probability of an event and its actual occurrence (coded 1 if the event occurred and 0 otherwise), and it ranges between 0 and 1. A perfect forecaster would have a Brier score of 0 and a perfect misforecaster would have a Brier score of 1 [12–14].
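As a minimal sketch of this definition, the Python functions below compute the Brier score from paired predictions and outcomes. The second helper, `prevalence_only_brier`, is added here purely for orientation and is not part of the study's methods: it gives the score of a non-informative model that assigns the overall event rate to every patient, which equals prevalence × (1 − prevalence).

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)


def prevalence_only_brier(prevalence):
    """Brier score of a model that predicts the overall event rate for everyone."""
    return prevalence * (1 - prevalence)


# Toy data: two confident, correct predictions.
print(brier_score([0.1, 0.9], [0, 1]))

# Reference values for the event rates reported in Table 1.
print(round(prevalence_only_brier(0.124), 3))  # cardiopulmonary morbidity
print(round(prevalence_only_brier(0.023), 3))  # 30-day mortality
```

Because the attainable range shrinks as events become rarer, a Brier score is easier to judge against the non-informative value for the same event rate (about 0.109 for an event rate of 12.4% and about 0.022 for 2.3%) than against the theoretical bounds of 0 and 1.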
The Hosmer–Lemeshow test divides the cohort into deciles based on predicted values and then compares observed with expected rates, with significant differences (P < 0.05) indicating lack of fit [12–14].
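The test can be reproduced by hand from a table of observed and expected events per decile. The Python sketch below does so for the Eurolung 1 decile counts reported in Table 3. Two points are assumptions rather than statements from the methods: the degrees of freedom are taken as the number of groups minus 2, and the chi-squared tail probability uses the closed form that holds for even degrees of freedom, which avoids any external library. Function names are hypothetical.

```python
import math

# (observed events, expected events, total patients) per decile, from Table 3.
EUROLUNG1_DECILES = [
    (11, 13.5, 282), (20, 19.3, 282), (23, 23.5, 282), (21, 28.0, 281),
    (35, 32.1, 282), (29, 36.5, 282), (38, 41.6, 281), (47, 49.1, 283),
    (53, 60.6, 281), (69, 81.3, 281),
]


def hosmer_lemeshow_statistic(groups):
    """Sum of (observed - expected)^2 / expected over events and non-events."""
    stat = 0.0
    for obs, exp, total in groups:
        stat += (obs - exp) ** 2 / exp
        stat += ((total - obs) - (total - exp)) ** 2 / (total - exp)
    return stat


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


stat = hosmer_lemeshow_statistic(EUROLUNG1_DECILES)
df = len(EUROLUNG1_DECILES) - 2  # assumption: groups minus 2
print(f"chi-squared = {stat:.2f}, df = {df}, P = {chi2_sf_even_df(stat, df):.3f}")
```

Because the expected counts in the table are rounded to one decimal place, the P-value obtained this way should land close to, but not exactly on, the value reported in the Results.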
In addition, we assessed the discrimination of both risk models, that is, the ability of the model to separate individuals with and without the outcome. Those with the outcome event should have a higher predicted risk than those without it. For this purpose, we used the area under the receiver operating characteristics curve (AUC ROC), which is considered acceptable if above 0.7 [12–14].
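The AUC ROC has a direct interpretation that a short Python sketch can illustrate: it is the proportion of all (patient with event, patient without event) pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. The function name `auc_roc` and the toy data are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def auc_roc(predicted, observed):
    """Probability that a random case is ranked above a random non-case."""
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    controls = [p for p, y in zip(predicted, observed) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


# Toy example: 3 of the 4 case/non-case pairs are ranked correctly.
print(auc_roc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of patients with and without the outcome.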
All statistical analyses were performed using Stata/IC v.16 (StataCorp. 2019. Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC).
RESULTS
Eurolung 1 and 2 observed and predicted outcomes in the validation dataset are shown in Tables 3 and 4, respectively.
Table 3: Eurolung 1. Observed and predicted outcomes in the validation dataset
Decile | Probability | Observed events | Expected events | Observed no events | Expected no events | Total patients |
---|---|---|---|---|---|---|
1 | 0.0593 | 11 | 13.5 | 271 | 268.5 | 282 |
2 | 0.0759 | 20 | 19.3 | 262 | 262.7 | 282 |
3 | 0.0912 | 23 | 23.5 | 259 | 258.5 | 282 |
4 | 0.1069 | 21 | 28.0 | 260 | 253.0 | 281 |
5 | 0.1207 | 35 | 32.1 | 247 | 249.9 | 282 |
6 | 0.1383 | 29 | 36.5 | 253 | 245.5 | 282 |
7 | 0.1585 | 38 | 41.6 | 243 | 239.4 | 281 |
8 | 0.1901 | 47 | 49.1 | 236 | 233.9 | 283 |
9 | 0.2431 | 53 | 60.6 | 228 | 220.4 | 281 |
10 | 0.4197 | 69 | 81.3 | 212 | 199.7 | 281 |
Table 4: Eurolung 2. Observed and predicted outcomes in the validation dataset
Decile | Probability | Observed events | Expected events | Observed no events | Expected no events | Total patients |
---|---|---|---|---|---|---|
1 | 0.0033 | 1 | 0.6 | 274 | 274.4 | 275 |
2 | 0.0050 | 1 | 1.1 | 274 | 273.9 | 275 |
3 | 0.0069 | 3 | 1.6 | 272 | 273.4 | 275 |
4 | 0.0089 | 2 | 2.2 | 273 | 272.8 | 275 |
5 | 0.0110 | 3 | 2.7 | 271 | 271.3 | 274 |
6 | 0.0138 | 4 | 3.4 | 271 | 271.6 | 275 |
7 | 0.0171 | 4 | 4.2 | 271 | 270.8 | 275 |
8 | 0.0238 | 10 | 5.5 | 265 | 269.5 | 275 |
9 | 0.0381 | 14 | 8.3 | 261 | 266.7 | 275 |
10 | 0.1680 | 22 | 17.9 | 252 | 256.1 | 274 |
In terms of calibration, for Eurolung 1, the calibration plot showed limited overlap with the ideal line (Fig. 1), with a slope of 0.921 and a Brier score of 0.104. The Hosmer–Lemeshow test had a P-value of 0.353. AUC ROC for Eurolung 1 was 0.653 (95% confidence interval, 0.623–0.684) (Fig. 2).

Figure 1: Calibration plot. Eurolung 1. Calibration in the large measures whether the predicted prevalence is less than (calibration in the large < 0) or greater than the observed prevalence (calibration in the large > 0). AUC: area under the curve; CITL: calibration in the large; E:O: ratio between expected and observed events.

Figure 2: Discrimination. Eurolung 1. ROC: receiver operating characteristics.
In contrast, the slope from the calibration plot in Eurolung 2 was 1.038, with an acceptable overlap, and the Brier score was 0.020. The Hosmer–Lemeshow test for Eurolung 2 had a P-value of 0.234 (Fig. 3). Moreover, AUC ROC for Eurolung 2 was 0.760 (95% confidence interval, 0.701–0.819) (Fig. 4).

Figure 3: Calibration plot. Eurolung 2. Calibration in the large measures whether the predicted prevalence is less than (calibration in the large < 0) or greater than the observed prevalence (calibration in the large > 0). AUC: area under the curve; CITL: calibration in the large; E:O: ratio between expected and observed events.

DISCUSSION
Surgical risk models are useful for estimating the probability that a given patient will develop complications or die postoperatively. This information is intended to help us inform our patients about the expectations of treatment success, but it can also be used for comparisons or benchmarking between different surgical care units.
There are different strategies for validating a surgical risk score, i.e. for assessing the model's ability to accurately predict what it is intended to measure: internal validation, temporal validation and external validation. The first approach employs different techniques to reuse data from the sample in which the score was developed (data splitting or cross-validation and bootstrap), the second uses a new prospective cohort from the same centre(s) where the proposed model was developed and the last involves data from a cohort of patients from centres other than those where the model was developed. External validation is the preferred way of validating a surgical risk model when assessing its generalizability [15].
External validation of the mortality risk predictor model (Eurolung 2) yields favourable results. In the calibration plot (Fig. 3), Eurolung 2 underestimates risk at the highest predicted probabilities (the curve diverges upwards from the dotted line), starting from the last decile, because the prevalence of the event is very low in the sample. However, the values of the statistical calibration tests and the discrimination parameters are good (Fig. 4). Overall, we can say that the external validation for Eurolung 2 is satisfactory.
In the case of the cardiopulmonary morbidity predictor model, the results are not so positive.
In the calibration assessment, the model overestimates risk (the curve is below the dotted line) (Fig. 1) and the Brier score is suboptimal. The Hosmer–Lemeshow test was not statistically significant, but this test has been criticized because of its limited statistical power to assess poor calibration and for being oversensitive in large samples [12, 16–18]. Moreover, it gives no information on the direction or magnitude of any miscalibration. For these reasons, it is important to assess model performance with a variety of indicators. In this sense, the AUC ROC also falls below acceptable values (Fig. 2).
There are different arguments for the low generalizability of Eurolung 1.
The ESTS database was voluntary and not prospective, so there may be a non-negligible amount of missing data, which are usually not missing at random. For the same reason, it is possible that certain patients were selectively included in the database (selection bias), for example by not including patients who developed a perioperative complication or who underwent a particular type of surgical approach, and therefore the sample may have been biased in some sense [19].
In the same way, the evaluator of the outcome occurrence was not blinded to the determination of the predictor, so there was a risk of using predictors in the assessment of outcomes (incorporation bias). Although this is less important when assessing hard outcomes such as mortality, it is of the utmost importance if there is a certain level of subjective interpretation [20, 21], such as is the case when collecting morbidity information.
Only a minority of the data in the ESTS database were audited [6]; so even though the selected variables and outcomes were clearly defined [4], some coding or data incorporation errors may have occurred.
Importantly, not all key predictor variables for postoperative cardiopulmonary morbidity were included in the model. The absence of predicted postoperative diffusing capacity of carbon monoxide in the ESTS database, an established predictor of postoperative morbidity and mortality [22, 23], is notable, but other potential factors contributing to the occurrence of postoperative complications, such as albumin levels, prophylactic anticoagulation or the use of antiarrhythmic drugs, were also not recorded in the dataset.
Furthermore, differences in the case mix between the study and the validation cohort may also have affected these results. The percentage of subjects operated on by minimally invasive approaches was significantly higher in the validation cohort (54% vs 26%), and the study cohort also covered a significantly longer period of time, during which enhanced recovery after surgery protocols and refinements in both anaesthetic and surgical techniques have been widely implemented [24].
This work has certain limitations.
The GEVATS database was also a voluntary database, therefore subject to some of the biases that the modelling database faces. However, it has undergone a certain degree of auditing and represents a homogeneous sample of ∼50% of all patients undergoing anatomical lung resection in Spain in a very short period of time [10]. We therefore consider it a valuable cohort of patients to conduct this type of study.
Both samples are quite similar in terms of demographic variables and patient complexity, yet they were intended to analyse different outcomes and cover time spans of different lengths. Moreover, as mentioned previously, the proportion of minimally invasive approaches is considerably higher in the validation cohort [19]. All of these factors may certainly have affected the results.
Even though the same variable definitions have been used in both datasets, the morbidity-related outcome variables are prone to subjectivity to some degree, which may also be affecting the reported results [25].
The validation cohort size may also affect the results. A low prevalence of individuals with the outcome relative to the number of variables analysed implies a risk of overfitting, and although no clear sample size considerations have been established for external validation studies, a recommendation for a minimum of 100 events and non-events in the validation cohort has been reported [26]. In this case, whilst morbidity and mortality in the GEVATS sample are low, numbers are well above this threshold.
CONCLUSION
In this validation dataset, Eurolung 2 achieved acceptable generalizability parameters, but for Eurolung 1, results did not provide sufficient evidence for generalizability. We therefore suggest a readjustment, update or recalibration of the perioperative cardiopulmonary risk prediction model to improve the applicability to other populations.
Presented at the 29th European Conference on General Thoracic Surgery, 20–22 June 2021. Virtual meeting.
ACKNOWLEDGEMENTS
We thank Johnson & Johnson for their collaboration in the development of the Spanish VATS Group. We also thank all those who are responsible for the clinical documentation services of each hospital for actively participating in the audit of the dataset. The GEVATS members are as follows: Sergio Bolufer (Servicio de Cirugía Torácica, Hospital General Universitario de Alicante, Alicante), Miguel Congregado (Servicio de Cirugía Torácica, Hospital Universitario Virgen Macarena, Sevilla), Marcelo F. Jimenez (Servicio de Cirugía Torácica, Hospital Universitario de Salamanca, Universidad de Salamanca, IBSAL, Salamanca), Borja Aguinagalde (Servicio de Cirugía Torácica, Hospital Universitario de Donostia, San Sebastián-Donostia), Sergio Amor-Alonso (Servicio de Cirugía Torácica, Hospital Universitario Quironsalud Madrid, Madrid), Miguel Jesús Arrarás (Servicio de Cirugía Torácica, Fundación Instituto Valenciano de Oncología, Valencia), Ana Isabel Blanco Orozco (Servicio de Cirugía Torácica, Hospital Universitario Virgen del Rocío, Sevilla), Marc Boada (Servicio de Cirugía Torácica, Hospital Clinic de Barcelona, Instituto Respiratorio, Universidad de Barcelona, Barcelona), Isabel Cal (Servicio de Cirugía Torácica, Hospital Universitario La Princesa, Madrid), Ángel Cilleruelo Ramos (Servicio de Cirugía Torácica, Hospital Clínico Universitario, Valladolid), Elena Fernández-Martín (Servicio de Cirugía Torácica, Hospital Clínico San Carlos, Madrid), Santiago García-Barajas (Servicio de Cirugía Torácica, Hospital Universitario de Badajoz, Badajoz), María Dolores García-Jiménez (Servicio de Cirugía Torácica, Hospital Universitario de Albacete, Albacete), Jose María García-Prim (Servicio de Cirugía Torácica, Hospital Universitario Santiago de Compostela, Santiago de Compostela), José Alberto Garcia-Salcedos (Servicio de Cirugía Torácica, Hospital Universitario 12 de Octubre, Madrid), Juan José Gelbenzu-Zazpe (Servicio de Cirugía Torácica, Complejo Hospitalario de Navarra, 
Pamplona), Carlos Fernando Giraldo-Ospina (Servicio de Cirugía Torácica, Hospital Regional Universitario, Málaga), María Teresa Gómez Hernández (Servicio de Cirugía Torácica, Hospital Universitario de Salamanca, Universidad de Salamanca, IBSAL, Salamanca), Jorge Hernández (Servicio de Cirugía Torácica, Hospital Universitario Sagrat Cor, Barcelona), Jennifer D. Illana Wolf (Servicio de Cirugía Torácica, Hospital Puerta del Mar, Cádiz), Alberto Jáuregui Abularach (Servicio de Cirugía Torácica, Hospital Universitario Vall d’Hebron, Barcelona), Unai Jiménez (Servicio de Cirugía Torácica, Hospital Universitario Cruces, Bilbao), Iker López Sanz (Servicio de Cirugía Torácica, Hospital Universitario de Donostia, San Sebastián-Donostia), Néstor J. Martínez-Hernández (Servicio de Cirugía Torácica, Hospital Universitario La Ribera, Alcira, Valencia), Elisabeth Martínez-Téllez (Servicio de Cirugía Torácica, Hospital Santa Creu y Sant Pau, Universidad Autónoma de Barcelona, Barcelona), Lucía Milla Collado (Servicio de Cirugía Torácica, Hospital Arnau de Vilanova, Lleida), Roberto Mongil Poce (Servicio de Cirugía Torácica, Hospital Regional Universitario, Málaga), Francisco Javier Moradiellos-Díez (Servicio de Cirugía Torácica, Hospital Universitario Quironsalud Madrid, Madrid), Ramón Moreno-Basalobre (Servicio de Cirugía Torácica, Hospital Universitario La Princesa, Madrid), Sergio B. 
Moreno Merino (Servicio de Cirugía Torácica, Hospital Universitario Virgen Macarena, Sevilla), Florencio Quero-Valenzuela (Servicio de Cirugía Torácica, Hospital Virgen de las Nieves, Granada), María Elena Ramírez-Gil (Servicio de Cirugía Torácica, Complejo Hospitalario de Navarra, Pamplona), Ricard Ramos-Izquierdo (Servicio de Cirugía Torácica, Hospital Universitario de Bellvitge, Hospitalet de Llobregat, Barcelona), Eduardo Rivo (Servicio de Cirugía Torácica, Hospital Universitario Santiago de Compostela, Santiago de Compostela), Alberto Rodríguez-Fuster (Servicio de Cirugía Torácica, Hospital del Mar. IMIM (Instituto de Investigación Médica Hospital del Mar), Barcelona), Rafael Rojo-Marcos (Servicio de Cirugía Torácica, Hospital Universitario Cruces, Bilbao), David Sanchez-Lorente (Servicio de Cirugía Torácica, Hospital Clinic de Barcelona, Instituto Respiratorio, Universidad de Barcelona, Barcelona), Laura Sánchez Moreno (Servicio de Cirugía Torácica, Hospital Universitario Marqués de Valdecilla, Santander), Carlos Simón (Servicio de Cirugía Torácica, Hospital Universitario Gregorio Marañón, Madrid), Juan Carlos Trujillo-Reyes (Servicio de Cirugía Torácica, Hospital Santa Creu y Sant Pau, Universidad Autónoma de Barcelona, Barcelona), Cipriano López García (Servicio de Cirugía Torácica, Hospital Universitario de Badajoz, Badajoz), Juan José Fibla Alfara (Servicio de Cirugía Torácica, Hospital Universitario Sagrat Cor, Barcelona), Julio Sesma Romero (Servicio de Cirugía Torácica, Hospital General Universitario de Alicante, Alicante) and Florentino Hernando Trancho (Servicio de Cirugía Torácica, Hospital Clínico San Carlos, Madrid).
Funding
All costs related to the start-up and maintenance of the GEVATS database were covered by Ethicon, Johnson & Johnson. The authors had freedom of investigation and full control of the design of the study, methods used, outcome parameters and results, data analysis and production of the written report. The GEVATS was awarded a grant from the Spanish Society of Thoracic Surgery as the best national research project of 2015.
Conflict of interest: none declared.
Data Availability Statement
All relevant data are within the manuscript and its supporting information files.
Author contributions
David Gomez de Antonio: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Supervision; Validation; Writing—original draft; Writing—review & editing. Silvana Crowley Carrasco: Writing—review & editing. Alejandra Romero Román: Writing—review & editing. Ana Royuela: Data curation; Formal analysis; Methodology; Writing—review & editing. Mariana Gil Barturen: Writing—review & editing. Carme Obiols: Writing—review & editing. Sergi Call: Writing—review & editing. Raúl Embún: Writing—review & editing. Ínigo Royo: Writing—review & editing. José Luis Recuero: Writing—review & editing. Alberto Cabanero: Writing—review & editing. Nicolás Moreno: Writing—review & editing. Sergio Bolufer: Data collection. Miguel Congregado: Data collection. Marcelo F. Jimenez: Data collection. Borja Aguinagalde: Data collection. Sergio Amor: Data collection. Miguel Jesús Arrarás: Data collection. Ana Isabel Blanco Orozco: Data collection. Marc Boada: Data collection. Isabel Cal: Data collection. Ángel Cilleruelo Ramos: Data collection. Elena Fernández-Martín: Data collection. Santiago García-Barajas: Data collection. María Dolores García-Jiménez: Data collection. Jose María García-Prim: Data collection. José Alberto Garcia-Salcedos: Data collection. Juan José Gelbenzu-Zazpe: Data collection. Carlos Fernando Giraldo-Ospina: Data collection. María Teresa Gómez Hernández: Data collection. Jorge Hernández: Data collection. Jennifer D. Illana Wolf: Data collection. Alberto Jáuregui Abularach: Data collection. Unai Jiménez: Data collection. Iker López Sanz: Data collection. Néstor J. Martínez-Hernández: Data collection. Elisabeth Martínez-Téllez: Data collection. Lucia Milla Collado: Data collection. Roberto Mongil Poce: Data collection. Francisco Javier Moradiellos-Díez: Data collection. Ramón Moreno-Basalobre: Data collection. Sergio B. Moreno Merino: Data collection. Florencio Quero-Valenzuela: Data collection. María Elena Ramírez-Gil: Data collection. 
Ricard Ramos-Izquierdo: Data collection. Eduardo Rivo: Data collection. Alberto Rodríguez-Fuster: Data collection. Rafael Rojo-Marcos: Data collection. David Sanchez-Lorente: Data collection. Laura Sánchez Moreno: Data collection. Carlos Simón: Data collection. Juan Carlos Trujillo-Reyes: Data collection. Cipriano López García: Data collection. Juan José Fibla Alfara: Data collection. Julio Sesma Romero: Data collection. Florentino Hernando Trancho: Data collection.
Reviewer information
European Journal of Cardio-Thoracic Surgery thanks Alessandro Brunelli, Alexander Wahba and the other, anonymous reviewer(s) for their contribution to the peer review process of this article.
ABBREVIATIONS
- AUC ROC: Area under the receiver operating characteristics curve
- ESTS: European Society for Thoracic Surgery
- GEVATS: Spanish Group of Video-assisted Thoracic Surgery
- ppoFEV1: Predicted postoperative forced expiratory volume in 1 s
Author notes
GEVATS members are listed in the Acknowledgements section.