External validation of the European Society of Thoracic Surgeons morbidity and mortality risk models

Gómez de Antonio, David; Crowley Carrasco, Silvana; Romero Román, Alejandra; Royuela, Ana; Gil Barturen, Mariana; Obiols, Carme; Call, Sergi; Royo, Ínigo; Recuero, José Luis; Cabanero, Alberto; Moreno, Nicolás; Embún, Raúl; Spanish Group of Video Assisted Thoracic Surgery (GEVATS); Bolufer, Sergio; Congregado, Miguel; Jimenez, Marcelo F; Aguinagalde, Borja; Amor-Alonso, Sergio; Arrarás, Miguel Jesús; Orozco, Ana Isabel Blanco; Boada, Marc; Cal, Isabel; Ramos, Ángel Cilleruelo; Fernández-Martín, Elena; García-Barajas, Santiago; García-Jiménez, María Dolores; García-Prim, Jose María; Garcia-Salcedos, José Alberto; Gelbenzu-Zazpe, Juan José; Giraldo-Ospina, Carlos Fernando; Hernández, María Teresa Gómez; Hernández, Jorge; Wolf, Jennifer D Illana; Abularach, Alberto Jáuregui; Jiménez, Unai; Sanz, Iker López; Martínez-Hernández, Néstor J; Martínez-Téllez, Elisabeth; Collado, Lucía Milla; Poce, Roberto Mongil; Moradiellos-Díez, Francisco Javier; Moreno-Basalobre, Ramón; Merino, Sergio B Moreno; Quero-Valenzuela, Florencio; Ramírez-Gil, María Elena; Ramos-Izquierdo, Ricard; Rivo, Eduardo; Rodríguez-Fuster, Alberto; Rojo-Marcos, Rafael; Sanchez-Lorente, David; Moreno, Laura Sánchez; Simón, Carlos; Trujillo-Reyes, Juan Carlos; García, Cipriano López; Alfara, Juan José Fibla; Romero, Julio Sesma; Trancho, Florentino Hernando

doi:10.1093/ejcts/ezac170

Abstract

OBJECTIVES

A wide variety of predictive models of postoperative risk exists; although some are specific to thoracic surgery, none is widely used. The European Society of Thoracic Surgeons has recently updated its models of cardiopulmonary morbidity (Eurolung 1) and 30-day mortality (Eurolung 2) after anatomic lung resection. The aim of our work was to externally validate both models in a multicentre national database.

METHODS

Eurolung 1 and Eurolung 2 were externally validated by assessing calibration (calibration plot, Brier score and Hosmer–Lemeshow test) and discrimination [area under the receiver operating characteristic curve (AUC ROC)] in a national multicentre database of 2858 patients undergoing anatomic lung resection between 2016 and 2018.

RESULTS

For Eurolung 1, the calibration plot showed suboptimal overlap (slope = 0.921), with a Hosmer–Lemeshow test P-value of 0.353 and a Brier score of 0.104. In terms of discrimination, the AUC ROC for Eurolung 1 was 0.653 (95% confidence interval, 0.623–0.684). In contrast, Eurolung 2 showed good calibration (slope = 1.038), with a Hosmer–Lemeshow test P-value of 0.234 and a Brier score of 0.020. The AUC ROC for Eurolung 2 was 0.760 (95% confidence interval, 0.701–0.819).

CONCLUSIONS

The 30-day mortality score (Eurolung 2) seems to be transportable to other patients undergoing anatomic lung resection. In contrast, the postoperative cardiopulmonary morbidity score (Eurolung 1) does not seem to generalize sufficiently to new patients.

Keywords: Risk model, External validation, Morbidity, Mortality, Eurolung, Anatomic lung resection, Thoracic surgery, Generalization

INTRODUCTION

Surgical risk prediction models are an invaluable tool for assessing perioperative results, counselling patients and benchmarking [1–4].

Although there are a number of risk models for thoracic surgery, their use in clinical practice is far from widespread [5–9].

In 2017, the European Society of Thoracic Surgeons (ESTS) Database Committee published the first risk prediction models (Eurolung 1 for cardiopulmonary morbidity and Eurolung 2 for 30-day mortality) after anatomical lung resection, based on data from around 40 000 patients [5].

An update with more parsimonious models has recently been published, retaining 5 variables for Eurolung 1 [age > 70, male sex, predicted postoperative forced expiratory volume in 1 s (ppoFEV1) < 70%, extended resection and open approach] and 6 for Eurolung 2 (age > 70, male sex, ppoFEV1 < 70%, open approach, body mass index < 18.5 and pneumonectomy) [6].

If these scores are to be extensively used by the community of thoracic surgeons, they should be externally validated in a different dataset, which is the scope of our work.

MATERIALS AND METHODS

Ethics statement

The project has been approved by the respective research ethics committees of all participating centres. All patients signed a specific informed consent for the use of their clinical data for scientific purposes.

Modelling cohort

The ESTS dataset (Table 1) contained information from 82 383 anatomic lung resections performed from 2007 to 2018 in over 200 European contributing units [6], using standardized definitions of variables and outcomes agreed upon in a joint Society of Thoracic Surgeons–ESTS publication in 2015 [4].

Table 1:

Comparison of the European Society of Thoracic Surgeons and Spanish Group of Video Assisted Thoracic Surgery databases

	ESTS	GEVATS
Time period	2007–2018	2016–2018
N	82 383	2858
Age (years)	64.6 (57.6–71.2)	66 (59–72)
Male gender	53 780 (65)	2015 (70)
BMI (kg/m²)	25.1 (22.4–28.3)	26.5 (23.7–29.6)
CVD	2434 (3)	148 (5)
CAD	6725 (8.2)	253 (7.9)
Thoracotomy	61 252 (74)	1313 (46)
ppoFEV1 (%)	73 (59–87)	69 (57–82)
Extended resections	4722 (5.7)	144 (5)
CKD	4579 (5.6)	72 (2.5)
Cardiopulmonary morbidity (%)	15.7	12.4
30-Day mortality (%)	2.2	2.3

Results are expressed as median and interquartile range for numeric variables and as count and percentage of the total for categorical variables.

BMI: body mass index; CAD: coronary artery disease; CKD: chronic kidney disease; CVD: cerebrovascular disease; ESTS: European Society of Thoracic Surgeons; GEVATS: Spanish Group of Video Assisted Thoracic Surgery; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.

The following cardiopulmonary complications collected in the ESTS database were listed: respiratory failure, need for reintubation, prolonged mechanical ventilation, pneumonia, atelectasis requiring bronchoscopy, pulmonary oedema, pulmonary embolism, acute respiratory distress syndrome/acute lung injury, arrhythmia requiring treatment, acute myocardial ischaemia, acute heart failure, transient ischaemic stroke/stroke and acute kidney injury [5].

Mortality was defined as death in the hospital or within 30 days of surgery if the patient was discharged [5].

Non-anatomical lung resections or cases without information regarding mortality or perioperative complications were excluded.

Random samples from ESTS Institutional Accreditation Programme eligible units were audited.

For updating Eurolung 1 and 2 risk models, only predictors associated with the outcomes in the previous version of the model [5] were included (Table 2).

Table 2:

Variables included to update Eurolung models

Eurolung 1

Age, sex, ppoFEV1, CAD, CVD, CKD, thoracotomy and extended resection

Eurolung 2

Age, sex, ppoFEV1, CAD, CVD, pneumonectomy, thoracotomy, extended resection and BMI

BMI: body mass index; CAD: coronary artery disease; CKD: chronic kidney disease; CVD: cerebrovascular disease; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.

Missing numerical values were replaced by the mean of the non-missing values, and missing categorical values by the most frequent category. If >10% of data were missing for a certain variable, it was not used for the analysis.
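The imputation rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used to build the models: the function name `impute_column`, the use of `None` to mark a missing value and all example values are assumptions.

```python
from collections import Counter
from statistics import mean


def impute_column(values, max_missing_fraction=0.10):
    """Impute one variable, or return None if it should be discarded.

    Missing entries are marked with None (an assumption of this sketch).
    """
    missing = sum(v is None for v in values)
    if missing / len(values) > max_missing_fraction:
        return None  # more than 10% missing: variable not used
    observed = [v for v in values if v is not None]
    if all(isinstance(v, (int, float)) for v in observed):
        fill = mean(observed)  # numerical: mean of the non-missing values
    else:
        fill = Counter(observed).most_common(1)[0][0]  # categorical: mode
    return [fill if v is None else v for v in values]


# Hypothetical example values (not study data)
bmi = [24.0, None, 26.0, 28.0, 25.0, 27.0, 26.0, 24.0, 26.0, 24.0, 30.0]
approach = ["vats", "open", "vats", None, "vats", "open",
            "vats", "vats", "open", "vats", "open"]
sparse = [1.0, None, None, 2.0]  # 50% missing, so the variable is dropped

bmi_imputed = impute_column(bmi)
approach_imputed = impute_column(approach)
sparse_imputed = impute_column(sparse)
```

Single-value imputation of this kind is simple but ignores the uncertainty of the imputed values, which is one reason the 10% cap on missingness matters.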

To develop the score, multiple regression with stepwise backward selection was used, with Akaike's information criterion guiding the choice of the model that optimized goodness of fit. Predictors from the full model were sequentially eliminated based on the Wald test.

Internal validation was performed by ten-fold cross-validation, calculating discrimination and calibration parameters. Odds ratios were weighted among variables to assign a score to each predictor, and risk subgroups were defined according to their scores so that each group had a similar incidence of 30-day mortality or morbidity.

Validation cohort

As a validation cohort, we used a Spanish multicentre database from the Spanish Group of Video Assisted Thoracic Surgery (GEVATS), designed to analyse the effect of videothoracoscopic anatomical lung resections compared to an open approach on patient outcomes [10].

The GEVATS database is a prospective multicentre dataset of all anatomical lung resections performed from 20 December 2016 to 20 March 2018 in 33 participating thoracic surgery units (Table 1). Exclusion criteria were age younger than 18 years and simultaneous bilateral procedures.

We excluded those cases simultaneously submitted to the ESTS database during that time period.

The research project was approved by the local ethics committees of all the participating centres, and informed consent was obtained from the recruited patients to use their clinical data for scientific purposes.

Variable definitions were based on the Society of Thoracic Surgeons–ESTS standardization document [9]. Where there was still no consensus, we adopted the definition proposed by the ESTS Database Committee. All complications were categorized into respiratory, cardiovascular and other complications, in accordance with the Clavien–Dindo severity classification [11].

Postoperative morbidity cases were those occurring within the first 30 days or before discharge, and mortality was recorded at 90 days.

Data were randomly audited. Anonymized discharge reports from 20% of cases from every unit were compared with the database records by the scientific board of the GEVATS group. In addition, the database had filters to exclude implausible values or incompatibility between 2 or more variables.

Cases with missing values for key variables (type of resection, surgical approach and clinical status at discharge) were excluded. Data from centres with <15 reported cases or <10% of eligible patients included during the recruitment period were also excluded.

The audit process showed an 83% overall recruitment rate and a 98% data accuracy across all departments, representing ∼50% of all anatomic lung resections performed in Spain during the recruitment period [10].

Statistical analysis

To externally validate the Eurolung 1 and 2 scores, we used the published coefficients for both scores [6] to assess calibration and discrimination [11–13]:

$$\mathrm{logit}_{\mathrm{EUROLUNG\,1}} = -2.852 + 0.021 \times \mathrm{age} + 0.472 \times \mathrm{male} - 0.015 \times \mathrm{ppoFEV1} + 0.662 \times \mathrm{thoracotomy} + 0.324 \times \mathrm{extended\ resection}$$

$$\mathrm{logit}_{\mathrm{EUROLUNG\,2\_simp}} = -6.350 + 0.047 \times \mathrm{age} + 0.889 \times \mathrm{male} - 0.055 \times \mathrm{body\ mass\ index} - 0.010 \times \mathrm{ppoFEV1} + 0.892 \times \mathrm{thoracotomy} + 0.983 \times \mathrm{pneumonectomy}$$
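To make the use of these coefficients concrete, the following Python sketch converts each logit into a predicted probability with the inverse logit, p = 1 / (1 + exp(-logit)). It is an illustration only and not the Stata code used in the study. The dictionary keys, the function name `predicted_risk` and the coding of the inputs (age in years, ppoFEV1 in percent, body mass index in kg/m², binary variables as 0 or 1) are assumptions, and the example patient is hypothetical, loosely based on the GEVATS medians in Table 1.

```python
import math

# Published coefficients as quoted in the two equations above
EUROLUNG1 = {
    "intercept": -2.852, "age": 0.021, "male": 0.472, "ppo_fev1": -0.015,
    "thoracotomy": 0.662, "extended_resection": 0.324,
}
EUROLUNG2 = {
    "intercept": -6.350, "age": 0.047, "male": 0.889, "bmi": -0.055,
    "ppo_fev1": -0.010, "thoracotomy": 0.892, "pneumonectomy": 0.983,
}


def predicted_risk(coefficients, patient):
    """Return the predicted probability for one patient (inverse logit)."""
    logit = coefficients["intercept"]
    for name, coefficient in coefficients.items():
        if name != "intercept":
            logit += coefficient * patient[name]
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patient: 66-year-old man, ppoFEV1 69%, BMI 26.5,
# minimally invasive lobectomy (no thoracotomy, no extended resection)
patient = {
    "age": 66, "male": 1, "ppo_fev1": 69, "bmi": 26.5,
    "thoracotomy": 0, "extended_resection": 0, "pneumonectomy": 0,
}

morbidity_risk = predicted_risk(EUROLUNG1, patient)
mortality_risk = predicted_risk(EUROLUNG2, patient)
print(f"Eurolung 1 risk: {morbidity_risk:.3f}")
print(f"Eurolung 2 risk: {mortality_risk:.4f}")
```

Applying such a function to every patient in a validation cohort yields the predicted probabilities on which the calibration and discrimination measures are computed.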

Briefly, calibration represents the agreement between predicted and observed risks. To assess calibration, we used calibration plots, Brier score and Hosmer–Lemeshow test.

The calibration plot represents the observed frequencies versus the predicted probabilities. The ‘ideal’ situation is the diagonal line, where predicted probabilities equal observed probabilities [12–14]. The slope of the curve estimates the extremeness of the predicted probabilities: a slope below 1 indicates predictions that are too extreme, and a slope above 1 predictions that are too moderate.
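As an illustration of how a calibration slope can be estimated, the sketch below regresses the outcome on the model's linear predictor (the logit) by logistic regression, using Newton-Raphson iterations on grouped data. This is a common approach to the calibration slope and is assumed here; the source does not describe the exact estimation routine used in Stata. The function name, the grouped-data layout and the synthetic data are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def calibration_fit(linear_predictors, events, totals,
                    iterations=50, tol=1e-10):
    """Fit logit(P(event)) = a + b * lp; b is the calibration slope.

    Each group has a linear predictor, a number of events and a size.
    Assumes the linear predictors are not all identical.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for lp, y, n in zip(linear_predictors, events, totals):
            p = sigmoid(a + b * lp)
            residual = y - n * p          # score contribution
            weight = n * p * (1.0 - p)    # information contribution
            g0 += residual
            g1 += residual * lp
            h00 += weight
            h01 += weight * lp
            h11 += weight * lp * lp
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a += step_a
        b += step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return a, b


# Synthetic data: the true risks follow half the predicted logit, so the
# predictions are too extreme and the fitted slope should be about 0.5.
lps = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5]
totals = [100] * len(lps)
events = [n * sigmoid(0.5 * lp) for lp, n in zip(lps, totals)]

intercept, slope = calibration_fit(lps, events, totals)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

With individual patient data, each patient forms a group of size 1 with 0 or 1 events.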

The Brier score is the mean squared difference between the observed outcome (coded 0 or 1) and the predicted probability of that outcome, and ranges between 0 and 1. A perfect forecaster would have a Brier score of 0 and a perfectly wrong forecaster a Brier score of 1 [12–14].
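A minimal Python sketch of the Brier score follows, for illustration only. The helper `reference_brier` is an addition of this sketch and not part of the source: it gives the score of a non-informative model that assigns the overall event rate to every patient, which equals prevalence × (1 − prevalence) and offers a rough yardstick when the outcome is rare.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predictions and 0/1 outcomes."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - y) ** 2 for p, y in zip(predicted, observed))
    return total / len(predicted)


def reference_brier(prevalence):
    """Brier score of a model predicting the prevalence for everyone."""
    return prevalence * (1.0 - prevalence)


# Toy values (not study data)
perfect = brier_score([1.0, 0.0], [1, 0])
worst = brier_score([0.0, 1.0], [1, 0])
toy = brier_score([0.1, 0.2, 0.8], [0, 0, 1])

# Event rates taken from Table 1 (GEVATS column)
morbidity_reference = reference_brier(0.124)
mortality_reference = reference_brier(0.023)
print(perfect, worst, round(toy, 3))
print(round(morbidity_reference, 4), round(mortality_reference, 4))
```

Because the reference value shrinks with the event rate, a low Brier score for a rare outcome such as 30-day mortality does not by itself demonstrate good performance.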

The Hosmer–Lemeshow test divides the cohort into deciles of predicted risk and compares observed with expected event rates; a significant difference (P < 0.05) indicates lack of fit [12–14].
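The test statistic can be reproduced approximately from decile counts such as those in Table 3. The sketch below is illustrative only: it sums (observed − expected)² / expected over events and non-events in each decile and refers the result to a chi-square distribution. The choice of 8 degrees of freedom (number of groups minus 2) is an assumption, since the degrees of freedom used are not reported, and the expected counts are rounded to one decimal, so the result can only approximate the reported P-value of 0.353 for Eurolung 1.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability, closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(groups, df):
    """groups: (observed events, expected events, total patients)."""
    statistic = 0.0
    for observed, expected, total in groups:
        observed_non = total - observed
        expected_non = total - expected
        statistic += (observed - expected) ** 2 / expected
        statistic += (observed_non - expected_non) ** 2 / expected_non
    return statistic, chi2_upper_tail_even_df(statistic, df)


# Eurolung 1 decile counts transcribed from Table 3
TABLE3 = [
    (11, 13.5, 282), (20, 19.3, 282), (23, 23.5, 282), (21, 28.0, 281),
    (35, 32.1, 282), (29, 36.5, 282), (38, 41.6, 281), (47, 49.1, 283),
    (53, 60.6, 281), (69, 81.3, 281),
]

# df = 8 is an assumption (groups minus 2)
hl_statistic, hl_p_value = hosmer_lemeshow(TABLE3, df=8)
print(f"chi-square = {hl_statistic:.2f}, P = {hl_p_value:.3f}")
```

The statistic aggregates squared deviations, so it carries no information on whether risk is over- or underestimated.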

In addition, we assessed the discrimination of both risk models, that is, the ability of a model to separate individuals with and without the outcome: those with the outcome event should have a higher predicted risk than those without it. For this purpose, we used the area under the receiver operating characteristic curve (AUC ROC), with values above 0.7 considered desirable [12–14].
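The AUC ROC has a direct interpretation that a short sketch makes explicit: it is the proportion of all (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. The Python function below is an illustration under that definition, not the routine used in the study, and the toy values are hypothetical.

```python
def auc_roc(predicted, observed):
    """AUC as the proportion of concordant (event, non-event) pairs."""
    with_event = [p for p, y in zip(predicted, observed) if y == 1]
    without_event = [p for p, y in zip(predicted, observed) if y == 0]
    if not with_event or not without_event:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for risk_event in with_event:
        for risk_no_event in without_event:
            if risk_event > risk_no_event:
                concordant += 1.0
            elif risk_event == risk_no_event:
                concordant += 0.5  # ties count as half
    return concordant / (len(with_event) * len(without_event))


# Toy values (not study data): 3 of the 4 pairs are correctly ordered
toy_auc = auc_roc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
no_information = auc_roc([0.2, 0.2, 0.2, 0.2], [0, 1, 0, 1])
print(toy_auc, no_information)
```

The pairwise loop is quadratic in cohort size, which is acceptable for a few thousand patients; rank-based formulas give the same value more efficiently.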

All statistical analyses were performed using Stata/IC v.16 (StataCorp. 2019. Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC).

RESULTS

Eurolung 1 and 2 observed and predicted outcomes in the validation dataset are shown in Tables 3 and 4, respectively.

Table 3:

Eurolung 1 observed and predicted outcomes in the validation cohort

Deciles	Probability	Observed events	Expected events	Observed no events	Expected no events	Total patients
1	0.0593	11	13.5	271	268.5	282
2	0.0759	20	19.3	262	262.7	282
3	0.0912	23	23.5	259	258.5	282
4	0.1069	21	28.0	260	253.0	281
5	0.1207	35	32.1	247	249.9	282
6	0.1383	29	36.5	253	245.5	282
7	0.1585	38	41.6	243	239.4	281
8	0.1901	47	49.1	236	233.9	283
9	0.2431	53	60.6	228	220.4	281
10	0.4197	69	81.3	212	199.7	281

Table 4:

Eurolung 2 observed and predicted outcomes in the validation cohort

Deciles	Probability	Observed events	Expected events	Observed no events	Expected no events	Total patients
1	0.0033	1	0.6	274	274.4	275
2	0.0050	1	1.1	274	273.9	275
3	0.0069	3	1.6	272	273.4	275
4	0.0089	2	2.2	273	272.8	275
5	0.0110	3	2.7	271	271.3	274
6	0.0138	4	3.4	271	271.6	275
7	0.0171	4	4.2	271	270.8	275
8	0.0238	10	5.5	265	269.5	275
9	0.0381	14	8.3	261	266.7	275
10	0.1680	22	17.9	252	256.1	274
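The decile counts in Tables 3 and 4 also allow a rough check of the overall direction of miscalibration through the ratio between expected and observed events (E:O). The Python sketch below sums the tabulated counts; it is illustrative only, uses expected counts rounded to one decimal and may therefore differ slightly from the E:O values displayed in the calibration plots. A ratio above 1 means that more events were predicted than observed.

```python
# (observed events, expected events) per decile, transcribed from Table 3
EUROLUNG1_DECILES = [
    (11, 13.5), (20, 19.3), (23, 23.5), (21, 28.0), (35, 32.1),
    (29, 36.5), (38, 41.6), (47, 49.1), (53, 60.6), (69, 81.3),
]
# (observed events, expected events) per decile, transcribed from Table 4
EUROLUNG2_DECILES = [
    (1, 0.6), (1, 1.1), (3, 1.6), (2, 2.2), (3, 2.7),
    (4, 3.4), (4, 4.2), (10, 5.5), (14, 8.3), (22, 17.9),
]


def expected_to_observed(deciles):
    """Ratio of total expected events to total observed events."""
    observed = sum(o for o, _ in deciles)
    expected = sum(e for _, e in deciles)
    return expected / observed


eo_eurolung1 = expected_to_observed(EUROLUNG1_DECILES)
eo_eurolung2 = expected_to_observed(EUROLUNG2_DECILES)
print(f"Eurolung 1 E:O = {eo_eurolung1:.3f}")
print(f"Eurolung 2 E:O = {eo_eurolung2:.3f}")
```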

In terms of calibration, for Eurolung 1, the calibration plot showed limited overlap with the ideal line (Fig. 1), with a slope of 0.921 and a Brier score of 0.104. The Hosmer–Lemeshow test had a P-value of 0.353. The AUC ROC for Eurolung 1 was 0.653 (95% confidence interval, 0.623–0.684) (Fig. 2).

Figure 1:

Calibration plot. Eurolung 1. Calibration in the large measures whether the predicted prevalence is less than (calibration in the large < 0) or greater than (calibration in the large > 0) the observed prevalence. AUC: area under the curve; CITL: calibration in the large; E:O: ratio between expected and observed events.

Figure 2:

Discrimination. Eurolung 1. ROC: receiver operating characteristics.

In contrast, the slope of the calibration plot for Eurolung 2 was 1.038, with an acceptable overlap, and the Brier score was 0.020. The Hosmer–Lemeshow test for Eurolung 2 had a P-value of 0.234 (Fig. 3). Moreover, the AUC ROC for Eurolung 2 was 0.760 (95% confidence interval, 0.701–0.819) (Fig. 4).

Figure 3:

Calibration plot. Eurolung 2. Calibration in the large measures whether the predicted prevalence is less than (calibration in the large < 0) or greater than (calibration in the large > 0) the observed prevalence. AUC: area under the curve; CITL: calibration in the large; E:O: ratio between expected and observed events.

Figure 4:

Discrimination. Eurolung 2.

DISCUSSION

Surgical risk models are very useful for estimating the probability that a given patient will develop complications or die postoperatively. This information is intended to better inform our patients about the expected success of treatment, but it can also be used for comparisons or benchmarking between surgical care units.

There are different strategies for validating a surgical risk score, i.e. for assessing the model's ability to accurately predict what it is intended to predict: internal validation, temporal validation and external validation. The first reuses data from the sample on which the score was developed (data splitting, cross-validation or bootstrapping), the second uses a new prospective cohort from the same centre(s) where the model was developed and the last uses data from a cohort of patients from centres other than those where the model was developed. External validation is the preferred way of assessing the generalizability of a surgical risk model [15].

External validation of the mortality risk model (Eurolung 2) yielded favourable results. In the calibration plot (Fig. 3), Eurolung 2 underestimates risk at the highest predicted probabilities (the curve diverges upwards from the dotted line), starting from the last decile, because the prevalence of the event is very low in the sample. However, the values of the statistical calibration tests and the discrimination parameters are good (Fig. 4). Overall, we can say that the external validation of Eurolung 2 is satisfactory.

In the case of the cardiopulmonary morbidity predictor model, the results are not so positive.

In the calibration assessment, the model overestimates risk (the curve is below the dotted line) (Fig. 1) and the Brier score is suboptimal. The Hosmer–Lemeshow test was not statistically significant, but this test has been criticized for its limited statistical power to detect poor calibration and for being oversensitive in large samples [12, 16–18]. Moreover, it gives no information on the direction or magnitude of any miscalibration. For these reasons, it is important to assess model performance with a variety of indicators. In this sense, the AUC ROC also falls below acceptable values (Fig. 2).

There are several possible explanations for the low generalizability of Eurolung 1.

The ESTS database was voluntary and not prospective, so there may be a non-negligible amount of missing data, which are usually not missing at random. For the same reason, patients may have been selectively included in the database (selection bias), for example by omitting patients who developed a perioperative complication or who underwent a particular surgical approach, and the sample may therefore have been biased [19].

In the same way, the evaluator of the outcome occurrence was not blinded to the determination of the predictor, so there was a risk of using predictors in the assessment of outcomes (incorporation bias). Although this is less important when assessing hard outcomes such as mortality, it is of the utmost importance if there is a certain level of subjective interpretation [20, 21], such as is the case when collecting morbidity information.

Only a minority of the data in the ESTS database were audited [6]; so even though the selected variables and outcomes were clearly defined [4], some coding or data incorporation errors may have occurred.

Importantly, not all key predictor variables for postoperative cardiopulmonary morbidity were included in the model. The absence of predicted postoperative diffusing capacity of carbon monoxide in the ESTS database, an established predictor of postoperative morbidity and mortality [22, 23], is notable, but other potential factors contributing to the occurrence of postoperative complications, such as albumin levels, prophylactic anticoagulation or the use of antiarrhythmic drugs, were also not recorded in the dataset.

Furthermore, differences in case mix between the modelling and validation cohorts may also have affected these results. The percentage of patients operated on by minimally invasive approaches was significantly higher in the validation cohort (54% vs 26%), and the modelling cohort also covered a significantly longer period of time, during which enhanced recovery after surgery protocols and refinements in both anaesthetic and surgical techniques have been widely implemented [24].

This work has certain limitations.

The GEVATS database was also a voluntary database, therefore subject to some of the biases that the modelling database faces. However, it has undergone a certain degree of auditing and represents a homogeneous sample of ∼50% of all patients undergoing anatomical lung resection in Spain in a very short period of time [10]. We therefore consider it a valuable cohort of patients to conduct this type of study.

Both samples are quite similar in terms of demographic variables and patient complexity, yet they were designed to analyse different outcomes and cover time spans of different lengths. Moreover, as mentioned previously, the proportion of minimally invasive approaches is considerably higher in the validation cohort [19]. All of these factors may have affected the results.

Even though the same variable definitions have been used in both datasets, the morbidity-related outcome variables are prone to subjectivity to some degree, which may also be affecting the reported results [25].

The validation cohort size may also affect the results. A low prevalence of individuals with the outcome relative to the number of variables analysed implies a risk of overfitting, and although no clear sample size considerations have been established for external validation studies, a recommendation for a minimum of 100 events and non-events in the validation cohort has been reported [26]. In this case, whilst morbidity and mortality in the GEVATS sample are low, numbers are well above this threshold.

CONCLUSION

In this validation dataset, Eurolung 2 achieved acceptable generalizability parameters, but for Eurolung 1, results did not provide sufficient evidence for generalizability. We therefore suggest a readjustment, update or recalibration of the perioperative cardiopulmonary risk prediction model to improve the applicability to other populations.

Presented at the 29th European Conference on General Thoracic Surgery, 20–22 June 2021. Virtual meeting.

ACKNOWLEDGEMENTS

We thank Johnson & Johnson for their collaboration in the development of the Spanish VATS Group. We also thank all those who are responsible for the clinical documentation services of each hospital for actively participating in the audit of the dataset. The GEVATS members are as follows: Sergio Bolufer (Servicio de Cirugía Torácica, Hospital General Universitario de Alicante, Alicante), Miguel Congregado (Servicio de Cirugía Torácica, Hospital Universitario Virgen Macarena, Sevilla), Marcelo F. Jimenez (Servicio de Cirugía Torácica, Hospital Universitario de Salamanca, Universidad de Salamanca, IBSAL, Salamanca), Borja Aguinagalde (Servicio de Cirugía Torácica, Hospital Universitario de Donostia, San Sebastián-Donostia), Sergio Amor-Alonso (Servicio de Cirugía Torácica, Hospital Universitario Quironsalud Madrid, Madrid), Miguel Jesús Arrarás (Servicio de Cirugía Torácica, Fundación Instituto Valenciano de Oncología, Valencia), Ana Isabel Blanco Orozco (Servicio de Cirugía Torácica, Hospital Universitario Virgen del Rocío, Sevilla), Marc Boada (Servicio de Cirugía Torácica, Hospital Clinic de Barcelona, Instituto Respiratorio, Universidad de Barcelona, Barcelona), Isabel Cal (Servicio de Cirugía Torácica, Hospital Universitario La Princesa, Madrid), Ángel Cilleruelo Ramos (Servicio de Cirugía Torácica, Hospital Clínico Universitario, Valladolid), Elena Fernández-Martín (Servicio de Cirugía Torácica, Hospital Clínico San Carlos, Madrid), Santiago García-Barajas (Servicio de Cirugía Torácica, Hospital Universitario de Badajoz, Badajoz), María Dolores García-Jiménez (Servicio de Cirugía Torácica, Hospital Universitario de Albacete, Albacete), Jose María García-Prim (Servicio de Cirugía Torácica, Hospital Universitario Santiago de Compostela, Santiago de Compostela), José Alberto Garcia-Salcedos (Servicio de Cirugía Torácica, Hospital Universitario 12 de Octubre, Madrid), Juan José Gelbenzu-Zazpe (Servicio de Cirugía Torácica, Complejo Hospitalario de Navarra, 
Pamplona), Carlos Fernando Giraldo-Ospina (Servicio de Cirugía Torácica, Hospital Regional Universitario, Málaga), María Teresa Gómez Hernández (Servicio de Cirugía Torácica, Hospital Universitario de Salamanca, Universidad de Salamanca, IBSAL, Salamanca), Jorge Hernández (Servicio de Cirugía Torácica, Hospital Universitario Sagrat Cor, Barcelona), Jennifer D. Illana Wolf (Servicio de Cirugía Torácica, Hospital Puerta del Mar, Cádiz), Alberto Jáuregui Abularach (Servicio de Cirugía Torácica, Hospital Universitario Vall d’Hebron, Barcelona), Unai Jiménez (Servicio de Cirugía Torácica, Hospital Universitario Cruces, Bilbao), Iker López Sanz (Servicio de Cirugía Torácica, Hospital Universitario de Donostia, San Sebastián-Donostia), Néstor J. Martínez-Hernández (Servicio de Cirugía Torácica, Hospital Universitario La Ribera, Alcira, Valencia), Elisabeth Martínez-Téllez (Servicio de Cirugía Torácica, Hospital Santa Creu y Sant Pau, Universidad Autónoma de Barcelona, Barcelona), Lucía Milla Collado (Servicio de Cirugía Torácica, Hospital Arnau de Vilanova, Lleida), Roberto Mongil Poce (Servicio de Cirugía Torácica, Hospital Regional Universitario, Málaga), Francisco Javier Moradiellos-Díez (Servicio de Cirugía Torácica, Hospital Universitario Quironsalud Madrid, Madrid), Ramón Moreno-Basalobre (Servicio de Cirugía Torácica, Hospital Universitario La Princesa, Madrid), Sergio B. 
Moreno Merino (Servicio de Cirugía Torácica, Hospital Universitario Virgen Macarena, Sevilla), Florencio Quero-Valenzuela (Servicio de Cirugía Torácica, Hospital Virgen de las Nieves, Granada), María Elena Ramírez-Gil (Servicio de Cirugía Torácica, Complejo Hospitalario de Navarra, Pamplona), Ricard Ramos-Izquierdo (Servicio de Cirugía Torácica, Hospital Universitario de Bellvitge, Hospitalet de Llobregat, Barcelona), Eduardo Rivo (Servicio de Cirugía Torácica, Hospital Universitario Santiago de Compostela, Santiago de Compostela), Alberto Rodríguez-Fuster (Servicio de Cirugía Torácica, Hospital del Mar. IMIM (Instituto de Investigación Médica Hospital del Mar), Barcelona), Rafael Rojo-Marcos (Servicio de Cirugía Torácica, Hospital Universitario Cruces, Bilbao), David Sanchez-Lorente (Servicio de Cirugía Torácica, Hospital Clinic de Barcelona, Instituto Respiratorio, Universidad de Barcelona, Barcelona), Laura Sánchez Moreno (Servicio de Cirugía Torácica, Hospital Universitario Marqués de Valdecilla, Santander), Carlos Simón (Servicio de Cirugía Torácica, Hospital Universitario Gregorio Marañón, Madrid), Juan Carlos Trujillo-Reyes (Servicio de Cirugía Torácica, Hospital Santa Creu y Sant Pau, Universidad Autónoma de Barcelona, Barcelona), Cipriano López García (Servicio de Cirugía Torácica, Hospital Universitario de Badajoz, Badajoz), Juan José Fibla Alfara (Servicio de Cirugía Torácica, Hospital Universitario Sagrat Cor, Barcelona), Julio Sesma Romero (Servicio de Cirugía Torácica, Hospital General Universitario de Alicante, Alicante) and Florentino Hernando Trancho (Servicio de Cirugía Torácica, Hospital Clínico San Carlos, Madrid).

Funding

All costs related to the start-up and maintenance of the GEVATS database were covered by Ethicon, Johnson & Johnson. The authors had freedom of investigation and full control of the design of the study, methods used, outcome parameters and results, data analysis and production of the written report. The GEVATS was awarded a grant from the Spanish Society of Thoracic Surgery as the best national research project of 2015.

Conflict of interest: none declared.

Data Availability Statement

All relevant data are within the manuscript and its supporting information files.

Author contributions

David Gómez de Antonio: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Supervision; Validation; Writing – original draft; Writing – review & editing. Silvana Crowley Carrasco: Writing – review & editing. Alejandra Romero Román: Writing – review & editing. Ana Royuela: Data curation; Formal analysis; Methodology; Writing – review & editing. Mariana Gil Barturen: Writing – review & editing. Carme Obiols: Writing – review & editing. Sergi Call: Writing – review & editing. Raúl Embún: Writing – review & editing. Ínigo Royo: Writing – review & editing. José Luis Recuero: Writing – review & editing. Alberto Cabanero: Writing – review & editing. Nicolás Moreno: Writing – review & editing. Sergio Bolufer: Data collection. Miguel Congregado: Data collection. Marcelo F. Jimenez: Data collection. Borja Aguinagalde: Data collection. Sergio Amor: Data collection. Miguel Jesús Arrarás: Data collection. Ana Isabel Blanco Orozco: Data collection. Marc Boada: Data collection. Isabel Cal: Data collection. Ángel Cilleruelo Ramos: Data collection. Elena Fernández-Martín: Data collection. Santiago García-Barajas: Data collection. María Dolores García-Jiménez: Data collection. Jose María García-Prim: Data collection. José Alberto Garcia-Salcedos: Data collection. Juan José Gelbenzu-Zazpe: Data collection. Carlos Fernando Giraldo-Ospina: Data collection. María Teresa Gómez Hernández: Data collection. Jorge Hernández: Data collection. Jennifer D. Illana Wolf: Data collection. Alberto Jáuregui Abularach: Data collection. Unai Jiménez: Data collection. Iker López Sanz: Data collection. Néstor J. Martínez-Hernández: Data collection. Elisabeth Martínez-Téllez: Data collection. Lucía Milla Collado: Data collection. Roberto Mongil Poce: Data collection. Francisco Javier Moradiellos-Díez: Data collection. Ramón Moreno-Basalobre: Data collection. Sergio B. Moreno Merino: Data collection. Florencio Quero-Valenzuela: Data collection. María Elena Ramírez-Gil: Data collection. Ricard Ramos-Izquierdo: Data collection. Eduardo Rivo: Data collection. Alberto Rodríguez-Fuster: Data collection. Rafael Rojo-Marcos: Data collection. David Sanchez-Lorente: Data collection. Laura Sánchez Moreno: Data collection. Carlos Simón: Data collection. Juan Carlos Trujillo-Reyes: Data collection. Cipriano López García: Data collection. Juan José Fibla Alfara: Data collection. Julio Sesma Romero: Data collection. Florentino Hernando Trancho: Data collection.

Reviewer information

European Journal of Cardio-Thoracic Surgery thanks Alessandro Brunelli, Alexander Wahba and the other, anonymous reviewer(s) for their contribution to the peer review process of this article.

ABBREVIATIONS

AUC ROC
Area under the receiver operating characteristic curve

ESTS
European Society of Thoracic Surgeons

GEVATS
Spanish Group of Video-assisted Thoracic Surgery

ppoFEV1
Predicted postoperative forced expiratory volume in 1 s

Author notes

† GEVATS members are listed in the Acknowledgements section.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
