Abstract

OBJECTIVES

To develop a simplified version of the Eurolung risk model to predict cardiopulmonary morbidity and 30-day mortality after lung resection from the ESTS database.

METHODS

A total of 82 383 lung resections (63 681 lobectomies, 3617 bilobectomies, 7667 pneumonectomies and 7418 segmentectomies) recorded in the ESTS database (January 2007–December 2018) were analysed. Multiple imputation with chained equations was performed on the predictors included in the original Eurolung models. Stepwise selection was then applied to determine the best logistic model. To develop the parsimonious models, different models were tested by eliminating variables one by one, starting from the least significant. The models’ predictive power was evaluated by estimating the area under the curve (AUC) with 10-fold cross-validation.

RESULTS

Cardiopulmonary morbidity model (Eurolung1): the best parsimonious Eurolung1 model contains 5 variables. The logit of the parsimonious Eurolung1 model was as follows: −2.852 + 0.021 × age + 0.472 × male − 0.015 × ppoFEV1 + 0.662 × thoracotomy + 0.324 × extended resection. The pooled AUC was 0.710 [95% confidence interval (CI) 0.677–0.743]. Mortality model (Eurolung2): the best parsimonious model contains 6 variables. The logit of the parsimonious Eurolung2 model was as follows: −6.350 + 0.047 × age + 0.889 × male − 0.055 × BMI − 0.010 × ppoFEV1 + 0.892 × thoracotomy + 0.983 × pneumonectomy. The pooled AUC was 0.737 (95% CI 0.702–0.770). An aggregate parsimonious Eurolung2 was also generated by repeating the logistic regression after categorization of the numeric variables. Patients were grouped into 7 risk classes showing incremental risk of mortality (P < 0.0001).

CONCLUSIONS

We were able to develop simplified and updated versions of the Eurolung risk models retaining the predictive ability of the full original models. They represent a more user-friendly tool designed to inform the multidisciplinary discussion and shared decision-making process of lung resection candidates.

INTRODUCTION

Eurolung risk models were created from the ESTS database in 2016 to update the former risk models used to adjust morbidity and mortality for quality improvement initiatives [1].

These models were based on a population of approximately 48 000 anatomic lung resections registered in the ESTS database from July 2007 to August 2015. After their publication, the morbidity and mortality models were implemented in the ESTS database to calculate the Composite Performance Score (CPS), a tool used to assess eligibility for the ESTS Institutional Accreditation programme [2–4].

One of the main disadvantages of the original Eurolung models is the large number of variables used in the equations: Eurolung1 contains 8 variables and Eurolung2 contains 9. A large number of variables may hamper the completeness of the predicted morbidity and mortality variables that are essential to calculate the CPS, thus limiting the accessibility of the accreditation programme. In fact, in 2017 only 25% of units had a completeness rate >70% in the variables used to calculate the Eurolung models. In addition, a large number of variables reduces the efficiency of the model by increasing the time needed to fill the required fields, and may reduce data quality by increasing the chance of errors.

Ideally, more parsimonious versions of the models, able to retain at least the same discrimination and calibration as the original ones, should be made available. Therefore, the objective of this analysis was to explore the possibility of developing simplified versions of the Eurolung risk models for cardiopulmonary morbidity and 30-day mortality containing the lowest possible number of risk factors that would ensure at least the same predictive ability as the original full models.

PATIENTS AND METHODS

This is a retrospective analysis of 82 383 anatomic lung resections (63 681 lobectomies, 3617 bilobectomies, 7667 pneumonectomies and 7418 segmentectomies) recorded in the ESTS database from January 2007 to December 2018. The study was reviewed by the Research and Innovation Department of the principal investigator’s hospital and classified as a service evaluation not requiring review by an NHS Research Ethics Committee.

Patients undergoing wedge resections or those with incomplete mortality or complications data were excluded from the analysis.

The ESTS database continues to be a voluntary database collecting data pertaining to all general thoracic surgery procedures. The online version was launched in July 2007 and participation is free for all ESTS members. Data can be input online through the web platform or harvested yearly from existing institutional databases [5].

The ESTS database currently collects data and information from more than 250 European units (http://www.ests.org/collaboration/database_reports.aspx).

Variables and outcomes (including morbidity) are standardized according to the joint STS-ESTS definitions [6].

Only a minority of the data in the ESTS database are formally audited. In particular, only random samples of data from those units that are eligible for the ESTS Institutional Accreditation programme are subject to external audit to assess their quality.

Statistical analysis

As in the previous version of the Eurolung models, the following cardiopulmonary complications listed in the ESTS database were included in the outcome variable: respiratory failure, need for reintubation, prolonged mechanical ventilation >24 h, pneumonia, atelectasis requiring bronchoscopy, pulmonary oedema, pulmonary embolism, ARDS/ALI, arrhythmia requiring treatment, acute myocardial ischaemia, acute cardiac failure, stroke/TIA and acute kidney injury.

Mortality was defined as death in hospital or, if the patient was discharged, within 30 days of the operation.

Because this study aimed to update and simplify the previous version of the Eurolung models, only the predictors previously selected in Eurolung1 and Eurolung2 were considered when developing the models [1].

In particular, to update the Eurolung1 model (morbidity) the following variables were tested: age, sex, predicted postoperative forced expiratory volume in 1 s (ppoFEV1), presence of coronary artery disease (CAD), cerebrovascular disease (CVD), chronic kidney disease (CKD), thoracotomy (as opposed to minimally invasive surgery) and extended resection (associated with chest wall, Pancoast tumours, atrial or superior vena cava (SVC) resection, diaphragm resection, vertebral resection, pleuropneumonectomy, sleeve pneumonectomies and intrapericardial pneumonectomy).

For updating the Eurolung2 model (mortality), the following variables were tested: age, sex, ppoFEV1, CAD, CVD, pneumonectomy (as opposed to lesser resections), thoracotomy, extended resections (associated with chest wall, Pancoast tumours, atrial or SVC resection, diaphragm resection, vertebral resection, pleuropneumonectomy, sleeve pneumonectomies, intrapericardial pneumonectomy) and body mass index (BMI).

Predicted postoperative FEV1 was calculated in a standardized way by inputting into the database the number of functioning/ventilating segments removed during the operation.
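The text does not give the formula behind this calculation. As a minimal sketch, the Python function below assumes the widely used segment-counting formula, ppoFEV1 = preoperative FEV1 × (1 − functioning segments removed / functioning segments before surgery), with a default of 19 functioning segments. The formula, the default of 19, and the function and parameter names are assumptions made for illustration; they are not taken from the ESTS database specification.

```python
def ppo_fev1(preop_fev1_pct, segments_removed, functioning_segments=19):
    """Predicted postoperative FEV1 (% of predicted) by segment counting.

    Assumption (not stated in the text): ppoFEV1 = preoperative FEV1
    x (1 - functioning segments removed / functioning segments present).
    """
    if functioning_segments <= 0:
        raise ValueError("functioning_segments must be positive")
    if not 0 <= segments_removed <= functioning_segments:
        raise ValueError("segments_removed must lie between 0 and functioning_segments")
    return preop_fev1_pct * (1 - segments_removed / functioning_segments)


# Hypothetical patient: preoperative FEV1 of 80% of predicted,
# resection removing 5 functioning segments.
print(round(ppo_fev1(80.0, 5), 1))
```

Passing the number of functioning segments explicitly allows for patients in whom some segments were already non-ventilating before surgery.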

The characteristics of the patients were summarized as median and interquartile range for continuous variables, while categorical covariates were described as frequency counts and proportion percentages.

The analysis was carried out by logistic regression, using multiple imputation by chained equations to impute missing data in the predictive covariates [7]. Fifty imputed data sets were used, with pooled estimates obtained with Rubin’s rules [8]. Stepwise backward selection based on the Akaike information criterion was performed to identify the best model [9].
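Pooling across imputed data sets can be sketched as follows. Under Rubin’s rules, the pooled coefficient is the mean of the per-imputation estimates, and its variance is the average within-imputation variance plus the between-imputation variance inflated by (1 + 1/m). The function name and the three example values are hypothetical and serve only to illustrate the arithmetic; the study itself pooled 50 imputed data sets in R.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: the coefficient estimated in each imputed data set
    variances: the squared standard error of that coefficient in each data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates of one coefficient from three imputed data sets.
estimate, total_variance = pool_rubin([0.45, 0.47, 0.50], [0.0004, 0.0005, 0.0006])
print(estimate, total_variance ** 0.5)  # pooled coefficient and its standard error
```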

As the purpose of this analysis was to obtain parsimonious models for both morbidity and mortality, reduced models were evaluated to find those that did not lose predictive performance. The Wald test [10] on each predictor of the full model guided the choice of which predictor to eliminate.

The 10-fold cross-validation technique [11, 12] was applied for internal validation of the model performance in terms of discrimination and calibration [13]. The pooled area under the receiver operating characteristic curve [14, 15] was estimated to evaluate discrimination between patients with and without the outcome event. Plots of observed and predicted event rates for decile groups of patients were used to assess calibration (Supplementary Material, Figs S1 and S2).
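The validation scheme can be illustrated with a short Python sketch. It computes the AUC as the probability that a randomly chosen patient with the event receives a higher predicted risk than one without it, and averages that value over k held-out folds. How the fold estimates were pooled in the study is not stated, so the simple mean used here is an assumption, as are all function names and the `fit` callback, which stands in for refitting the logistic model on each training set.

```python
import random


def auc(y_true, scores):
    """Rank-based AUC; ties between an event and a non-event count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the row indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(X, y, fit, k=10, seed=0):
    """Mean AUC over k held-out folds.

    fit(X_train, y_train) must return a function that maps one row of X
    to a predicted risk; it is refitted on every training set.
    """
    fold_aucs = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_aucs.append(auc([y[i] for i in test_idx],
                             [predict(X[i]) for i in test_idx]))
    return sum(fold_aucs) / k
```

The pairwise AUC above is quadratic in the number of patients, which is adequate for a sketch; a rank-sum implementation would be preferable for a data set of this size.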

The variables selected for the parsimonious mortality model were then used to construct an aggregate model according to the methodology previously described in the original Eurolung study [1]. For this purpose, a threshold effect was searched for numeric variables by using receiver operating characteristic (ROC) analysis and numeric variables were categorized based on the best cut-off value.

A score was assigned to each variable in the model by proportionally weighting the odds ratios and assigning 1 point to the smallest one. A total score was generated for each patient by adding the individual points assigned to each variable. The patients were finally grouped in risk classes according to their scores and similar incidence of morbidity or mortality within the group.

Analyses for model development and validation were performed using R version 3.3.3 (R Core Team, 2017). The aggregate score was developed using Stata 15.0 statistical software (Stata Corp., College Station, TX, USA).

RESULTS

The characteristics of the patients included in the analysis are shown in Table 1.

Table 1: Characteristics of the patients

Variable               Overall (n = 82 383)
Age (years)            64.6 (57.6–71.2)
Male gender            53 780 (65%)
BMI (kg/m²)            25.1 (22.4–28.3)
CVD                    2434 (3%)
CAD                    6725 (8.2%)
Thoracotomy            61 252 (74%)
ppoFEV1 (%)            73 (59–87)
Extended resections    4722 (5.7%)
CKD                    4579 (5.6%)

Results are expressed as median and IQR for numeric variables and as count and percentage of the total for categorical variables.

BMI: body mass index; CAD: coronary artery disease; CKD: chronic kidney disease; CVD: cerebrovascular disease; IQR: interquartile range; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.

Major cardiopulmonary complications occurred in 12 955 patients (15.7%) and in-hospital or 30-day mortality in 1851 patients (2.2%).

Mortality was 6.3% after pneumonectomy, 1.8% after lobectomy, 2.2% after bilobectomy and 1.2% after segmentectomy.

Overall 4722 anatomic resections were extended operations with a mortality of 4.0% versus 2.1% in non-extended operations (P < 0.0001).

Analysis of morbidity

The results of the univariable logistic regression analysis including the variables used in the original Eurolung1 model are reported in Supplementary Material, Table S1. All variables remained associated with morbidity in this new updated data set of patients.

When all variables were entered in the multivariable logistic regression analysis, the logit of the full model was as follows: −2.821 + 0.020 × age + 0.455 × male − 0.015 × ppoFEV1 + 0.206 × CAD + 0.205 × CVD + 0.660 × thoracotomy + 0.332 × extended resection + 0.214 × CKD.

The best parsimonious model obtained after variable elimination was one containing 5 variables. The logit of the parsimonious Eurolung1 model was as follows: −2.852 + 0.021 × age + 0.472 × male − 0.015 × ppoFEV1 + 0.662 × thoracotomy + 0.324 × extended resection. The results of the logistic regression are shown in Table 2.

Table 2: Results of the multivariable logistic regression analysis displaying the best parsimonious model (dependent variable: cardiopulmonary morbidity)

Covariate             Coefficient estimate   Odds ratio   95% CI         P-value
(Intercept)           −2.852                 0.058        0.049–0.067    <0.001
Age                   0.021                  1.022        1.019–1.023    <0.001
Male gender           0.472                  1.602        1.533–1.677    <0.001
ppoFEV1               −0.015                 0.985        0.983–0.987    <0.001
Thoracotomy           0.662                  1.938        1.842–2.04     <0.001
Extended resection    0.324                  1.383        1.288–1.484    <0.001

CI: confidence interval; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.
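To show how the published logit translates into an individual risk estimate, the following Python sketch applies the inverse logit, p = 1 / (1 + e^(−logit)), to the parsimonious Eurolung1 coefficients reported in Table 2. The function name, the dictionary keys and the example patient are hypothetical; binary predictors are coded 1 when present and 0 otherwise, and ppoFEV1 is entered as a percentage.

```python
import math

# Coefficients of the parsimonious Eurolung1 model, copied from Table 2.
EUROLUNG1 = {
    "intercept": -2.852,
    "age": 0.021,
    "male": 0.472,
    "ppo_fev1": -0.015,
    "thoracotomy": 0.662,
    "extended_resection": 0.324,
}


def eurolung1_morbidity(age, male, ppo_fev1, thoracotomy, extended_resection):
    """Predicted probability of cardiopulmonary morbidity (between 0 and 1)."""
    c = EUROLUNG1
    logit = (c["intercept"]
             + c["age"] * age
             + c["male"] * int(male)
             + c["ppo_fev1"] * ppo_fev1
             + c["thoracotomy"] * int(thoracotomy)
             + c["extended_resection"] * int(extended_resection))
    return 1 / (1 + math.exp(-logit))  # inverse logit


# Hypothetical patient: 65-year-old man, ppoFEV1 70%, thoracotomy,
# no extended resection.
risk = eurolung1_morbidity(65, True, 70, True, False)
print(f"{100 * risk:.1f}%")
```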

The pooled area under curve (AUC) estimate of the parsimonious model was 0.710 [95% confidence interval (CI) 0.677–0.743], not dissimilar from the one estimated in the full model (0.711, 95% CI 0.677–0.744). Goodness-of-fit testing showed good calibration (χ2 10.7, P = 0.22).

Models with fewer variables had lower discrimination, with pooled AUC values <0.7.

Supplementary Material, Fig. S1 shows the calibration of the parsimonious model plotting observed versus predicted morbidity.

In addition, Fig. 1 shows the Locally Weighted Scatterplot Smoothing plots of the observed and predicted morbidity of the full and parsimonious models with the patients ordered by deciles of predicted morbidity (according to the full model). The 3 curves almost overlap.

Figure 1: Locally Weighted Scatterplot Smoothing plots of the observed and predicted morbidity of the full and parsimonious models with the patients ordered by deciles of predicted morbidity (according to the full model).

Analysis of mortality

Supplementary Material, Table S2 shows the results of the univariable logistic regression analysis testing the variables included in the original Eurolung2 model (mortality).

All variables remained associated with mortality in this new updated data set of patients.

When all variables were entered in the multivariable logistic regression analysis, the logit of the full model was as follows: −6.360 + 0.046 × age + 0.866 × male − 0.055 × BMI − 0.009 × ppoFEV1 + 0.184 × CAD + 0.363 × CVD + 0.889 × thoracotomy + 0.333 × extended resection + 0.968 × pneumonectomy.

The best parsimonious model obtained after variable elimination was one containing 6 variables. The logit of the parsimonious Eurolung2 model was as follows: −6.350 + 0.047 × age + 0.889 × male − 0.055 × BMI − 0.010 × ppoFEV1 + 0.892 × thoracotomy + 0.983 × pneumonectomy. The results of the logistic regression are shown in Table 3.

Table 3: Results of the multivariable logistic regression analysis displaying the best parsimonious model (dependent variable: 30-day mortality)

Covariate        Coefficient estimate   Odds ratio   95% CI         P-value
(Intercept)      −6.350                 0.002        0.001–0.003    <0.001
Age              0.047                  1.048        1.042–1.054    <0.001
Male gender      0.889                  2.433        2.133–2.774    <0.001
BMI              −0.055                 0.947        0.935–0.958    <0.001
ppoFEV1          −0.010                 0.990        0.988–0.992    <0.001
Thoracotomy      0.892                  2.441        2.086–2.854    <0.001
Pneumonectomy    0.983                  2.672        2.371–3.012    <0.001

BMI: body mass index; CI: confidence interval; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.
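The columns of Table 3 are linked by simple transformations, which the sketch below makes explicit. The odds ratio is the exponential of the coefficient, and an approximate Wald z statistic (the quantity that guided variable elimination) can be recovered from the confidence limits. The second step assumes that the reported interval is symmetric on the log-odds scale with half-width 1.96 standard errors; because the published values are rounded and were pooled across imputations, the z values are rough approximations only. The function names are hypothetical.

```python
import math

# Parsimonious Eurolung2 model, copied from Table 3:
# covariate -> (coefficient, lower and upper 95% CI limits of the odds ratio)
TABLE_3 = {
    "age": (0.047, 1.042, 1.054),
    "male": (0.889, 2.133, 2.774),
    "bmi": (-0.055, 0.935, 0.958),
    "ppo_fev1": (-0.010, 0.988, 0.992),
    "thoracotomy": (0.892, 2.086, 2.854),
    "pneumonectomy": (0.983, 2.371, 3.012),
}


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in the covariate."""
    return math.exp(coefficient)


def approx_wald_z(coefficient, ci_low, ci_high, z_crit=1.96):
    """Approximate Wald z, backing the standard error out of the odds-ratio CI.

    Assumes the interval is exp(coefficient +/- z_crit * SE).
    """
    standard_error = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return coefficient / standard_error


for name, (coef, low, high) in TABLE_3.items():
    print(f"{name:14s} OR = {odds_ratio(coef):.3f}   z ~ {approx_wald_z(coef, low, high):.1f}")
```

The predictor with the smallest absolute z value in a fitted model is the natural first candidate for removal when searching for a more parsimonious model.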

The pooled AUC estimate of the parsimonious model was 0.737 (95% CI 0.702–0.770), not dissimilar from the one estimated in the full model [0.739 (95% CI 0.705–0.773)]. Goodness-of-fit testing showed good calibration of the model (χ2 6.9, P = 0.55).

Models with fewer variables had lower discrimination, with pooled AUC values <0.7.

Supplementary Material, Fig. S2 shows the calibration of the parsimonious model plotting observed versus predicted mortality.

In addition, Fig. 2 shows the Locally Weighted Scatterplot Smoothing plots of the observed and predicted mortality of the full and parsimonious models with the patients ordered by deciles of predicted mortality (based on the full model). The 3 curves almost overlap.

Figure 2: Locally Weighted Scatterplot Smoothing plots of the observed and predicted mortality of the full and parsimonious models with the patients ordered by deciles of predicted mortality (according to the full model).

Aggregate Eurolung2 model

An aggregate parsimonious Eurolung2 was also generated by repeating the logistic regression after categorization of the numeric variables (age, ppoFEV1 and BMI). The following best cut-off values were found associated with mortality using ROC analysis: age >70, ppoFEV1 <70% and BMI <18.5 kg/m².

A score of 1 point was assigned to the variables with the smallest odds ratio at logistic regression (age >70 and ppoFEV1 <70%), and the other 4 variables were weighted proportionally: male sex, BMI <18.5 kg/m² and thoracotomy, 2.5 points each; pneumonectomy, 3 points (Table 4).

Table 4: Points assigned to each variable in the parsimonious mortality model according to their regression coefficient

Variables                                          Points
Age >70                                            1
ppoFEV1 <70%                                       1
Male gender                                        2.5
BMI <18.5 kg/m²                                    2.5
Thoracotomy (as opposed to MITS)                   2.5
Pneumonectomy (as opposed to lesser resections)    3

Numeric variables were categorized using ROC analysis.

BMI: body mass index; MITS: minimally invasive thoracic surgery; ppoFEV1: predicted postoperative forced expiratory volume in 1 s.

Points were summed for each patient to obtain a total aggregate score (range 0–12.5). Patients were grouped into 7 risk classes showing incremental risk of mortality (P < 0.0001) as shown in Table 5.

Table 5: Patients stratified in risk classes according to their aggregate Eurolung2 risk scores

Risk class (score)   Mortality (%)   Number of patients   Number of deaths   95% CI
0–2.5                0.54            22 446               122                0.45–0.65
3–5                  1.46            29 978               439                1.33–1.61
5.5–6.5              3.19            18 284               584                2.95–3.45
7–7.5                4.63            5290                 245                4.10–5.23
8–9                  6.37            5371                 342                5.74–7.05
9.5–12               11.51           999                  115                9.67–13.65
12.5                 26.67           15                   4                  10.0–54.31

CI: confidence interval.
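The aggregate score lends itself to a simple lookup. As a minimal sketch, the Python code below sums the points of Table 4 and returns the matching risk class and observed mortality from Table 5. The function names and the example patient are hypothetical, and the observed class mortality is a descriptive figure from this data set rather than an individual prediction.

```python
# Risk classes from Table 5: (highest score in class, class label, observed mortality %)
RISK_CLASSES = [
    (2.5, "0–2.5", 0.54),
    (5.0, "3–5", 1.46),
    (6.5, "5.5–6.5", 3.19),
    (7.5, "7–7.5", 4.63),
    (9.0, "8–9", 6.37),
    (12.0, "9.5–12", 11.51),
    (12.5, "12.5", 26.67),
]


def eurolung2_aggregate_score(age, ppo_fev1, male, bmi, thoracotomy, pneumonectomy):
    """Sum of the points in Table 4 (range 0 to 12.5)."""
    score = 0.0
    if age > 70:
        score += 1.0
    if ppo_fev1 < 70:
        score += 1.0
    if male:
        score += 2.5
    if bmi < 18.5:
        score += 2.5
    if thoracotomy:
        score += 2.5
    if pneumonectomy:
        score += 3.0
    return score


def risk_class(score):
    """Return (class label, observed mortality %) for an aggregate score."""
    for upper, label, mortality_pct in RISK_CLASSES:
        if score <= upper:
            return label, mortality_pct
    raise ValueError("score outside the 0 to 12.5 range")


# Hypothetical patient: 75-year-old man, ppoFEV1 60%, BMI 24 kg/m2,
# lobectomy through thoracotomy.
score = eurolung2_aggregate_score(75, 60, True, 24, True, False)
print(score, risk_class(score))
```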

DISCUSSION

Background and objective

Outcome indicators remain essential elements used to audit clinical performance and evaluate surgical risk. They need to be interpreted in the context of other factors such as patient values and processes of care. In addition, outcomes need risk adjustment to reflect the variability of baseline characteristics and the underlying comorbidities present in the analysed population. Risk adjustment is performed using risk models which usually incorporate several patient-related and procedure-related factors. The Eurolung risk models were recently developed for this purpose from a population of about 48 000 anatomic lung resections registered in the ESTS database [1]. These models are currently the ones used to calculate the risk-adjusted morbidity and mortality values incorporated in the CPS. The latter is the tool used to assess eligibility for the ESTS Institutional Accreditation Programme [2–4]. In addition, the Eurolung models are meant to be used in the clinical practice to provide an estimate of the risk of surgery when counselling patients in the clinic or discussing high-risk patients in a multidisciplinary setting.

One of the problems of the original Eurolung equations is their large number of variables. The morbidity model contains 8 variables and the mortality model contains 9 variables. Although these numbers are similar to or even smaller than those of other existing risk models, such as the ones derived from Epithor [16] or from the STS database [17], this may pose practical problems. First, a large number of variables increases the risk of errors and missing data, affecting the quality and utilization of the database and model. In fact, in the most recent years only a minority of units had a completeness rate >70% (which is the threshold selected to ensure data quality in the ESTS accreditation programme) in the predicted morbidity and mortality variables, which are essential in calculating the CPS. Second, it reduces the efficiency of the model by increasing the time and labour of data input. Hence, the rationale of this study was to generate simplified versions of the models. The challenge was to identify a model containing the smallest number of variables while retaining the discrimination and calibration of the full models.

Main findings

The main finding of this analysis is the development of simplified models for morbidity and 30-day mortality. The parsimonious models were developed from a population of over 82 000 patients registered in the ESTS database from 2007 through 2018. The sample size is 34 000 patients larger than the one used to develop the original Eurolung models in 2016.

The simplified models now contain 5 variables (instead of 8) for morbidity and 6 variables (instead of 9) for mortality. These were the minimum numbers of variables ensuring an AUC above 0.70 and values similar to the ones obtained with the full models.

The AUC of the parsimonious morbidity model was 0.710, which is higher than the original version published in 2016 (0.68). The AUC of the parsimonious mortality model was 0.737, which is similar to the 2016 original version (0.74).

No attempt was made to introduce new variables in the model, as no new variables have been introduced in the ESTS database since the development of the original Eurolung models in 2016. Therefore, the analysis started from the variables included in the original logistic regression equations [1].

They all have high content and face validity and represent widely accepted risk factors [2, 17]. After the variable selection process, the 2 parsimonious models contained some common variables associated with both morbidity and mortality (age, sex, ppoFEV1 and thoracotomy), whereas some variables were specific for either morbidity (extended resections) or mortality (BMI and pneumonectomy).

Similar to the original Eurolung publication, we created an aggregate version of the parsimonious mortality model to be used as a simple risk stratification tool to assist in clinical practice. The aggregate score was generated by proportionally weighting the variables included in the simplified model and allowed patients to be grouped into 7 classes of risk with incremental mortality rate.

The patients in the lowest risk class had a 0.5% mortality rate, while those in the highest risk class had a 27% mortality rate.

Limitations

Limitations inherent to any retrospective big data analysis may apply to this study as well.

The retrospective multicentre design of the study may imply an unavoidable selection bias. Patients might have been variably selected for an operation in the different units contributing to the ESTS database. Patient selection can affect outcomes and may cause an unbalanced representation of variables at different levels of risk.

Morbidity has been arbitrarily defined by including complications historically considered as adverse outcomes in previous outcome analyses. However, the list includes only the so-called major cardiopulmonary complications, excluding other complications (e.g. prolonged air leak, wound infection and bronchopleural fistula) which may also be associated with prolonged hospital stay and increased costs of care. The list of complications has not been modified in order to ensure consistency with previous analyses. In addition, complications have not been weighted, as a grading of complications is not available in the ESTS database.

Finally, although the individual complications have been strictly defined according to the joint STS–ESTS definition of variables and outcomes [6], the quality of this variable may be affected by entry error, miscoding or under-reporting.

Mortality is certainly a less equivocal outcome as compared with morbidity. Nevertheless, the ESTS database does not collect mortality longer than 30 days. Ninety-day mortality has been recently considered a more representative end point to assess quality of care after lung resection [18–24].

The model was internally validated using a 10-fold cross-validation technique. We elected not to split the sample into derivation and validation groups according to previous evidence that this technique is inferior to internal resampling [25]. Even randomly splitting the database into 2 samples would not ensure independence of the 2 samples. External validation on an independent sample of patients would be ideal in the future to assess the performance of the simplified model.

Finally, the results from this analysis should be interpreted taking into consideration 2 important aspects. First, the ESTS database captures only a fraction of all thoracic surgical activity in Europe. It has been estimated that <30% of all lung resection cases performed annually in Europe are recorded in the database. Second, the ESTS database is not subject to formal audit. Obviously, the quality of data is of paramount importance to ensure credibility and reliability of the models. Due to lack of resources, only a minor proportion of data are audited in conjunction with the Institutional accreditation programme.

Clinical implications

The simplified Eurolung risk models will replace the original equations to estimate the risk-adjusted outcomes as part of the CPS used to assess eligibility for the ESTS Accreditation programme. This is the main purpose of these risk-adjusting models. Similar to other population-based models, they are not meant to be used as instruments of patient selection at an individual level. The reduced number of variables in the model will hopefully increase the completeness rate of the predicted morbidity and mortality variables necessary to calculate the CPS.

In addition, the reduction of the number of variables in the models will hopefully simplify their use and facilitate their widespread adoption to quickly estimate the surgical risk in clinic and during multidisciplinary discussion. For this purpose, the parsimonious models have been used to develop an app that will constitute a user-friendly aid tool for the practising physicians. The multiplatform Eurolung app is available and free to download from Google Play or App Store.

A further version designed to simplify the model application is the aggregate score. An aggregate risk score for mortality has been developed by grouping the patients into classes of risk associated with incremental risk of mortality. The lower number of variables will simplify the calculation of the risk score.

While we discourage using the risk score to select patients for operation (a process that should not leave out a more holistic and personalized evaluation), it can be a valuable aid to screen patients for further functional testing during preoperative workup and to inform the shared decision-making process, allowing patients to balance their options regarding treatment modality.

Presented at the 27th European Conference on General Thoracic Surgery, Dublin, Ireland, 9–12 June 2019.

Ingolf Vogt-Moykopf ESTS memorial paper 2019.

ACKNOWLEDGEMENTS

The authors are most indebted to all centres contributing to the ESTS database for their continuous commitment and active participation (http://www.ests.org/collaboration/database_contributors_list.aspx). Their contribution is pivotal to the advancement of science in our specialty and ensuring a high quality of care for our patients.

Conflict of interest: none declared.

Author contributions

Alessandro Brunelli: conceptualization; data curation; formal analysis; methodology; project administration; writing – original draft. Silvia Cicconi: data curation; formal analysis; methodology. Herbert Decaluwe: writing – review & editing. Zalan Szanto: writing – review & editing. Pierre Emmanuel Falcoz: writing – review & editing.

ABBREVIATIONS

  • BMI: Body mass index
  • CAD: Coronary artery disease
  • CI: Confidence interval
  • CKD: Chronic kidney disease
  • CPS: Composite Performance Score
  • CVD: Cerebrovascular disease
  • ppoFEV1: Predicted postoperative forced expiratory volume in 1 s

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data