Using datasets to ascertain the generalizability of clinical cohorts: the example of the European QUALity study on the treatment of advanced chronic kidney disease

Table 1.

Distribution of socio-demographic characteristics in the three cohorts

Patient characteristics	PCC (n = 633)	SCC (n = 2464)	EQUAL (n = 250)	P-value for comparison between the three cohorts
Age at index date (years), mean (95% CI)	86.3 (85.8–86.8)	79.7 (79.4–79.9)	76.6 (75.8–77.4)	<0.001
Male, n (%)	220 (34.8)	1266 (51.4)	150 (60.0)	<0.001
Townsend^a quintile, n (%)
1	106 (23.6)	469 (25.5)	44 (17.6)	<0.001
2	98 (21.6)	427 (23.2)	44 (17.6)
3	102 (22.5)	377 (20.5)	43 (17.2)
4	97 (21.4)	317 (17.2)	48 (19.2)
5	51 (11.2)	251 (13.6)	71 (28.4)
Rurality, n (%)
Urban	330 (72.4)	1482 (80.3)	216 (86.4)	<0.001
Town and fringe	91 (20.0)	227 (12.3)	18 (7.2)
Village and hamlet	35 (7.7)	136 (7.4)	16 (6.4)
CCI, median (IQR), range	4 (3–5), 2–10	4 (3–5), 2–11	4 (2–5), 2–10	0.0002

^a 1 = least deprived, 5 = most deprived.

A greater proportion of patients in the EQUAL study were in the most deprived Townsend quintile (28.4%) than in the PCC and SCC (11.2 and 13.6%, respectively). EQUAL participants were also more likely to be living in an urban postcode (86.4%) than patients in the PCC and SCC (72.4 and 80.3%, respectively). The range of CCI was slightly wider in the SCC (2–11) than in the PCC and EQUAL cohorts (both 2–10).

Although the overall medication burden was similar between the three cohorts, the EQUAL cohort had a greater proportion of patients on antihypertensives, lipid-lowering drugs and thromboembolic/antiplatelet drugs when compared with the SCC and PCC (Supplementary data, Table S1).

The absolute mean values of laboratory variables and BP readings were clinically similar between the three cohorts. However, there was a clinically relevant difference in ACR: patients in the EQUAL cohort had an ACR 2 and 8 times that of patients in the SCC and PCC, respectively (Supplementary data, Table S2). The larger difference between EQUAL and the PCC than between EQUAL and the SCC could potentially reflect referral to secondary care and ESKD progression.

Variables associated with participation/non-participation in EQUAL

Patients participating in EQUAL were compared with the SCC of presumed non-participants in EQUAL to explore variables associated with being in one cohort versus the other (Table 2). Increasing age was associated with non-participation in EQUAL, with patients ≥85 years of age having 75% reduced odds of participating compared with those aged ≥65–<70 years. Women had 29% reduced odds of participating, and patients in Townsend quintiles 4 and 5 had 1.6- and 3.0-fold increased odds of participating compared with the least-deprived patients (Townsend quintile 1). An increasing comorbidity burden was also associated with non-participation in EQUAL: patients with a CCI of 4–5 or ≥6 had approximately 30% reduced odds of participating compared with those with a CCI <4. Patients who were less likely to take part in EQUAL included those with heart disease (47% reduced odds), peripheral vascular disease (PVD; 42% reduced odds) and rheumatological disease (69% reduced odds). Patients with current cancer or a history of cancer had approximately 40% increased odds of participating.
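The percentages quoted above follow directly from the odds ratios (ORs) in Table 2: an OR below 1 corresponds to a (1 − OR) × 100% reduction in odds, and an OR above 1 to an (OR − 1) × 100% increase. The short sketch below illustrates this arithmetic for a few of the reported ORs; the function name and dictionary labels are illustrative only and are not part of the original analysis.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Return the percentage change in odds implied by an odds ratio."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return (odds_ratio - 1.0) * 100.0


# Odds ratios as reported in Table 2 (univariable model).
table2 = {
    "age >=85 vs 65-<70": 0.25,
    "women vs men": 0.71,
    "Townsend quintile 5 vs 1": 3.02,
    "cardiac disease": 0.53,
    "cancer": 1.41,
}

# Round to whole percentages, as in the text.
changes = {label: round(percent_change_in_odds(value), 0)
           for label, value in table2.items()}

for label, change in changes.items():
    direction = "higher" if change > 0 else "lower"
    print(f"{label}: {abs(change):.0f}% {direction} odds of participating")
```

Note that these are changes in odds, not in probability; the two are close only when participation is uncommon.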

Table 2.

Univariable model showing variables associated with participation in EQUAL

SCC (n = 1436) = 0, EQUAL cohort (n = 242) = 1	Univariable model
	OR (95% CI)	P-value
Age (years)
≥65–<70	1.0	–
≥70–<75	0.65 (0.43–0.97)	0.04
≥75–<80	0.48 (0.32–0.72)	<0.001
≥80–<85	0.39 (0.25–0.59)	<0.001
≥85	0.25 (0.15–0.40)	<0.001
Male (ref)	0.71 (0.54–0.92)	0.009
Townsend (quintile; 1 = least, 5 = most deprived)
1	1.0	–
2	1.10 (0.71–1.70)	0.68
3	1.21 (0.78–1.89)	0.39
4	1.61 (1.05–2.49)	0.03
5	3.02 (2.01–4.53)	<0.001
Rurality
Urban	1.0	–
Town/village	0.64 (0.44–0.94)	0.02
Haemoglobin (g/dL)
[≥10 (ref), <10]	0.72 (0.51–1.03)	0.06
Albumin (g/L)
[≥35 (ref), <35]	1.04 (0.76–1.42)	0.82
Systolic BP (mmHg)
<120	0.73 (0.47–1.41)	0.17
≥120–≤140	1.0	–
>140	1.77 (1.33–2.35)	<0.001
Diastolic BP (mmHg)
<70	0.97 (0.72–1.30)	0.84
≥70–≤80	1.0	–
>80	1.18 (0.82–1.70)	0.38
CCI
2–3	1.0	–
4–5	0.69 (0.52–0.91)	0.009
≥6	0.68 (0.46–1.0)	0.05
Individual CCI components
Cardiac (ref = absent)	0.53 (0.38–0.73)	<0.001
PVD (ref = absent)	0.58 (0.38–0.88)	0.007
Pulmonary (ref = absent)	0.80 (0.57–1.13)	0.22
Diabetes (ref = absent)	0.94 (0.72–1.22)	0.65
CVA (ref = absent)	0.75 (0.51–1.13)	0.16
Cancer (ref = absent)	1.41 (1.03–1.93)	0.04
Rheumatology (ref = absent)	0.31 (0.15–0.65)	0.0002
Other (ref = absent)	0.34 (0.18–0.66)	0.0002
Drug count (quintile)
1	–	–
2	1.36 (0.95–1.96)	0.1
3	0.94 (0.66–1.33)	0.72
4	0.99 (0.70–1.41)	0.94

Outcomes

Figure 1 shows unadjusted survival over 1 year for the three cohorts. The EQUAL cohort had a greater proportion of patients alive at 1 year (90.7%) compared with the SCC (85.0%) and PCC (69.6%) (log-rank P < 0.001).

FIGURE 1

Kaplan–Meier survival estimates of EQUAL, SCC and PCC.
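For readers unfamiliar with how a Kaplan–Meier curve such as Figure 1 is built, the following minimal sketch computes the product-limit estimate from follow-up times and event indicators. The data are hypothetical and are not drawn from EQUAL or THIN; the function name is an illustrative assumption, and the study's own software and settings are not described in this section.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    times  -- follow-up time per patient (e.g. days since index date)
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up data in days (NOT taken from EQUAL or THIN).
times = [30, 90, 90, 200, 365, 365, 365, 365, 365, 365]
events = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"day {t}: estimated survival {s:.3f}")
```

Patients censored at the same time as a death are counted as still at risk at that time, which is the usual convention.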

Table 3 shows the output of the unadjusted and adjusted multivariable Cox regression models comparing all-cause mortality at 1 year post-index date for patients in the PCC, SCC and EQUAL cohorts. Compared with EQUAL, the unadjusted hazard ratio (HR) of all-cause mortality was 1.7 [95% confidence interval (CI) 1.1–2.7; P = 0.02] in the SCC and 3.5 (95% CI 2.1–5.7; P < 0.001) in the PCC. In multivariable model 3, the HRs decreased moderately upon adjustment for sociodemographics, laboratory variables and comorbidity, to 1.47 (95% CI 0.94–2.31; P = 0.09) in the SCC and 2.41 (95% CI 1.40–4.14; P = 0.001) in the PCC.

Table 3.

Unadjusted and adjusted 1-year all-cause mortality for EQUAL, SCC and PCC patients

	Unadjusted model		Multivariable model 1 (sociodemographics)		Multivariable model 2 (Model 1 + laboratory variables)		Multivariable model 3 (Model 2 + co-morbidity)
Cohort	HR (95% CI)	P-value	HR (95% CI)	P-value	HR (95% CI)	P-value	HR (95% CI)	P-value
EQUAL (n = 236)^a	1.0	–	1.0	–	1.0	–	1.0	–
Secondary care (n = 1203)^a	1.71 (1.10–2.65)	0.02	1.61 (1.03–2.52)	0.04	1.52 (0.97–2.38)	0.07	1.47 (0.94–2.31)	0.09
Primary care (n = 183)^a	3.48 (2.12–5.71)	<0.001	2.80 (1.65–4.75)	<0.001	2.52 (1.47–4.32)	0.001	2.41 (1.40–4.14)	0.001
Index age (years)
5-year bands	–	–	1.19 (1.08–1.30)	<0.001	1.17 (1.07–1.28)	0.001	1.18 (1.08–1.29)	0.001
Gender
Male (ref)	–	–	0.74 (0.58–0.94)	0.02	0.75 (0.59–0.96)	0.02	0.79 (0.62–1.01)	0.06
Townsend (quintile; 1 = least, 5 = most deprived)
1	–	–	1.0	–	1.0	–	1.0	–
2	–	–	0.91 (0.64–1.29)	0.60	0.87 (0.61–1.24)	0.45	0.88 (0.62–1.25)	0.48
3	–	–	0.97 (0.68–1.38)	0.85	0.95 (0.66–1.35)	0.76	0.95 (0.66–1.36)	0.77
4	–	–	0.93 (0.64–1.35)	0.70	0.91 (0.63–1.33)	0.63	0.90 (0.62–1.31)	0.58
5	–	–	1.07 (0.73–1.56)	0.74	1.07 (0.73–1.56)	0.75	1.04 (0.71–1.53)	0.83
Haemoglobin (g/dL)
[≥10 (ref), <10]	–	–	–	–	1.32 (1.00–1.75)	0.05	1.31 (0.99–1.74)	0.06
Albumin (g/L) [≥35 (ref), <35]	–	–	–	–	1.38 (1.06–1.81)	0.02	1.37 (1.04–1.79)	0.02
Systolic BP (mmHg)
10-mmHg bands	–	–	–	–	0.98 (0.90–1.07)	0.69	0.99 (0.91–1.08)	0.77
TVC^b	–	–	–	–	1.06 (1.00–1.13)	0.05	1.06 (1.00–1.13)	0.05
CCI
2–3	–	–	–	–	–	–	1.0	–
4–5	–	–	–	–	–	–	1.16 (0.88–1.53)	0.28
≥6	–	–	–	–	–	–	1.58 (1.13–2.19)	0.007

Multivariable Model 1 included adjustments for age, sex and Townsend deprivation quintile. Model 2 included an adjustment for haemoglobin, albumin and systolic BP in addition to the predictor variables included in Model 1. Model 3 included adjustments for all predictors included in Model 2 and CCI.

^a All the models included patients with 100% completeness for all variables.

^b Systolic BP was included as a time-varying covariate (TVC), as the variable did not satisfy the proportional hazards assumption and the effect of systolic BP is likely to change over time.
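A reader can cross-check any HR, CI and P-value triplet in Table 3 with a standard back-calculation: on the log scale, the standard error is approximately the width of the 95% CI divided by 2 × 1.96, and the Wald z statistic is ln(HR) divided by that standard error. The sketch below applies this to the two unadjusted cohort HRs. It assumes Wald-type intervals that are symmetric on the log scale, and because the published limits are rounded the result is only approximate; the function name is illustrative.

```python
import math


def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided P-value for a ratio measure from its 95% CI."""
    # Standard error of the log ratio, recovered from the CI width.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Unadjusted HRs versus EQUAL, as reported in Table 3.
p_scc = p_from_ratio_ci(1.71, 1.10, 2.65)   # reported P = 0.02
p_pcc = p_from_ratio_ci(3.48, 2.12, 5.71)   # reported P < 0.001

print(f"SCC: P is approximately {p_scc:.3f}")
print(f"PCC: P is approximately {p_pcc:.1e}")
```

Both back-calculated values agree with the reported P-values once rounding is taken into account.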

Supplementary data, Table S3 shows the output of the unadjusted and adjusted negative binomial regression models comparing the number of hospitalizations at 1 year post-index date for patients in the PCC, SCC and EQUAL cohorts. Patients in the PCC [incidence rate ratio (IRR) 1.76 (95% CI 1.27–2.47)] and SCC [IRR 2.13 (95% CI 1.59–2.86)] had approximately twice the rate of hospital admissions of patients in the EQUAL cohort.

EQUAL had a higher proportion of patients starting RRT in the 1-year follow-up period after reaching an eGFR ≤20 mL/min/1.73 m² compared with those in the SCC (8.1% versus 2.1%; P < 0.001). There were no patients who started RRT in the PCC in this 1-year follow-up period.

DISCUSSION

This study examined whether patients participating in EQUAL were similar to ‘real-world’ patients with an eGFR ≤20 mL/min/1.73 m² regarding baseline characteristics, survival and hospitalization. Patients in EQUAL were more likely to be younger, male and from an urban setting compared with the PCC and SCC patients. EQUAL patients were also less likely to have cardiovascular, peripheral vascular and rheumatic diseases. EQUAL patients were more likely to start RRT and had a greater probability of being alive at 1 year compared with PCC and SCC patients. The overall better health of EQUAL patients may explain why they were less likely to be admitted to hospital.

The odds of participation in EQUAL decreased with every 5-year increase in age band. It has been recognized that patients recruited into a study may differ from the target population and be younger and healthier than referred and registry patients [26, 27]. This is a common problem in research, with middle-aged patients more likely to be enrolled in studies and patients at the extremes of age (the youngest and the oldest groups) less likely to participate [28]. Hence the study sample is less likely to include the elderly [29, 30], who have a higher burden of comorbidity and therefore higher expected mortality [31]. Such patients may also differ from younger participants regarding treatment effects. The implication is that ‘evidence-based’ research findings based on younger patients are applied to elderly patients with comorbidities through clinical practice guidelines [32]. Health research should therefore be conducted in the populations most affected by high disease prevalence [33]. Solutions such as liberal inclusion criteria, improved communication, reducing respondent burden, provision of travel support and data collection at home may facilitate the participation of older people in research [34, 35]. Unfortunately, even with these measures, as older patients make up an increasing proportion of the population, those who agree to participate in randomized controlled trials and observational studies may become less representative of that population.

Women were underrepresented in EQUAL in the UK: only 40.0% of EQUAL participants were women, compared with 48.6 and 65.2% in the SCC and PCC, respectively. A probable explanation for the lower proportion of women in the EQUAL cohort is the slower rate of CKD progression in women [36]. Slower progression means that fewer women reach an incident eGFR of ≤20 mL/min/1.73 m² or commence RRT. The sex distribution seen in EQUAL can therefore be explained by differences in CKD between men and women, with a higher incidence of CKD in women but a lower incidence of progression to ESKD requiring RRT [37]. A large European registry study by Antlanger et al. [38] assessed sex-specific differences in RRT incidence and prevalence using data from nine countries and showed that the incidence and prevalence rates were consistently higher in men than in women. The recruitment of women into research studies is an important issue for researchers. Medical research results cannot be extrapolated between genders, as pathophysiological processes vary. For example, cardiovascular disease and some cancers are affected by hormones. Nevertheless, much of our understanding of illnesses and their treatments is based on research conducted disproportionately with men [39]. Alternatively, women may be no more likely than men to decline to participate in studies but merely be underrepresented in target populations [40].

In the univariable logistic regression model, higher comorbidity was associated with lower odds of participation in EQUAL. The findings of this study are consistent with prior reports in other study designs showing that patients participating in trials have better survival, not only on account of being healthier but perhaps also because of better medical oversight [41–43].

There was a greater proportion of EQUAL participants starting RRT compared with SCC patients. A potential explanation is that EQUAL participants represented a cohort with faster progression of kidney disease, which may have made them more likely to be selected for the study. This is not necessarily a limitation of EQUAL, but the results cannot be generalized to all patients with an eGFR ≤20 mL/min/1.73 m².

Patients in the PCC and SCC had approximately twice the rate of hospital admissions of patients in EQUAL. An alternative explanation for this finding lies in the source of the hospitalization data: EQUAL hospitalization data came from the nurse-collected CRFs, whereas the THIN data came from the HES linkage, which may have identified more hospital admissions.

In observational studies, classification errors, selection bias and uncontrolled confounding are common, yet the uncertainty introduced by these biases is seldom quantified. When designing a study, incorporating a concurrent comparison between the enrolled and the eligible study populations would enhance understanding of the generalizability of future studies. This was done in the North American Atherosclerosis Risk in Communities study, where generalizability was examined by nesting study patients in communities covered by broad surveillance [44]. Alternatively, embedding trials/studies within chronic disease registries will allow generalizability to be ascertained. The International Society of Nephrology International Network of CKD cohort studies initiative, which includes 12 prospective cohort studies and two registries covering 21 countries, will play a significant role in understanding the generalizability of current and future CKD research [45]. Accrual to Clinical Trials is an initiative created to improve the efficiency of clinical trials by effectively identifying eligible participants in the recruitment stage of a study, and it might therefore play a crucial role in improving generalizability at the recruitment stage of future studies [46]. Finally, using statistical techniques such as probabilistic sensitivity analysis [Episens model (st0138)] in the analysis stage may help to quantify the effect of bias, so that researchers can report results that take systematic errors into account and avoid overstating their certainty about the effect under study [47, 48].
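To make the idea of probabilistic sensitivity analysis concrete, the sketch below runs a simple Monte Carlo bias analysis for selection bias in an odds ratio. It is not the Episens command and does not reproduce any analysis from this study: the observed OR, the selection-probability ranges and all names are hypothetical. It uses the standard relationship that the observed OR equals the true OR multiplied by (S_case,exposed × S_control,unexposed) / (S_case,unexposed × S_control,exposed), where each S is the probability that a person in that cell of the 2 × 2 table is selected into the study.

```python
import random
import statistics


def selection_bias_analysis(or_observed, ranges, n_draws=10_000, seed=0):
    """Monte Carlo bias analysis for selection bias in an odds ratio.

    ranges maps each 2x2 cell ('case_exp', 'case_unexp', 'ctrl_exp',
    'ctrl_unexp') to a (low, high) interval for its selection probability.
    Returns (median, 2.5th percentile, 97.5th percentile) of the
    bias-adjusted odds ratio. Only systematic error is simulated here.
    """
    rng = random.Random(seed)
    adjusted = []
    for _ in range(n_draws):
        # Draw one plausible set of selection probabilities.
        s = {cell: rng.uniform(lo, hi) for cell, (lo, hi) in ranges.items()}
        bias_factor = (s["case_exp"] * s["ctrl_unexp"]) / (
            s["case_unexp"] * s["ctrl_exp"]
        )
        adjusted.append(or_observed / bias_factor)
    adjusted.sort()
    lo_idx = int(0.025 * (n_draws - 1))
    hi_idx = int(0.975 * (n_draws - 1))
    return statistics.median(adjusted), adjusted[lo_idx], adjusted[hi_idx]


# Hypothetical inputs: an observed OR of 2.0 and assumed selection ranges.
hypothetical_ranges = {
    "case_exp": (0.6, 0.9),
    "case_unexp": (0.5, 0.8),
    "ctrl_exp": (0.3, 0.6),
    "ctrl_unexp": (0.3, 0.6),
}
median_or, lower_or, upper_or = selection_bias_analysis(2.0, hypothetical_ranges)
print(f"bias-adjusted OR {median_or:.2f} ({lower_or:.2f} to {upper_or:.2f})")
```

The width of the resulting interval reflects only uncertainty about selection; a fuller analysis would also incorporate random error and other sources of bias.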

A strength of this study is the use of routinely collected, generalizable GP data (THIN) to understand the generalizability of an observational cohort study [14]. This approach did not require recruiting patients who had declined to participate in a study and so avoids the complex ethical issues of re-approaching patients who have already refused to take part. In the era of ‘big data’, research using routinely collected data offers considerable potential and has underpinned research in recent years [49]. GP data have the further strength of being population-based and derived from a representative subset of the population [14, 50].

There were several limitations to this study. Identification of an appropriate comparison group was crucial to the inferences of the study, as any observational design will always be limited by unmeasured confounding [51]. Although this study did not directly assess the generalizability of EQUAL data by comparing patients who agreed to participate in EQUAL with those who declined, routinely collected data have shown the differences between EQUAL patients and patients in secondary care meeting the same eligibility criteria. There is also the potential for multiple biases as a result of differences in data capture methods between THIN and EQUAL and the resultant misclassification of THIN subjects.

This article provides empirical evidence concerning how participants in a carefully conducted observational cohort study differ from the broader population of patients that they are intended to represent. Older and sicker patients were less likely to be recruited into EQUAL in the UK, and this was supported by follow-up data on health outcomes, with patients in EQUAL less likely to be hospitalized and more likely to be alive at 12 months. This selection pattern is likely to be found in most observational studies of chronic diseases. These issues can be addressed by designing observational studies to be embedded within disease registries or by using novel statistical techniques in the analysis.

SUPPLEMENTARY DATA

Supplementary data are available at ndt online.

ACKNOWLEDGEMENTS

The authors would like to thank all the patients and health professionals participating in the EQUAL study. We would also like to thank the local investigators. Funding was received from the European Renal Association–European Dialysis and Transplant Association (ERA-EDTA); Svenska Läkaresällskapet (SLS-248981, SLS-503991); the Stockholm County Council (20140020), Njurfonden (Sweden); the Italian Society of Nephrology; the Dutch Kidney Foundation (SB 142); the Young Investigators grant in Germany and the National Institute for Health Research (NIHR) in the UK.

CONFLICT OF INTEREST STATEMENT

C.W. reports grants from Sanofi; personal fees from Sanofi, Takeda, Chiesi, Amicus and Idorsia; grants from Idorsia and Boehringer-Ingelheim; personal fees from Lilly, Merck Sharp & Dohme, Mundipharma, Glaxo Smith Kline, Boehringer-Ingelheim, AstraZeneca, Bayer, Reata, Akebia and Triceda, outside the present work. M.E. reports personal fees from Astellas, AstraZeneca and Vifor Pharma and non-financial support from Baxter Healthcare, outside the submitted work. K.J.J. reports grants from ERA-EDTA, during the conduct of the study. F.J.C. reports grants from the NIHR and Kidney Research UK and personal fees from Baxter, outside the submitted work. The rest of the authors have no conflicts of interest to report.

REFERENCES

1. Øvretveit J, Leviton L, Parry G. Increasing the generalisability of improvement research with an improvement replication programme. BMJ Qual Saf 2011; 20(Suppl 1): 87–91
2. Kukull WA, Ganguli M. Generalizability: the trees, the forest, and the low-hanging fruit. Neurology 2012; 78: 1886–1891
3. Ahmad N, Boutron I, Dechartres A et al. Applicability and generalisability of the results of systematic reviews to public health practice and policy: a systematic review. Trials 2010; 11: 20
4. Glasgow RE, Lichtenstein E, Marcus AC. Why don’t we see more translation of health promotion research to practice? Rethinking the efficacy-to-effectiveness transition. Am J Public Health 2003; 93: 1261–1267
5. Steckler A, McLeroy KR. The importance of external validity. Am J Public Health 2008; 98: 9–10
6. Tarlo SM, Chan-Yeung M. Importance of definitions and population selection in work-related asthma. Can Respir J 2013; 20: 156–156
7. Hurtado-Chong A, Joeris A, Hess D et al. Improving site selection in clinical studies: a standardised, objective, multistep method and first experience results. BMJ Open 2017; 7: e014796
8. Gheorghe A, Roberts TE, Ives JC et al. Centre selection for clinical trials and the generalisability of results: a mixed methods study. PLoS One 2013; 8: e56560
9. Symmons DP, Barrett EM, Bankhead CR et al. The incidence of rheumatoid arthritis in the United Kingdom: results from the Norfolk Arthritis Register. Br J Rheumatol 1994; 33: 735–739
10. Ards S, Chung C, Myers SL Jr. The effects of sample selection bias on racial differences in child abuse reporting. Child Abuse Negl 1998; 22: 103–115
11. Foulkes MA, Wolf PA, Price TR et al. The Stroke Data Bank: design, methods, and baseline characteristics. Stroke 1988; 19: 547–554
12. Jager KJ, Ocak G, Drechsler C et al. The EQUAL study: a European study in chronic kidney disease stage 4 patients. Nephrol Dial Transplant 2012; 27: iii27–31
13. Blak BT, Hards M, Lee J. PMC16 How do thin death data compare to national figures for each UK country? Value Health 2010; 13: A331
14. Blak BT, Thompson M, Dattani H et al. Generalisability of The Health Improvement Network (THIN) database: demographics, chronic disease prevalence and mortality rates. Inform Prim Care 2011; 19: 251–255
15. Denburg MR, Haynes K, Shults J et al. Validation of The Health Improvement Network (THIN) database for epidemiologic studies of chronic kidney disease. Pharmacoepidemiol Drug Saf 2011; 20: 1138–1149
16. Hemmelgarn BR, Manns BJ, Quan H et al. Adapting the Charlson Comorbidity Index for use in patients with ESRD. Am J Kidney Dis 2003; 42: 125–132
17. Springate DA, Kontopantelis E, Ashcroft DM et al. Clinical codes: an online clinical codes repository to improve the validity and reproducibility of research using electronic medical records. PLoS One 2014; 9: e99825
18. Khan NF, Perera R, Harper S et al. Adaptation and validation of the Charlson Index for Read/OXMIS coded databases. BMC Fam Pract 2010; 11: 1
19. Fried L, Bernardini J, Piraino B. Comparison of the Charlson Comorbidity Index and the Davies score as a predictor of outcomes in PD patients. Perit Dial Int 2003; 23: 568–573
20. Miskulin DC, Martin AA, Brown R et al. Predicting 1 year mortality in an outpatient haemodialysis population: a comparison of comorbidity instruments. Nephrol Dial Transplant 2004; 19: 413–420
21. Ho AM, Dion PW, Ng CS et al. Understanding immortal time bias in observational cohort studies. Anaesthesia 2013; 68: 126–130
22. Jager KJ, Zoccali C, Macleod A et al. Confounding: what it is and how to deal with it. Kidney Int 2008; 73: 256–260
23. Cox

DR.

Regression models and life-tables

.

J R Stat Soc Ser B Stat Methodol

1972

;

34

:

187

–

220

OpenURL Placeholder Text

24

Gail

MH

,

Graubard

B

,

Williamson

DF

et al.

Comments on ‘Choice of time scale and its effect on signiﬁcance of predictors in longitudinal studies’

.

Stat Med

2009

;

28

:

1315

–

1317

25

Cameron

AC

,

Trivedi

PK.

Regression Analysis of Count Data

.

Cambridge

:

Cambridge University Press

,

1998

26

Kennedy

WA

,

Laurier

C

,

Malo

JL

et al.

Does clinical trial subject selection restrict the ability to generalise use and cost of health services to “real life” subjects?

Int J Technol Assess Health Care

2003

;

19

:

8

–

16

27

Ganguli

M

,

Lytle

ME

,

Reynolds

MD

et al.

Random versus volunteer selection for a community-based study

.

J Gerontol A Biol Sci Med Sci

1998

;

53A

:

M39

–

M46

28

Manne

SL

,

Ostroff

JS

,

Winkel

G

et al.

Couple-focused group intervention for women with early stage breast cancer

.

J Consult Clin Psychol

2005

;

73

:

634

–

646

29

Swenson

WM.

Sample selection bias in clinical research

.

Psychosomatics

1980

;

21

:

291

–

292

30

Turazza

FM

,

Franzosi

MG.

Is anticoagulation therapy underused in elderly patients with atrial fibrillation?

Drugs Aging

1997

;

10

:

174

–

184

31

Fernandez-Merino

MC

,

Rey-Garcia

J

,

Tato

A

et al. [

Self-perception of health and mortality in elderly from a rural community]

.

Aten Primaria

2000

;

25

:

459

–

463

32

Boyd

CM

,

Darer

J

,

Boult

C

et al.

Clinical practice guidelines and quality of care for older patients with multiple comorbid diseases: implications for pay for performance

.

JAMA

2005

;

294

:

716

–

724

33

Bower

P

,

Grigoroglou

C

,

Anselmi

L

et al.

Is health research undertaken where the burden of disease is greatest? Observational study of geographical inequalities in recruitment to research in England 2013–2018

.

BMC Med

2020

;

18

:

133

34

Denson

AC

,

Mahipal

A.

Participation of the elderly population in clinical trials: barriers and solutions

.

Cancer Control

2014

;

21

:

209

–

214

35

Mody

L

,

Miller

DK

,

McGloin

JM

et al.

Recruitment and retention of older adults in aging research

.

J Am Geriatr Soc

2008

;

56

:

2340

–

2348

36

Navaneethan

SD

,

Kandula

P

,

Jeevanantham

V

et al.

Referral patterns of primary care physicians for chronic kidney disease in general population and geriatric patients

.

Clin Nephrol

2010

;

73

:

260

–

267

37

Fernandez-Prado

R

,

Fernandez-Fernandez

B

,

Ortiz

A.

Women and renal replacement therapy in Europe: lower incidence, equal access to transplantation, longer survival than men

.

Clin Kidney J

2018

;

11

:

1

–

6

38

Antlanger

M

,

Noordzij

M

,

van de Luijtgaarden

M

et al.

Sex differences in kidney replacement therapy initiation and maintenance

.

Clin J Am Soc Nephrol

2019

;

14

:

1616

–

1625

39

Longenecker

J

,

Genderson

J

,

Dickinson

D

et al.

Where have all the women gone: participant gender in epidemiological and non-epidemiological research of schizophrenia

.

Schizophr Res

2010

;

119

:

240

–

245

40

Covell

NH

,

Frisman

LK

,

Essock

SM.

Rates of refusal to participate in research studies among men and women

.

Psychiatr Serv

2003

;

54

:

1541

–

1544

41

Goyal

J

,

Nuhn

P

,

Huang

P

et al.

The effect of clinical trial participation versus non-participation on overall survival in men receiving first-line docetaxel-containing chemotherapy for metastatic castration-resistant prostate cancer

.

BJU Int

2012

;

110

:

E575

–

E582

42

Du Bois

A

,

Rochon

J

,

Lamparter

C

et al.

Pattern of care and impact of participation in clinical studies on the outcome in ovarian cancer

.

Int J Gynecol Cancer

2005

;

15

:

183

–

191

43

Hutchins

LF

,

Unger

JM

,

Crowley

JJ

et al.

Underrepresentation of patients 65 years of age or older in cancer-treatment trials

.

N Engl J Med

1999

;

341

:

2061

–

2067

44

The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. The ARIC investigators

.

Am J Epidemiol

1989

;

129

:

687

–

702

PubMed

45

Dienemann

T

,

Fujii

N

,

Orlandi

P

et al.

International Network of Chronic Kidney Disease cohort studies (iNET-CKD): a global network of chronic kidney disease cohorts

.

BMC Nephrol

2016

;

17

:

121

46

Visweswaran

S

,

Becich

MJ

,

D’Itri

VS

et al.

Accrual to clinical trials (ACT): a clinical and translational science award consortium network

.

JAMIA Open

2018

;

1

:

147

–

152

47

Orsini

N

,

Bellocco

R

,

Bottai

M

et al.

A tool for deterministic and probabilistic sensitivity analysis of epidemiologic studies

.

Stat J

2008

;

8

:

29

–

48