Abstract

Background.

We conducted a systematic review and meta-analysis to better define the prognostic ability of fluorine-18-fluorodeoxyglucose positron emission tomography (18F-FDG PET) following salvage chemotherapy for relapsed or refractory Hodgkin's lymphoma (HL) and aggressive non-Hodgkin's lymphoma.

Methods.

We searched PubMed (from inception to January 31, 2010), bibliographies, and review articles without language restriction. Two assessors independently assessed study characteristics, quality, and results. We performed a meta-analysis to determine prognostic accuracy.

Results.

Twelve studies including 630 patients were eligible. The most commonly evaluated histologies were diffuse large B-cell lymphoma (n = 313) and HL (n = 187), which were typically treated with various salvage and high-dose chemotherapy regimens. Studies typically employed nonstandardized protocols and diagnostic criteria. The prognostic accuracy was heterogeneous across the included studies. 18F-FDG PET had a summary sensitivity of 0.69 (95% confidence interval [CI], 0.56–0.81) and specificity of 0.81 (95% CI, 0.73–0.87). The summary estimates were stable in sensitivity analyses. In four studies that performed direct comparisons between PET and conventional restaging modalities, PET had a superior accuracy for predicting treatment outcomes. Subgroup and metaregression analyses did not identify any particular factor to explain the observed heterogeneity.

Conclusion.

18F-FDG PET performed after salvage therapy appears to be an appropriate test to predict treatment failure in patients with refractory or relapsed lymphoma who receive high-dose chemotherapy. Some evidence suggests PET is superior to conventional restaging for this purpose. Given the methodological limitations in the primary studies, prospective studies with standardized methodologies are needed to confirm and refine these promising results.

Introduction

Advances in chemotherapy regimens have established Hodgkin's lymphoma (HL) and aggressive non-Hodgkin's lymphoma (NHL) as potentially curable malignancies [1, 2]. Despite progress in the treatment of these diseases, substantial proportions of patients remain refractory to, or relapse after, standard first-line chemotherapy. In these patients, salvage chemotherapy followed by high-dose consolidation chemotherapy with autologous hematopoietic stem cell transplantation constitutes the treatment of choice. However, relapse after high-dose therapy is not uncommon and the procedure is associated with substantial short-term morbidity and mortality, emphasizing the importance of identifying markers to determine disease prognosis and guide risk-oriented management [3, 4]. Although prognostic models based on clinical and laboratory characteristics have been proposed, their limited predictive accuracy underscores the need to identify more reliable prognostic markers for clinical practice [5, 6].

Fluorine-18-fluorodeoxyglucose positron emission tomography (18F-FDG PET) is an established functional imaging modality, routinely used for the staging and post-therapy response assessment of patients with HL and diffuse large B-cell lymphoma (DLBCL) [7–10]. 18F-FDG PET performed after a few cycles of first-line chemotherapy (interim response assessment) is a promising candidate predictive marker for treatment outcomes and could be used for implementing response-tailored treatment strategies [11]. In the relapsed or refractory disease setting, several studies have evaluated the prognostic value of 18F-FDG PET when performed between salvage and consolidation high-dose therapy with stem cell transplantation; the sample sizes of these studies were, however, small, resulting in imprecise estimates of the test's prognostic value, particularly with respect to specific lymphoma subtypes [12]. Furthermore, studies have used heterogeneous designs and protocols for PET assessment, making interpretation of the published data difficult.

To better define the prognostic value of 18F-FDG PET following salvage chemotherapy before high-dose chemotherapy for patients with relapsed or refractory HL and aggressive NHL, we conducted a systematic review of studies assessing its accuracy in predicting treatment outcomes. Using meta-analysis, we summarized the results of studies across different treatment settings and lymphoma subtypes, and estimated the effects of a positive 18F-FDG PET scan on progression-free survival (PFS).

Methods

Search Strategy, Study Eligibility, and Data Abstraction

We searched PubMed from inception through January 31, 2010 with no language restrictions. Exact search strategies can be found in online Appendix 1. To complement the search, we examined the reference lists of eligible studies and relevant review articles.

Two reviewers (T.T., I.J.D.) independently screened abstracts and further examined full-text articles of all potentially eligible citations. Studies that assessed 18F-FDG PET for patients with lymphoma during or after induction chemotherapy and before high-dose chemotherapy followed by autologous stem cell transplantation were considered eligible. We included both prospective and retrospective studies, and we considered clinical follow-up with or without pathologic confirmation to be the reference standard. We included studies that evaluated ≥10 patients and included at least five patients who experienced disease progression or relapse after high-dose chemotherapy. When a study included patients who were evaluated with a gallium scan together with those with PET, we included it only if subgroup data on PET were separately extractable. Likewise, when a study included patients who underwent allogeneic transplantation, we included it only if data on those who underwent autologous transplantation were separately extractable. We excluded studies that did not provide adequate information to allow the calculation of sensitivity and specificity, or hazard ratios with their variance for predicting treatment failures. We excluded editorials, comments, letters, and review articles.

One reviewer (T.T.) extracted descriptive data from each eligible study, which were confirmed by another reviewer (I.J.D.). We extracted the following information from eligible studies: first author, year of publication, journal, patient demographic and clinical characteristics such as the International Prognostic Score for advanced-stage HL or the International Prognostic Index for NHL, therapeutic interventions, technical specifications of 18F-FDG PET, and interpretation of PET results. A third reviewer (T.N.) also verified the data on the technical specification and interpretation of 18F-FDG PET. Two reviewers (T.T., I.J.D.) independently extracted data regarding treatment outcomes. If a study reported PET results at multiple time points during induction chemotherapy, we recorded the results of the scan performed closest to high-dose chemotherapy. When studies performed a direct comparison between PET and “conventional” restaging modalities (i.e., computed tomography [CT] or magnetic resonance imaging, and bone marrow biopsy), we also recorded data to assess the diagnostic accuracy of conventional restaging. We defined a direct comparison as the performance of conventional restaging at the same time point in at least 90% of patients who had a PET scan.

Quality Assessment

To assess the quality and reporting of studies, we evaluated 15 items that were considered relevant to the review topic, based on the Quality Assessment of Diagnostic Accuracy Studies instrument and the Reporting Recommendation for Tumor Marker Prognostic Studies guidelines [13, 14]. Online Appendix Table 1 describes how we rated each methodological item. Two reviewers (T.T., I.J.D.) independently assessed the quality items, and discrepancies were resolved by consensus.

Data Synthesis

For each study, we constructed a 2 × 2 contingency table consisting of true-positive (TP), false-positive (FP), false-negative (FN), and true-negative results, whereby all patients were categorized according to whether they were PET positive or negative, and whether they experienced treatment failure after high-dose chemotherapy. For the main analysis, we used the entire clinical follow-up as the reference standard for treatment failure diagnosis, and counted censorings as no treatment failure regardless of the duration of follow-up. Also, we considered minimal residual uptake (MRU), when defined, as a positive finding based on the reported diagnostic criteria. Regarding conventional restaging, we counted only complete remission as negative, and considered any residual mass or lesion to be positive, regardless of its size.

We calculated sensitivity and specificity for each study, and then estimated summary sensitivity and specificity with their corresponding 95% confidence intervals (CIs) using bivariate random effects meta-analysis [15–17]. Summary positive and negative likelihood ratios (LRs) were calculated from the summary sensitivity and specificity estimates [18]. We assessed between-study heterogeneity visually, by plotting sensitivity and specificity in the receiver operating characteristic (ROC) space. We also drew summary ROC curves and confidence regions for summary sensitivity and specificity [15–17]. As a global measure for the summary ROC curves, we estimated the Q* statistic, the point on the ROC curve where sensitivity and specificity are equal.
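As an illustration of the per-study step only, the following minimal Python sketch derives sensitivity and specificity from a single 2 × 2 table. The class name, field names, and counts are hypothetical and are not taken from any included study; the bivariate random effects model used for the summary estimates is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for one study: PET result versus treatment failure."""
    tp: int  # PET positive, treatment failure
    fp: int  # PET positive, no treatment failure
    fn: int  # PET negative, treatment failure
    tn: int  # PET negative, no treatment failure

    def sensitivity(self) -> float:
        # Proportion of patients with treatment failure who were PET positive
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of patients without treatment failure who were PET negative
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only
study = TwoByTwo(tp=14, fp=6, fn=6, tn=24)
print(round(study.sensitivity(), 2), round(study.specificity(), 2))
```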

To explore heterogeneity, we performed subgroup analysis based on lymphoma histology (HL, NHL, and DLBCL), treatment setting of induction chemotherapy (first-line therapy versus salvage therapy), and whether MRU was separately categorized. To further explore whether study-level characteristics could explain between-study heterogeneity, we performed univariate metaregression analyses within a hierarchical summary ROC model [16]. We assessed the following covariates, selected a priori: year of publication, study design (prospective versus retrospective), study size, proportion of second-line patients, proportion of patients with HL, and relapse rate.

We performed sensitivity analyses to explore the effects of MRU and early censorings during the first year after high-dose chemotherapy on the summary estimates. In a sensitivity analysis, we categorized MRU as negative. Regarding early censorings, we first excluded them regardless of PET results. In a “best-case” scenario, we counted early censored patients with positive PET scans as TP. Conversely, in a “worst-case” scenario we counted early censored patients with negative PET scans as FN. For these “best-case” and “worst-case” scenarios, when a study did not report early censorings for PET+ and PET− patients, we imputed the number of censored cases based on the highest censoring rate observed in each scenario.
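The three censoring scenarios can be read as reassignments of cells in the 2 × 2 table. The sketch below is one possible reading, not the authors' code: it assumes that the main-analysis table already counts early censored patients as "no treatment failure" (PET-positive in FP, PET-negative in TN) and that cells not mentioned in a scenario are left unchanged. All names and counts are hypothetical.

```python
from typing import NamedTuple


class Counts(NamedTuple):
    tp: int
    fp: int
    fn: int
    tn: int


def reclassify(main: Counts, cens_pos: int, cens_neg: int, scenario: str) -> Counts:
    """Re-derive the 2x2 table under one early-censoring scenario.

    main      -- main-analysis table (censored patients counted as no failure)
    cens_pos  -- early censored patients with a positive PET scan
    cens_neg  -- early censored patients with a negative PET scan
    scenario  -- "exclude", "best", or "worst"
    """
    if cens_pos > main.fp or cens_neg > main.tn:
        raise ValueError("censored counts exceed the cells they are drawn from")
    if scenario == "exclude":
        # Drop early censored patients regardless of PET result
        return main._replace(fp=main.fp - cens_pos, tn=main.tn - cens_neg)
    if scenario == "best":
        # Censored PET-positive patients are counted as true positives
        return main._replace(tp=main.tp + cens_pos, fp=main.fp - cens_pos)
    if scenario == "worst":
        # Censored PET-negative patients are counted as false negatives
        return main._replace(fn=main.fn + cens_neg, tn=main.tn - cens_neg)
    raise ValueError(f"unknown scenario: {scenario}")


# Hypothetical study: 2 censored PET-positive and 3 censored PET-negative patients
main_table = Counts(tp=14, fp=6, fn=6, tn=24)
for name in ("exclude", "best", "worst"):
    print(name, reclassify(main_table, cens_pos=2, cens_neg=3, scenario=name))
```

The total number of patients is preserved in the "best-case" and "worst-case" tables and reduced only when censored patients are excluded.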

To better account for time-to-event data, we also estimated the hazard ratio (HR) comparing PFS between the PET+ and PET− groups. The unadjusted HR was preferred over the adjusted HR. If the HR and its variance were not directly extractable from a study, we calculated them from reported statistics using a prespecified algorithm [19]. We estimated a summary HR by random-effects meta-analysis [20]. We quantified between-study heterogeneity with the I² statistic [21].
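For readers unfamiliar with the pooling step, the sketch below shows a random-effects meta-analysis of log hazard ratios together with the I² statistic. It assumes the DerSimonian-Laird estimator of between-study variance, which the text does not name, and it recovers standard errors from reported 95% CIs as one simple option; the algorithm cited in [19] may differ. Function names and input values are hypothetical.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(HR) recovered from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def pool_random_effects(log_hr, se):
    """Pool log hazard ratios (DerSimonian-Laird); return HR, 95% CI, I2 (%)."""
    if len(log_hr) < 2 or len(log_hr) != len(se):
        raise ValueError("need at least two studies with matching inputs")
    w = [1.0 / s**2 for s in se]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_hr)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hr))  # Cochran's Q
    df = len(log_hr) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (s**2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_hr)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci = (math.exp(pooled - 1.96 * se_pooled), math.exp(pooled + 1.96 * se_pooled))
    return math.exp(pooled), ci, i2


# Hypothetical studies: (HR, lower 95% limit, upper 95% limit)
studies = [(3.0, 1.5, 6.0), (5.0, 2.0, 12.5), (4.0, 1.8, 8.9)]
log_hrs = [math.log(hr) for hr, _, _ in studies]
ses = [se_from_ci(lo, hi) for _, lo, hi in studies]
summary_hr, summary_ci, i2 = pool_random_effects(log_hrs, ses)
print(round(summary_hr, 2), [round(x, 2) for x in summary_ci], round(i2, 1))
```

Pooling is done on the log scale because log(HR) is approximately normally distributed; the summary estimate and its confidence limits are exponentiated only at the end.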

Analyses were conducted using STATA, version 10.1/SE (Stata Corp, College Station, TX), and SAS, version 9.2 (SAS Institute Inc., Cary, NC). All tests were two-sided and statistical significance was defined as a p-value < .05.

Results

Study Selection and Characteristics

Our literature search identified 2,327 citations, of which 28 were considered potentially eligible and were retrieved for further assessment. After excluding 16 publications, we identified 12 studies eligible for this review (Fig. 1) [22–33]. A complete list of excluded studies and reasons for exclusion are available in online Appendix 1.

Figure 1

Study flow diagram.

Abbreviation: PET, positron emission tomography.

The eligible studies evaluated pre–high-dose PET for a total of 630 patients, most of whom (n = 541; 87%) had refractory or relapsed lymphoma (Table 1). Seven studies exclusively included patients undergoing second-line treatment for relapsed or refractory lymphoma [26, 28–33], and five reported on mixed first-line and second-line patient populations [22–25, 27]. Although most studies assessed PET after induction chemotherapy just before high-dose therapy, the number of chemotherapy cycles before PET ranged from two to nine cycles. Nine studies (75%) had a retrospective design. Typically, studies followed up patients for 1–2 years.

Table 1

Study characteristics of studies of pre–high-dose chemotherapy for lymphoma

(a) Only recurrent or refractory cases.

(b) For patients in continuous remission.

(c) 8 weeks in the case of radiation.

(d) The median interval between the start of salvage chemotherapy and PET was 4 weeks (range, 2–5 weeks), and the median interval between PET and ASCT was 7 weeks.

(e) Inclusion criteria. The total reported population includes 6 ineligible patients (10%): recipients of allogeneic stem cell transplantation. The median PFS for PET+ patients was 13 months (n = 30) and the median follow-up for PET− patients in continuous remission was 50 months (n = 25).

(f) Interval between PET and ASCT was ≤3 months (defined as an inclusion criterion).

(g) For the entire study cohort (n = 211), not for the population of interest; 174 patients (82%) were excluded from analysis because they were evaluated with gallium scan.

(h) Median, 2 cycles.

(i) Within 2 weeks (median, 9 days) after completing second-line chemotherapy.

Abbreviations: ASCT, autologous stem cell transplantation; HDCT, high-dose chemotherapy; I, interim induction therapy; I/P, interim or postinduction chemotherapy; ND, no data; P, postinduction therapy; PET, positron emission tomography.

Overall, the two most commonly evaluated histologies were DLBCL (n = 313; 50%) and HL (n = 187; 30%) (Table 2). Therapeutically more challenging lymphomas, such as indolent lymphomas (n = 66; 10%) and other aggressive NHLs (n = 62; 10%), were also included. Two studies exclusively evaluated patients with HL [26, 30], and one study focused only on DLBCL patients [29]. In general, patients had a wide range of risk for treatment failure by commonly used prognostic scores such as the age-adjusted international prognostic index at second-line therapy [5] or the prognostic score for relapsed Hodgkin's lymphoma [6]. Studies employed diverse chemotherapy regimens both for the induction and high-dose components of treatment.

Table 2

Clinical characteristics of studies of pre–high-dose chemotherapy for lymphoma

(a) Only refractory or relapsed patients who were evaluated with FDG-PET during salvage therapy before high-dose chemotherapy followed by autologous stem cell rescue were considered.

(b) Data include 6 irrelevant patients who underwent allogeneic stem cell transplantation.

(c) “Many” patients received rituximab as part of salvage therapy.

(d) Data include 24 irrelevant patients who failed to undergo high-dose chemotherapy followed by autologous stem cell rescue.

(e) Seven nonresponders to DHAP-VIM also received mini-BEAM as salvage therapy.

(f) Rituximab was routinely used since 2000. Those patients who participated before did not receive rituximab.

(g) Data include 143 irrelevant patients who were evaluated with gallium scan only.

(h) 75% of patients were ≤60 years old.

(i) 52% of patients also received rituximab after autologous stem cell transplantation.

Abbreviations: AAIPI, age-adjusted international prognostic index; ASHAP, doxorubicin, methylprednisolone, cytarabine, cisplatin; BEAM, carmustine, etoposide, cytarabine, melphalan; BL, Burkitt's lymphoma; CBV, cyclophosphamide, carmustine, etoposide; ChlVIPP, chlorambucil, vinblastine, procarbazine, prednisone; CHOP, cyclophosphamide, doxorubicin, vincristine, prednisone; CHVmP-BV, cyclophosphamide, doxorubicin, teniposide, prednisone, bleomycin, vincristine; CV, cyclophosphamide, etoposide; Cy, cyclophosphamide; Dexa, dexamethasone; DHAP, dexamethasone, cytarabine, cisplatin; DLBCL, diffuse large B-cell lymphoma; ESHAP, etoposide, methylprednisolone, cytarabine, cisplatin; FDG-PET, fluorodeoxyglucose positron emission tomography; FL, follicular lymphoma; GemCis, gemcitabine, dexamethasone, cisplatin; gr, grade; H, high; HDT, high-dose chemotherapy; H-I, high intermediate; HL, Hodgkin's lymphoma; ICE, ifosfamide, carboplatin, etoposide; IEV, ifosfamide, epirubicin, etoposide; IFRT, involved-field radiotherapy; IGEV, ifosfamide, gemcitabine, vinorelbine, prednisone; IPI, international prognostic index; IPS, international prognostic score; L, low; L-I, low intermediate; MCL, mantle cell lymphoma; MOPP-ABV, nitrogen mustard, vincristine, procarbazine, prednisone, doxorubicin, bleomycin, vinblastine; MZL, marginal zone lymphoma; NHL, non-Hodgkin's lymphoma; ND, no data; P, primary therapy; PSRHL, prognostic score for relapsed Hodgkin's lymphoma reported by Josting et al. [6]; pt, point; PTCL, peripheral T-cell lymphoma; S, salvage therapy; sAAIPI, age-adjusted international prognostic index at second-line therapy reported by Hamlin et al. [5]; Stanford V, vinblastine, doxorubicin, vincristine, bleomycin, nitrogen mustard, etoposide, prednisone; TBI, total body irradiation; tDLBCL, diffuse large B-cell lymphoma transformed from indolent B-cell lymphoma; TM, thiotepa, melphalan; VIM, etoposide, ifosfamide, methotrexate; VP16-CDDP, etoposide, cisplatin; Z, yttrium-90 ibritumomab tiuxetan.

Concerning imaging techniques and technologies, included studies generally followed the Society of Nuclear Medicine guidelines (Appendix Table 2) [34, 35]. Three studies exclusively used a PET/CT scanner [23, 24, 27], and another study used a hybrid PET/CT in a subgroup of patients [30]. All studies but one [32] employed attenuation correction to reconstruct imaging.

Studies adopted various definitions of qualitative positive and negative diagnostic criteria (Appendix Table 3). One study [26] adopted diagnostic criteria proposed for post-therapy response assessment [8], and none used criteria proposed for interim PET assessment [36]. Five studies also defined positive or negative lesions using the standard uptake value [22, 24, 27, 29, 33]. In general, multiple nuclear medicine physicians interpreted PET results in each study. No study reported the level of between-observer agreement.

Quality Assessment of Published Studies

No study reported all 15 quality items that we assessed (online Appendix Table 4). Reporting was especially poor on blinding of assessors to clinical outcomes (typically treating physicians), to the results of pre–high-dose therapy PET, and to whether treatment strategies were altered based on the PET results. Although six studies [22, 24, 27, 28, 31, 32] employed blinding of PET interpreters to clinical information, only two studies [22, 31] avoided the alteration of treatment based on PET results. Detailed results of quality assessment can be found in online Appendix Table 4.

Sensitivity, Specificity, LRs, and Summary ROC Curves

Visual assessment revealed substantial between-study heterogeneity (Figs. 2 and 3). PET sensitivity was in the range of 0.32–1.0 and specificity was in the range of 0.48–1.0. Summary estimates were 0.69 (95% CI, 0.56–0.81) for sensitivity and 0.81 (95% CI, 0.73–0.87) for specificity, for a positive LR of 3.6 and a negative LR of 0.38. The Q* statistic for the summary ROC curve was 0.87.
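As a check on the arithmetic, and assuming the standard definitions of the likelihood ratios, the reported LRs follow directly from the summary sensitivity and specificity:

```latex
\[
\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}
               = \frac{0.69}{1 - 0.81} \approx 3.6,
\qquad
\mathrm{LR}^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}
               = \frac{1 - 0.69}{0.81} \approx 0.38
\]
```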

Figure 2

Sensitivity and specificity of pre–high-dose therapy 18F-FDG PET for lymphoma. The size of the square plotting symbol is proportional to the sample size (the number of patients who progressed or relapsed for sensitivity and in remission for specificity) for each study. Horizontal lines are the 95% confidence intervals. Dashed vertical lines are the summary sensitivity and specificity.

Abbreviations: 18F-FDG PET, fluorine-18-fluorodeoxyglucose positron emission tomography.

Figure 3

ROC plotting and summary ROC curve of pre–high-dose therapy 18F-FDG PET for lymphoma. Individual study estimates of sensitivity and 1 − specificity are shown. The size of each circle is proportional to the sample size for each study (all study participants). The dashed crescent boundary represents the 95% confidence region for the summary sensitivity and specificity (shown as the square symbol).

Abbreviations: 18F-FDG PET, fluorine-18-fluorodeoxyglucose positron emission tomography; ROC, receiver operating characteristic.

Of seven studies that performed conventional restaging at the same time point as pre–high-dose therapy PET scans [22, 25, 26, 28, 30–33], four studies reported direct comparisons of conventional restaging with PET [22, 25, 28, 32]. One study did not report whether supplemental tests such as bone marrow biopsy were performed in addition to CT. Interpreters of PET results were blinded to conventional restaging results in three studies [22, 28, 32], whereas no study explicitly reported the blinding of assessors of conventional restaging to PET results. The summary ROC curve for PET (Q* = 0.93) stayed consistently above the curve for conventional restaging (Q* = 0.59) over the range where data points for these four studies were plotted (Fig. 4).

Figure 4

Comparison between pre–high-dose therapy 18F-FDG PET and conventional restaging. Individual study estimates of sensitivity and 1 − specificity are shown with a circular (PET) or square (conventional restaging) symbol. A solid (PET) or dashed (conventional restaging) line represents the summary receiver operating characteristic curve.

Abbreviation: 18F-FDG PET, fluorine-18-fluorodeoxyglucose positron emission tomography.

Prognostic accuracy was relatively stable across subgroups (online Appendix Table 5 and Fig. 5). Summary sensitivity and specificity did not significantly change when the analyses were restricted to specific histologies (i.e., HL or DLBCL), studies of salvage therapy, or studies that did not adopt MRU as a separate response criterion. We further explored between-study heterogeneity by metaregression analysis for the predefined study-level covariates. None of the predictors significantly influenced the prognostic accuracy (all p-values > .1).

Figure 5

Subgroup and sensitivity analyses. Squares represent the summary sensitivity or specificity estimates for each analysis. Horizontal lines represent 95% confidence intervals. Dashed vertical lines indicate the summary sensitivity and specificity for the main analysis (top row).

Abbreviations: DLBCL, diffuse large B-cell lymphoma; HL, Hodgkin's lymphoma; MRU, minimal residual uptake; NHL, non-Hodgkin's lymphoma.

Of three studies that had MRU with various definitions [22, 25, 31], only two reported such PET results in 20% and 46% of patients [25, 31]. In a sensitivity analysis, treating MRU as a “negative” result decreased the summary sensitivity to 0.63 (95% CI, 0.49–0.74) and increased the summary specificity to 0.85 (95% CI, 0.80–0.88) (Fig. 5). Eight studies reported early (<1 year) censorings, the proportion of which was in the range of 5%–17% of all cases [22–25, 27, 29, 31, 33]. The summary sensitivity and specificity were not materially different in the “best-case” and “worst-case” scenarios, or when censored cases were excluded (Fig. 5).

PFS

Eleven studies allowed the calculation of an HR for PFS [22–29, 31–33]. A positive 18F-FDG PET scan was significantly associated with a shorter PFS interval (random effects HR, 4.3; 95% CI, 3.1–6.0; p < .0001) (online Appendix Fig. 1). There was little evidence of between-study heterogeneity (I² = 14%).

Discussion

18F-FDG PET performed for patients with lymphoma before high-dose chemotherapy with stem cell transplantation has good accuracy for predicting progression or relapse in the first 2 years following the completion of therapy. The summary specificity estimated in the meta-analysis was approximately 80%, which was stable in sensitivity analyses, including a “worst-case” scenario biased against PET. Overall, patients with a positive pre–high-dose therapy PET scan appear to have a four- to fivefold higher risk for treatment failure than patients with a negative scan.

At present, the literature on pre–high-dose chemotherapy PET is clinically and methodologically heterogeneous. The significant statistical heterogeneity observed among the eligible studies could at least in part be attributable to the underlying clinical heterogeneity, because studies included patients with diverse histological subtypes in different treatment settings (i.e., first-line and second-line) and used many different regimens of salvage and high-dose chemotherapy. Other potential sources of heterogeneity may include temporal changes in treatment strategies (e.g., the introduction of rituximab-based chemoimmunotherapy), supportive care, imaging technologies (e.g., transition from stand-alone PET to PET/CT), and the diagnostic criteria for PET. We nevertheless attempted to synthesize these clinically heterogeneous studies to identify factors that may explain this heterogeneity. Although in subgroup and metaregression analyses these factors did not appear to explain the observed heterogeneity, these results should be interpreted with caution because of the low statistical power. Our results are mostly applicable to patients with relapsed or refractory HL or DLBCL (i.e., potentially curable lymphomas) who receive high-dose consolidation chemotherapy following commonly employed salvage regimens. We caution that our results may not be applicable to histological subtypes such as indolent lymphomas because the reported follow-up periods are too short to evaluate PFS for these histologies. In addition, PET assessment before high-dose therapy in the salvage setting is clinically less relevant for indolent lymphoma subtypes.

Interestingly, we found some evidence that 18F-FDG PET performed before consolidative high-dose chemotherapy may be a good replacement for CT-based conventional restaging for identifying patients most likely to fail in this invasive and costly treatment. In the head-to-head comparison between these two tests, 18F-FDG PET outperformed conventional restaging.

Our primary analysis, using a predictive accuracy framework, provides valuable information on error rates (i.e., FP and FN rates) and takes into account the variability in diagnostic thresholds. This approach has the benefit of generating clinically applicable information and facilitates the exploration of heterogeneity by subgroup and metaregression analyses. This approach is different from that used in a previously published review [37], which assessed only the HR. Furthermore, to account for censoring, we performed sensitivity analyses, and also used a time-to-event analysis to estimate a summary HR for PFS.

Several limitations need to be taken into account when interpreting our results. In view of the heterogeneity of chemotherapy regimens in the studies, we were not able to explore their effect on the prognostic accuracy of PET. This is particularly important for assessing the effect of rituximab-containing regimens, because rituximab is now considered part of standard therapy for most patients with B-cell lymphomas, and some studies suggest that the use of rituximab may increase the FP rate of PET [38, 39]. Also, we were not able to assess the incremental value of PET compared with conventional prognostic scores. Although several studies performed multivariate analyses to take such factors into account [23–25, 29–33], relevant data were typically not available from the publications. A patient-level meta-analysis with standardized outcomes and full information on covariates and censoring would be beneficial to achieve this goal [40].

Given the limitations of the existing literature, future studies are needed to confirm the promising estimates of the prognostic value of 18F-FDG PET. Prospective studies with larger sample sizes that focus on clinically and histologically more homogeneous populations (e.g., only relapsed DLBCL patients after rituximab-containing first-line therapies) and use standardized PET protocols and interpretation criteria are needed. Better standardization of diagnostic criteria with the involvement of well-trained assessors is particularly important given that inter-reader variability appears to be substantial, even among experts using the same criteria [41]. To this end, international collaborations of investigators, such as the International Workshop on Interim-PET Scan in Lymphoma, could provide guidance regarding the technical implementation of PET, the adoption of uniform response assessment criteria, and the availability of individual patient data to facilitate future evidence synthesis [36].

Conclusion

18F-FDG PET appears to be an appropriate test for the prediction of treatment outcomes in patients with refractory or relapsed lymphoma who receive high-dose chemotherapy after induction salvage chemotherapy. Given the methodological limitations of primary studies, prospective studies with standardized methodologies are needed to confirm and refine these promising results.

Acknowledgments

We thank Dr. Christopher Schmid for guidance in performing statistical analyses.

This study was supported in part by grant UL1RR025752 from the National Center for Research Resources to Tufts-Clinical Translational Science Institute (T.T. and I.J.D.), Banyu Life Science Foundation International (H19) to T.T., a research scholarship from the “Maria P. Lemos” Foundation to I.J.D., and a grant from the Ministry of Education, Culture, Sports, Science and Technology of Japan (No. 21791183) to T.N.

Author Contributions

Conception/Design: Teruhiko Terasawa, Issa J. Dahabreh, Takashi Nihashi

Collection and/or assembly of data: Teruhiko Terasawa, Issa J. Dahabreh, Takashi Nihashi

Data analysis and interpretation: Teruhiko Terasawa, Issa J. Dahabreh, Takashi Nihashi

Manuscript writing: Teruhiko Terasawa, Issa J. Dahabreh, Takashi Nihashi

Final approval of manuscript: Teruhiko Terasawa, Issa J. Dahabreh, Takashi Nihashi

References

1. Connors JM. State-of-the-art therapeutics: Hodgkin's lymphoma. J Clin Oncol 2005;23:6400–6408.

2. Coiffier B. State-of-the-art therapeutics: Diffuse large B-cell lymphoma. J Clin Oncol 2005;23:6387–6393.

3. Mendler JH, Friedberg JW. Salvage therapy in Hodgkin's lymphoma. The Oncologist 2009;14:425–432.

4. Sud R, Friedberg JW. Salvage therapy for relapsed or refractory diffuse large B-cell lymphoma: Impact of prior rituximab. Haematologica 2008;93:1776–1780.

5. Hamlin PA, Zelenetz AD, Kewalramani T et al. Age-adjusted International Prognostic Index predicts autologous stem cell transplantation outcome for patients with relapsed or primary refractory diffuse large B-cell lymphoma. Blood 2003;102:1989–1996.

6. Josting A, Franklin J, May M et al. New prognostic score based on treatment outcome of patients with relapsed Hodgkin's lymphoma registered in the database of the German Hodgkin's lymphoma study group. J Clin Oncol 2002;20:221–230.

7. Terasawa T, Nihashi T, Hotta T et al. 18F-FDG PET for posttherapy assessment of Hodgkin's disease and aggressive Non-Hodgkin's lymphoma: A systematic review. J Nucl Med 2008;49:13–21.

8. Juweid ME, Stroobants S, Hoekstra OS et al. Use of positron emission tomography for response assessment of lymphoma: Consensus of the Imaging Subcommittee of International Harmonization Project in Lymphoma. J Clin Oncol 2007;25:571–578.

9. Cheson BD, Pfistner B, Juweid ME et al. Revised response criteria for malignant lymphoma. J Clin Oncol 2007;25:579–586.

10. Isasi CR, Lu P, Blaufox MD. A metaanalysis of 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography in the staging and restaging of patients with lymphoma. Cancer 2005;104:1066–1074.

11. Terasawa T, Lau J, Bardet S et al. Fluorine-18-fluorodeoxyglucose positron emission tomography for interim response assessment of advanced-stage Hodgkin's lymphoma and diffuse large B-cell lymphoma: A systematic review. J Clin Oncol 2009;27:1906–1914.

12. Kasamon YL, Wahl RL. FDG PET and risk-adapted therapy in Hodgkin's and non-Hodgkin's lymphoma. Curr Opin Oncol 2008;20:206–219.

13. Whiting P, Rutjes AW, Reitsma JB et al. The development of QUADAS: A tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol 2003;3:25.

14. McShane LM, Altman DG, Sauerbrei W et al. Reporting recommendations for tumor marker prognostic studies (REMARK). J Natl Cancer Inst 2005;97:1180–1184.

15. Reitsma JB, Glas AS, Rutjes AW et al. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 2005;58:982–990.

16. Macaskill P. Empirical Bayes estimates generated in a hierarchical summary ROC analysis agreed closely with those of a full Bayesian analysis. J Clin Epidemiol 2004;57:925–932.

17. Harbord RM, Deeks JJ, Egger M et al. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 2007;8:239–251.

18. Zwinderman AH, Bossuyt PM. We should not pool diagnostic likelihood ratios in systematic reviews. Stat Med 2008;27:687–697.

19. Parmar MK, Torri V, Stewart L. Extracting summary statistics to perform meta-analyses of the published literature for survival endpoints. Stat Med 1998;17:2815–2834.

20. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177–188.

21. Higgins JP, Thompson SG, Deeks JJ et al. Measuring inconsistency in meta-analyses. BMJ 2003;327:557–560.

22. Cremerius U, Fabry U, Wildberger JE et al. Pre-transplant positron emission tomography (PET) using fluorine-18-fluoro-deoxyglucose (FDG) predicts outcome in patients treated with high-dose chemotherapy and autologous stem cell transplantation for non-Hodgkin's lymphoma. Bone Marrow Transplant 2002;30:103–111.

23. Crocchiolo R, Canevari C, Assanelli A et al. Pre-transplant 18FDG-PET predicts outcome in lymphoma patients treated with high-dose sequential chemotherapy followed by autologous stem cell transplantation. Leuk Lymphoma 2008;49:727–733.

24. Derenzini E, Musuraca G, Fanti S et al. Pretransplantation positron emission tomography scan is the main predictor of autologous stem cell transplantation outcome in aggressive B-cell non-Hodgkin lymphoma. Cancer 2008;113:2496–2503.

25. Becherer A, Mitterbauer M, Jaeger U et al. Positron emission tomography with [18F]2-fluoro-D-2-deoxyglucose (FDG-PET) predicts relapse of malignant lymphoma after high-dose therapy with stem cell transplantation. Leukemia 2002;16:260–267.

26. Castagna L, Bramanti S, Balzarotti M et al. Predictive value of early 18F-fluorodeoxyglucose positron emission tomography (FDG-PET) during salvage chemotherapy in relapsing/refractory Hodgkin lymphoma (HL) treated with high-dose chemotherapy. Br J Haematol 2009;145:369–372.

27. Filmont JE, Gisselbrecht C, Cuenca X et al. The impact of pre- and post-transplantation positron emission tomography using 18-fluorodeoxyglucose on poor-prognosis lymphoma patients undergoing autologous stem cell transplantation. Cancer 2007;110:1361–1369.

28. Filmont JE, Czernin J, Yap C et al. Value of F-18 fluorodeoxyglucose positron emission tomography for predicting the clinical outcome of patients with aggressive lymphoma prior to and after autologous stem-cell transplantation. Chest 2003;124:608–613.

29. Hoppe BS, Moskowitz CH, Zhang Z et al. The role of FDG-PET imaging and involved field radiotherapy in relapsed or refractory diffuse large B-cell lymphoma. Bone Marrow Transplant 2009;43:941–948.

30. Jabbour E, Hosing C, Ayers G et al. Pretransplant positive positron emission tomography/gallium scans predict poor outcome in patients with recurrent/refractory Hodgkin lymphoma. Cancer 2007;109:2481–2489.

31. Schot BW, Zijlstra JM, Sluiter WJ et al. Early FDG-PET assessment in combination with clinical risk scores determines prognosis in recurring lymphoma. Blood 2007;109:486–491.

32. Spaepen K, Stroobants S, Dupont P et al. Prognostic value of pretransplantation positron emission tomography using fluorine 18-fluorodeoxyglucose in patients with aggressive lymphoma treated with high-dose chemotherapy and stem cell transplantation. Blood 2003;102:53–59.

33. Svoboda J, Andreadis C, Elstrom R et al. Prognostic value of FDG-PET scan imaging in lymphoma patients undergoing autologous stem cell transplantation. Bone Marrow Transplant 2006;38:211–216.

34. Schelbert HR, Hoh CK, Royal HD et al. Society of Nuclear Medicine Procedure Guideline for Tumor Imaging Using F-18 FDG, Version 2.0. Reston, VA: Society of Nuclear Medicine, 1999:1–6. Available online at http://www.snm.org, accessed June 21, 2010.

35. Delbeke D, Coleman RE, Guiberteau MJ et al. Procedure Guideline for Tumor Imaging with 18F-FDG PET/CT 1.0. Reston, VA: Society of Nuclear Medicine, 2006:1–11. Available online at http://www.snm.org, accessed June 21, 2010.

36. Meignan M, Gallamini A, Haioun C. Report on the first international workshop on interim-PET scan in lymphoma. Leuk Lymphoma 2009;50:1257–1260.

37. Poulou LS, Thanos L, Ziakas PD. Unifying the predictive value of pretransplant FDG PET in patients with lymphoma: A review and meta-analysis of published trials. Eur J Nucl Med Mol Imaging 2010;37:156–162.

38. Han HS, Escalon MP, Hsiao B et al. High incidence of false-positive PET scans in patients with aggressive non-Hodgkin's lymphoma treated with rituximab-containing regimens. Ann Oncol 2009;20:309–318.

39. Moskowitz C, Hamlin P, Horwitz SM et al. Phase II trial of dose-dense R-CHOP followed by risk-adapted consolidation with either ICE or ICE and ASCT, based upon the results of biopsy confirmed abnormal interim restaging PET scan, improves outcome in patients with advanced stage DLBCL [abstract]. Blood 2006;108(suppl):532.

40. Riley RD, Sauerbrei W, Altman DG. Prognostic markers in cancer: The evolution of evidence from single studies to meta-analysis, and beyond. Br J Cancer 2009;100:1219–1229.

41. Horning SJ, Juweid ME, Schöder H et al. Interim positron emission tomography scans in diffuse large B-cell lymphoma: An independent expert nuclear medicine evaluation of the Eastern Cooperative Oncology Group E3404 study. Blood 2010;115:775–777.

42. Lister TA, Crowther D, Sutcliffe SB et al. Report of a committee convened to discuss the evaluation and staging of patients with Hodgkin's disease: Cotswolds meeting. J Clin Oncol 1989;7:1630–1636.

43. Cheson BD, Horning SJ, Coiffier B et al. Report of an international workshop to standardize response criteria for non-Hodgkin's lymphomas: NCI Sponsored International Working Group. J Clin Oncol 1999;17:1244–1253.

44. Mikhaeel G, Hutchings M, Fields PA et al. FDG-PET after two to three cycles of chemotherapy predicts progression-free and overall survival in high-grade non-Hodgkin lymphoma. Ann Oncol 2005;16:1514–1523.

45. Juweid ME, Wiseman GA, Vose JM et al. Response assessment of aggressive non-Hodgkin's lymphoma by integrated International Workshop Criteria and fluorine-18-fluorodeoxyglucose positron emission tomography. J Clin Oncol 2005;23:4652–4661.

Author notes

Disclosures: Teruhiko Terasawa: None; Issa J. Dahabreh: None; Takashi Nihashi: None.

The content of this article has been reviewed by independent peer reviewers to ensure that it is balanced, objective, and free from commercial bias. No financial relationships relevant to the content of this article have been disclosed by the authors or independent peer reviewers.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
