Abstract

Composite lymphoma (CL) is rare. We conducted an analysis of 53 329 cases of diffuse large B-cell lymphoma (DLBCL), 17 916 cases of Hodgkin lymphoma (HL), and 869 cases of composite HL and DLBCL from the SEER database diagnosed between 2000 and 2019. Incidence rates showed increasing trends with age for CL and DLBCL, while HL exhibited 2 peak incidence rates: 42.05 (95% CI: 40.88-43.25) per million for the age group 20-24 and 43.20 (95% CI: 41.13-45.35) per million for 75-79. Higher incidence rates were observed in males (CL, 0.68, 95% CI: 0.62-0.74; HL, 29.65, 95% CI: 29.27-30.03; DLBCL, 86.18, 95% CI: 85.51-86.86) compared to females (CL, 0.40, 95% CI: 0.36-0.45; HL, 23.15, 95% CI: 22.83-23.49; DLBCL, 57.56, 95% CI: 57.06-58.06; P < .001). We identified, for the first time, independent prognostic factors for composite HL and DLBCL and used them to develop a scoring nomogram. Factors such as primary tumor site, marital status, chemotherapy, and sex predominantly influence short-term survival, while Ann Arbor stage plays a significant role in long-term survival. Furthermore, there were notable differences in demographic characteristics, survival outcomes, and causes of death among CL, HL, and DLBCL. This study provides the first comprehensive report of composite HL and DLBCL.

Implications for practice

This study provides the first comprehensive report of composite HL and DLBCL. There were notable differences in demographic characteristics, survival outcomes, and causes of death among CL, HL, and DLBCL. We identified independent prognostic factors for composite HL and DLBCL and developed a scoring nomogram. The findings hold significant potential to inform clinicians in making well-informed clinical decisions, leading to enhanced patient outcomes within this rare and intricate subset of lymphomas.

Introduction

Composite lymphoma (CL) represents a rare phenomenon characterized by the concurrent presence of 2 distinct lymphoma subtypes within the same anatomical site.1 Specifically, CL with Hodgkin lymphoma (HL) and diffuse large B-cell lymphoma (DLBCL) components poses a unique challenge in the field of hematological malignancies.2 These cases often exhibit atypical features, underscoring the importance of accurate diagnosis and appropriate therapeutic strategies.3 Furthermore, CL cases involving HL and DLBCL may demonstrate distinct clinical behaviors and treatment responses compared to HL or DLBCL alone. Consequently, a comprehensive understanding of the epidemiological and clinical characteristics of composite HL and DLBCL is vital to optimize patient management and treatment outcomes.

To date, research focusing on CL with HL and DLBCL remains scarce, with the majority of studies being case reports (Supplementary Table S1). The concept of CL was initially proposed by Custer and elucidated by Hicks.4 It is important to note that CL with HL and DLBCL is characterized by clear and distinct evidence of both HL and DLBCL components within the same case. CL is related to, but not synonymous with, gray zone lymphomas, which represent a category of lymphomas displaying overlapping features of HL and DLBCL. Subsequently, Kim refined the understanding of this phenomenon.5 Remarkable cases of CL have been reported, such as the presence of different components derived from the same clone within a single tumor mass, exhibiting different morphological, immunophenotypic, and Epstein-Barr virus characteristics of both HL and DLBCL in a 56-year-old male patient, as reported by Huang et al.6 Goyal et al also found that in the majority of cases, the DLBCL, not otherwise specified, and HL components in CL are clonally related, indicating a shared origin from a common B-cell precursor.2 Furthermore, Tao et al analyzed 25 cases of CL and sequential lymphoma between primary mediastinal lymphoma/diffuse large B-cell lymphoma (LBCL) and HL, revealing significantly poorer outcomes in patients with LBCL-HL compared to de novo HL.7 However, the current literature on CL with HL and DLBCL is limited and incomplete, necessitating a thorough investigation of its incidence, prevalence, mortality, and prognosis.

Therefore, this study represents the largest investigation to date that aims to comprehensively explore the epidemiological and clinical outcomes of patients with composite HL and DLBCL. The anticipated findings hold significant potential to inform clinicians in making well-informed clinical decisions, leading to enhanced patient outcomes within this rare and intricate subset of lymphomas.

Methods

Data source

Patients’ data were obtained from cancer records encompassing 9 registries in the US. Specifically, data from the years 2000 to 2019 were included in the analysis. The cancer data utilized in this research were derived from the Surveillance, Epidemiology, and End Results (SEER) database, which was developed by the National Cancer Institute (https://seer.cancer.gov) to address the increasing burden of cancer. The included registries in this study were Connecticut, Detroit, Atlanta, San Francisco-Oakland, Hawaii, Iowa, New Mexico, Seattle-Puget Sound, and Utah. The SEER database provides a comprehensive compilation of cancer-related statistics, capturing key epidemiological aspects such as incidence, prevalence, and mortality. Additionally, it includes detailed clinical characteristics such as cancer demographics, initial treatment courses, and follow-up information regarding survival status and recorded instances of death events. Notably, the SEER database covers approximately one-third of the entire US cancer population, ensuring a robust representation of cancer-related information.

Patient selection

Patients diagnosed with composite HL and DLBCL, as well as those with HL or DLBCL alone, were identified using the 3rd edition of the International Classification of Diseases for Oncology (ICD-O-3). To determine the composite diagnosis, cases were identified where HL and DLBCL were diagnosed simultaneously or within a short timeframe. Comprehensive demographic and clinical data of the patients were collected, including age at diagnosis, sex, year of diagnosis, race, ethnicity, marital status, median household income, residential area, diagnostic-to-treatment delay, primary site, B symptoms, Ann Arbor stage, SEER stage, surgery, chemotherapy, and radiotherapy.

Statistical analysis

The patients’ data extraction, including clinical characteristics and follow-up information, as well as the calculation of epidemiologic rates were carried out using SEER*Stat version 8.4.1 software (https://seer.cancer.gov/seerstat). The statistical analyses were conducted using IBM SPSS version 26.0 (Armonk) and R software version 4.1.2 (https://www.r-project.org).

The age-adjusted rates (AAR) with 95% confidence intervals (CI) were calculated using the 2000 US standard population as a reference. Overall survival (OS) time was calculated from the date of initial diagnosis until the date of last follow-up or death from any cause. The OS-associated factors were identified using univariable and multivariable Cox proportional hazards regression analyses, which yielded hazard ratios (HR) and corresponding 95% CI. Subsequently, a nomogram model was developed based on these factors to predict 1-year, 5-year, and 10-year OS probabilities. To ensure the robustness of the nomogram, patients were randomly divided into a training cohort and a validation cohort in a 7:3 ratio; internal validation was conducted in the training cohort, while external validation was performed in the validation cohort. Baseline characteristics between the training and validation cohorts were compared using the χ2 test. The accuracy of the model was evaluated using the area under the receiver operating characteristic curve (AUC), and the calibration curve was plotted to compare the predicted outcomes of the nomogram with actual survival.
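
To illustrate the age adjustment described above, the following is a minimal sketch of direct standardization, in which each age-specific rate is weighted by that age group's share of a standard population. The function name `age_adjusted_rate` and the three-group inputs are hypothetical and are not SEER data; SEER*Stat additionally reports confidence intervals, which this sketch does not attempt to reproduce.

```python
def age_adjusted_rate(cases, person_years, standard_pop, per=1_000_000):
    """Directly standardized rate: sum over age groups of weight * (cases / person-years)."""
    if not (len(cases) == len(person_years) == len(standard_pop)):
        raise ValueError("all inputs must have one entry per age group")
    total_std = sum(standard_pop)
    rate = 0.0
    for d, n, s in zip(cases, person_years, standard_pop):
        # Weight of the age group = its share of the standard population.
        rate += (s / total_std) * (d / n)
    return rate * per


# Hypothetical three-age-group illustration (not SEER data).
cases = [2, 10, 30]
person_years = [1_000_000, 1_000_000, 500_000]
standard_pop = [500, 300, 200]  # stand-in for the 2000 US standard population weights

aar = age_adjusted_rate(cases, person_years, standard_pop)
print(f"Age-adjusted rate: {aar:.2f} per million")
```

With these made-up inputs the crude age-specific rates are 2, 10, and 60 per million, and the weighted result is about 16 per million.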

To predict the long-term risk of cumulative lymphoma-specific mortalities, 6 classical machine learning (ML) algorithms were employed, including extreme gradient boosting (XGB), random forest classifier (RFC), adaptive boosting (ADB), K nearest neighbor (KNN), artificial neural network, and gradient boosting decision tree. A comprehensive approach was taken to select the optimal combination of variables for each algorithm. Initially, over a dozen variables were individually evaluated by running them through the models to assess their performance and predictive capability, measured by the area under the ROC curve (AUC); decision curve analysis (DCA) was also conducted. The best-performing variables were identified and subsequently combined with additional variables, with the process repeated until the optimal combination yielding the best overall results was determined.
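
The stepwise variable selection described above resembles greedy forward selection. The sketch below shows one way such a loop could be written; it is not the authors' implementation. The helper `forward_select` and the scoring function `toy_score` are hypothetical, and `toy_score` merely stands in for a cross-validated AUC obtained by fitting a model on the chosen variables.

```python
def forward_select(candidates, score):
    """Greedy forward selection: repeatedly add the variable that most improves score(subset)."""
    selected = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-variable extension of the current subset.
        trial_scores = [(score(selected + [v]), v) for v in remaining]
        top_score, top_var = max(trial_scores)
        if top_score <= best:
            break  # no remaining variable improves the score
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected, best


def toy_score(subset):
    """Hypothetical stand-in for a cross-validated AUC, with a small penalty per variable."""
    gains = {"age": 0.20, "stage": 0.10, "income": 0.05}
    return 0.5 + sum(gains.get(v, 0.0) for v in subset) - 0.02 * len(subset)


selected, best = forward_select(["age", "stage", "income", "sex"], toy_score)
print(selected, round(best, 2))
```

In this toy setting the loop stops before adding `sex`, because that variable lowers the penalized score. In practice the score would come from fitting and validating each of the 6 algorithms.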

Results

Incidence rate

Among the identified patients, there were 869 cases of composite HL and DLBCL, 53 329 cases of DLBCL, and 17 916 cases of HL. The flow diagram outlining the study is presented in Figure 1. The incidence rates are shown in Figure 2. The incidence rate of CL demonstrated an increasing trend with age, reaching 1.18 (95% CI: 0.95-1.44) per million for the age group 60-64, 2.83 (95% CI: 2.32-3.42) per million for 75-79, and 3.29 (95% CI: 2.64-4.05) per million for individuals aged 85 and above. HL exhibited 2 peak incidence rates: 42.05 (95% CI: 40.88-43.25) per million for the age group 20-24 and 43.20 (95% CI: 41.13-45.35) per million for 75-79. The incidence rate of DLBCL was higher and showed an increasing trend with age, with 454.15 (95% CI: 446.24-462.16) per million for the age group 80-84. Regarding patients with CL, the AARs for Non-Spanish Hispanic Latino and Spanish Hispanic Latino were 0.54 (95% CI: 0.50-0.58) and 0.48 (95% CI: 0.39-0.59), respectively (P = .342). HL had a higher incidence rate among Non-Spanish Hispanic Latinos (P < .001), while DLBCL had a higher incidence rate among Spanish Hispanic Latinos (P < .001). In terms of race, the overall AARs for White were the highest among the 3 lymphoma subtypes (CL, 0.56, 95% CI: 0.52-0.60; HL, 28.11, 95% CI: 27.82-28.41; DLBCL, 73.47, 95% CI: 73.01-73.94), and the lowest were observed among American Indian/Alaska Native (AI/AN) individuals (CL, 0.19, 95% CI: 0.03-0.55; HL, 9.66, 95% CI: 8.32-11.15; DLBCL, 40.20, 95% CI: 37.01-43.57). Males exhibited higher overall AARs (CL, 0.68, 95% CI: 0.62-0.74; HL, 29.65, 95% CI: 29.27-30.03; DLBCL, 86.18, 95% CI: 85.51-86.86) compared to females (CL, 0.40, 95% CI: 0.36-0.45; HL, 23.15, 95% CI: 22.83-23.49; DLBCL, 57.56, 95% CI: 57.06-58.06; P < .001).

Figure 1.

Flowchart for comprehensive analysis of composite HL and DLBCL.

Figure 2.

Incidence rates (per million) with 95% CI of HL, DLBCL, and composite HL and DLBCL by age at diagnosis, ethnicity, race, and sex.

Patient characteristics

The study identified patients diagnosed with CL, HL, and DLBCL between 2000 and 2019 from the SEER database. Univariate analysis and multivariate logistic regression were conducted to examine the differences between patients with HL for each variable (Supplementary Table S2). The analysis revealed several independent prognostic factors, including age, race, marital status, median income, primary site, and the use of radiotherapy and chemotherapy. Patients aged 65 years or older at the time of diagnosis exhibited a higher risk of death compared to younger patients (HR = 3.64, 95% CI: 2.40-5.52, P < .001). Marital status was also found to be a significant factor, with unmarried individuals demonstrating a survival disadvantage (HR = 1.33, 95% CI: 1.06-1.67, P = .014). Moreover, patients with higher median income levels experienced a significant reduction in mortality risk (median income = $50 000-$70 000, HR = 0.70, 95% CI: 0.53-0.93, P = .013; median income > $70 000, HR = 0.59, 95% CI: 0.44-0.80, P < .001). The primary site of involvement played a crucial role in describing CL. Patients with involvement of multiple regions had a higher hazard of death compared to those with involvement of a single region (HR = 1.44, 95% CI: 1.10-1.87, P = .007). Furthermore, the use of radiotherapy or chemotherapy was associated with a lower risk of mortality (radiotherapy, HR = 0.54, 95% CI: 0.36-0.82, P = .004; chemotherapy, HR = 0.62, 95% CI: 0.49-0.79, P < .001). However, no significant difference in OS was observed between patients who underwent surgery and those who did not.
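
For readers less familiar with how a hazard ratio and its interval relate to a regression coefficient, the sketch below assumes the usual Wald interval on the log scale, HR = exp(β) with limits exp(β ± 1.96 × SE). The source does not state which interval method was used, so this is an assumption, and the function names are hypothetical. The example recovers the implied standard error from the reported estimate for age 65 years or older and checks that it reproduces the published interval.

```python
import math


def hazard_ratio_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a log-hazard coefficient beta with standard error se."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Recover the log-scale standard error implied by a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported estimate for age >= 65 years: HR = 3.64 (95% CI: 2.40-5.52).
beta = math.log(3.64)
se = se_from_ci(2.40, 5.52)
hr, lower, upper = hazard_ratio_ci(beta, se)
print(f"HR = {hr:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

The reported point estimate sits almost exactly at the geometric midpoint of its interval, which is what a log-scale Wald interval would produce.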

Survival prediction

A total of 869 patients diagnosed with composite HL and DLBCL were randomly allocated into a training cohort and a validation cohort in a 7:3 ratio. The demographic characteristics of these 2 cohorts were comparable and showed no significant differences (Supplementary Table S3). Using a multivariable Cox regression model, independent prognostic factors were identified, and based on these factors, a prediction nomogram was constructed in the training cohort. The variables included in the nomogram encompassed age, race, primary site, radiotherapy, chemotherapy, marital status, and median income (Figure 3A). To evaluate the performance and reliability of the nomogram, both internal and external validations were conducted. Calibration curves were plotted to assess the agreement between the predicted outcomes from the nomogram and the actual OS at 1-year, 5-year, and 10-year time points. Remarkably, the calibration curves demonstrated excellent consistency in both the training and validation cohorts, indicating accurate prediction (Figure 4). Furthermore, a visual representation was created to illustrate the relationship between the nomogram scores assigned to all patients and their corresponding survival time (Figure 3B). Notably, higher nomogram scores were associated with significantly worse survival outcomes (Figure 3C).
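
A reproducible 7:3 random allocation can be sketched as follows. The function `split_cohort`, the seed value, and the rule of truncating the training size to an integer are all assumptions made for illustration; the source does not report the seed or the exact cohort sizes.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2024):
    """Shuffle patient identifiers with a fixed seed and cut them into two cohorts."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # a local generator keeps the split reproducible
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


# Hypothetical identifiers 0..868 standing in for the 869 CL patients.
training, validation = split_cohort(range(869))
print(len(training), len(validation))
```

Under this rounding rule, 869 identifiers are divided into 608 and 261. A fixed seed makes the allocation repeatable, which matters when baseline characteristics of the two cohorts are later compared.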

Figure 3.

The nomogram to predict 1-year, 5-year, and 10-year survival probabilities among patients with composite HL and DLBCL. A. Quantitative nomogram to predict survival probabilities according to the total points based on age, race, primary site, radiotherapy, chemotherapy, marital status, and median income; B. Relationship between nomogram scores and survival time of each CL patient in the training cohort and validation cohort, respectively; C. Kaplan-Meier survival curves for CL patients grouped by the median nomogram score in the training cohort and validation cohort, respectively.

Figure 4.

Calibration curves of the nomogram for 1-year, 5-year, and 10-year overall survival in the training cohort (A) and the validation cohort (B).

Survival analysis

The survival curves and comparisons related to OS were analyzed among different subgroups based on age, primary site, radiotherapy, chemotherapy, marital status, and median income. Notably, age emerged as a significant factor influencing OS among patients with CL. Specifically, patients aged 65 years and above exhibited poorer survival outcomes compared to their younger counterparts (P < .001) (Figure 5A). While patients with involvement of multiple regions had lower survival rates compared to those with single region involvement, the difference was not statistically significant (P = .34) (Figure 5B). The use of radiotherapy or chemotherapy demonstrated evident survival benefits (radiotherapy, P < .001; chemotherapy, P < .001) (Figure 5C, 5D). Interestingly, our analysis revealed a consistent advantage in OS for married individuals compared to those who were single (P = .015) (Figure 5E). Furthermore, a positive correlation between median income and survival was observed, with the highest survival benefit observed in the median income > $70 000 group (P < .001) (Figure 5F).
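
The subgroup curves referred to above are Kaplan-Meier estimates. As a minimal sketch of the product-limit calculation, the function below multiplies, at each distinct death time, the running survival probability by (1 − deaths / number at risk). The function name and the six-patient follow-up data are hypothetical, and the log-rank comparison behind the reported P values is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time; events[i] is 1 for death, 0 for censoring."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Hypothetical follow-up in years for six patients (not study data).
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate drops only at times 1, 2, 3, and 5 in this toy example.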

Figure 5.

Kaplan-Meier estimate of overall survival by subgroup analysis. A. Age; B. Primary site; C. Radiotherapy; D. Chemotherapy; E. Marital status; F. Median income.

Machine learning-based lymphoma-specific death risk prediction

The ML process is presented in Figure 6A. These ML models exhibited exceptional performance, as evidenced by high values of the AUCs, showcasing the superiority of artificial intelligence in prognostic prediction (Figure 6B). The DCA curves are shown in Figure 6C. Furthermore, we evaluated the sensitivity and specificity of each ML model using the maximal Youden index, which represents an optimal balance between true positives and true negatives (Supplementary Table S4). Through 5-fold cross-validation, the XGB, RFC, and ADB models demonstrated superior performance. To gain deeper insights into the association between demographic characteristics and long-term outcomes in patients with CL, we employed these ML algorithms to develop predictive models for assessing the 1-year, 5-year, and 10-year risk of cumulative lymphoma-specific mortalities based on the aforementioned variables. Thus, the contribution of each variable was calculated (Figure 7A). Importantly, we identified the variables associated with the risk of cumulative lymphoma-specific mortalities at different time intervals (Figure 7B). Primary tumor site, marital status, chemotherapy, and sex were found to primarily influence the 1-year OS, whereas their impact on the 5-year and 10-year OS rates was relatively minor. On the other hand, Ann Arbor stage played a significant role in contributing to the 5-year and 10-year OS rates, while demonstrating minimal effect on the 1-year OS rate. B-symptom presence and radiotherapy exhibited a substantial influence on the 10-year OS rate. Age, SEER stage, and median income were found to be associated with OS rates across all the time intervals.
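
The two performance measures used above can be written out compactly. The sketch below computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and finds the threshold that maximizes the Youden index J = sensitivity + specificity − 1. Function names and the six labelled scores are hypothetical; the authors' models and predictions are not reproduced.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(y_true, scores):
    """Return (J, threshold, sensitivity, specificity) at the maximal Youden index."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = (-1.0, None, None, None)
    for thr in sorted(set(scores)):
        # Predict positive when the score is at or above the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= thr)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < thr)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if j > best[0]:
            best = (j, thr, sens, spec)
    return best


# Hypothetical outcome labels (1 = lymphoma-specific death) and predicted risks.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
auc = roc_auc(y_true, scores)
j, thr, sens, spec = youden_threshold(y_true, scores)
print(round(auc, 3), thr, round(j, 3))
```

In this toy example 7 of the 9 positive-negative pairs are ordered correctly, and the Youden index peaks at a threshold of 0.35, where sensitivity is 1 and specificity is 2/3.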

Figure 6.

Machine learning models for risk prediction of long-term cumulative lymphoma-specific mortalities in patients with composite HL and DLBCL. A. Flowchart of machine learning development process; B. Receiver operating characteristic curves for all models; C. Decision curve analysis for 6 classical machine learning-based models.

Figure 7.

Feature contribution in subgroups (A and B) and cumulative lymphoma-specific mortalities in subgroup analysis by age-adjusted competing-risk analysis (C-P).

Age-adjusted competing-risk analysis

To further elucidate the cumulative incidence associated with each variable, we conducted competing-risk analysis (Figure 7C-7P). Our findings revealed that patients with multiple regions involved demonstrated a significantly elevated risk of cumulative lymphoma-specific mortality (HR = 1.51, 95% CI: 1.10-2.07, P = .012). Notably, patients with distant SEER stages exhibited a substantial increase in the risk of cumulative lymphoma-specific mortality (HR = 1.23, 95% CI: 1.02-1.47, P = .027). Furthermore, our analysis revealed that radiotherapy was associated with a reduced incidence of cumulative lymphoma-specific mortality (HR = 0.53, 95% CI: 0.32-0.88, P = .015). However, neither chemotherapy nor surgery was associated with a significant difference in the incidence of cumulative lymphoma-specific mortality (chemotherapy, HR = 1.01, 95% CI: 0.76-1.36, P = .92; surgery, HR = 0.97, 95% CI: 0.69-1.36, P = .85).
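
Cumulative lymphoma-specific mortality in the presence of competing causes of death is usually summarized by a cumulative incidence function rather than by 1 minus a Kaplan-Meier estimate. The sketch below shows the nonparametric estimator only: at each event time, the increment is the all-cause survival just before that time multiplied by the cause-specific deaths over the number at risk. It does not reproduce the regression behind the reported HRs, whose exact specification the source does not give. The function name and the five-patient data are hypothetical.

```python
def cumulative_incidence(times, causes, cause=1):
    """Nonparametric cumulative incidence for one cause.

    causes[i]: 0 = censored, 1 = event of interest, 2 = competing event.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    surv_prev = 1.0  # all-cause survival just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = removed = 0
        while i < len(data) and data[i][0] == t:
            c = data[i][1]
            d_cause += (c == cause)
            d_any += (c != 0)
            removed += 1
            i += 1
        if d_any:
            cif += surv_prev * d_cause / at_risk
            surv_prev *= 1 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= removed
    return curve


# Hypothetical data: 1 = lymphoma-specific death, 2 = death from another cause, 0 = censored.
times = [1, 2, 3, 4, 5]
causes = [1, 2, 0, 1, 2]
cif_lymphoma = cumulative_incidence(times, causes, cause=1)
cif_other = cumulative_incidence(times, causes, cause=2)
print(cif_lymphoma[-1], cif_other[-1])
```

A useful sanity check is that the cause-specific cumulative incidences add up to 1 minus the all-cause survival; in this toy example both causes end at 0.5 and all patients at risk have died by year 5.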

Comparison of CL to HL and DLBCL

Supplementary Table S5 presents the demographic and clinical characteristics of patients with CL, HL, and DLBCL, demonstrating significant differences among these lymphoma subtypes. The incidence of CL and DLBCL increases with age, whereas HL is predominantly observed in individuals under the age of 44 (56.5%), with a decreasing incidence as age increases (P < .001). All 3 lymphomas exhibit a slightly higher prevalence in males compared to females. Notably, CL diagnoses are concentrated in recent years, with a limited number of cases before 2010, while HL and DLBCL, being more common lymphomas, show a more uniform distribution across different years (P < .001). The primary tumor site for CL is predominantly single region involved (50.2%), whereas HL mainly presents with involvement of multiple regions (57.1%) (P < .001). Over half of the CL cases have an unknown Ann Arbor stage (50.4%), while HL and DLBCL cases are more likely to have staging information available (P < .001). Additionally, the proportion of patients receiving surgical, radiotherapy, and chemotherapy treatments is significantly lower in CL compared to HL and DLBCL (P < .001).

Survival analysis demonstrates distinct outcomes among different subtypes of lymphoma (Figure 8A). To ensure a fair comparison, case-control matching was performed based on 8 baseline characteristics, including age, sex, year of diagnosis, race, ethnicity, marital status, median income, and Ann Arbor stage. The matching process resulted in 829 cases being completely matched in the DLBCL population to CL and 744 cases completely matched in the HL population to CL. Prior to case-control matching, HL exhibits the highest OS, followed by CL (P < .001), while DLBCL has the lowest OS (P < .001). After matching CL and DLBCL, short-term survival is lower in CL compared to DLBCL, but long-term survival is higher in CL than DLBCL (P = .034). When CL and HL are matched, there is no statistically significant difference in OS (P = .99). Furthermore, the incidence of cumulative lymphoma-specific mortalities is lower in CL compared to DLBCL (HR = 0.81, 95% CI: 0.70-0.93, P = .002), while no significant difference is observed between CL and HL in terms of cumulative lymphoma-specific mortalities (HR = 1.12, 95% CI: 0.96-1.31, P = .14) (Figure 8B). The majority of deaths among DLBCL and CL patients were attributed to “non-Hodgkin lymphoma” (DLBCL: 58.7%, CL: 52.9%), particularly within the first year (DLBCL: 71.9%, CL: 63.6%) and 1-5 years (DLBCL: 56.5%, CL: 37.4%). However, the proportion notably decreased in the 5-10 years (DLBCL: 26.6%, CL: 28.6%) and beyond 10 years (DLBCL: 16.0%, CL: 0%). “Hodgkin lymphoma” was the primary cause of death among HL patients (37.0%). Moreover, patients who survived beyond 5 years showed a higher susceptibility to developing heart diseases, cerebrovascular diseases, chronic obstructive pulmonary disease, and lung and bronchus diseases (Figure 8C). Additionally, the 1-year, 3-year, and 5-year overall survival probability remained relatively stable over time for all 3 lymphoma subtypes. However, the 1-year, 3-year, and 5-year lymphoma-specific survival probability showed an increasing trend, indicating ongoing advancements in research and treatment for these patient populations (Figure 8D).
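
The case-control matching described in this section can be pictured as 1:1 exact matching on categorical baseline characteristics. The sketch below is one possible reading, not the authors' procedure: the function `exact_match`, the field names, and the toy records are hypothetical, and only 3 of the 8 matching characteristics are used to keep the example short.

```python
from collections import defaultdict


def exact_match(cases, controls, keys):
    """Pair each case with one unused control that agrees on every key; unmatched cases are dropped."""
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)
    pairs = []
    for case in cases:
        bucket = pool.get(tuple(case[k] for k in keys))
        if bucket:
            pairs.append((case, bucket.pop()))  # each control is used at most once
    return pairs


# Hypothetical records (not study data).
keys = ["age_group", "sex", "stage"]
cl_cases = [
    {"id": "CL1", "age_group": "65+", "sex": "M", "stage": "IV"},
    {"id": "CL2", "age_group": "<65", "sex": "F", "stage": "II"},
    {"id": "CL3", "age_group": "65+", "sex": "F", "stage": "I"},
]
dlbcl_controls = [
    {"id": "D1", "age_group": "65+", "sex": "M", "stage": "IV"},
    {"id": "D2", "age_group": "<65", "sex": "F", "stage": "II"},
    {"id": "D3", "age_group": "<65", "sex": "F", "stage": "II"},
]
pairs = exact_match(cl_cases, dlbcl_controls, keys)
print([(case["id"], control["id"]) for case, control in pairs])
```

Cases without an identical control are left unmatched, which is consistent with fewer than 869 CL cases being completely matched in each comparison.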

Figure 8.

Comparison of patients with CL, HL, and DLBCL. A. Kaplan-Meier estimate of OS in patients with CL, HL, and DLBCL by case-control matching; B. Cumulative lymphoma-specific mortalities as survival years after lymphoma diagnosis by age-adjusted competing-risk analysis; C. Causes of death distribution in patients with CL, HL, and DLBCL; D. 1-year, 5-year, and 10-year survival probability of CL, HL, and DLBCL over the year of diagnosis.

Discussion

In this population-based study, we conducted the first comprehensive analysis of patients with composite HL and DLBCL, utilizing the largest sample of cancer patients obtained from the SEER database. We focused on CL, a notably rare and prognostically challenging condition, which has predominantly been the subject of case reports due to its rarity. Our aim was to investigate the epidemiological and clinical characteristics of these patients and develop long-term outcome prediction models using ML techniques. Although CL is rare, its study holds crucial clinical and research significance: it can refine diagnosis, enhance risk assessment, and guide personalized treatment strategies, ultimately improving patient outcomes.

First, our study provides the first exhaustive epidemiological and clinical profile of CLs, particularly focusing on HL and DLBCL. This baseline is crucial for future research design, particularly in highlighting areas that need further investigation such as demographic factors, survival outcomes, and unique prognostic markers. It sheds light on the incidence rates and demographic disparities associated with HL, DLBCL, and CL. The analysis reveals an increasing trend of CL and DLBCL incidence with age, particularly in the elderly population. This finding underscores the need for heightened awareness and targeted screening strategies for older individuals. Moreover, the study identifies independent prognostic factors for composite HL and DLBCL. By leveraging the extensive SEER database, the analysis reveals several factors that significantly impact patient outcomes, including age, race, marital status, median income, primary site involvement, and the use of radiotherapy and chemotherapy. Additionally, married individuals and those with higher median income levels exhibit better survival outcomes, suggesting the potential influence of social support and socioeconomic factors on prognosis.8 The primary site of involvement is a crucial determinant of survival, with multiple region involvement associated with an increased risk of death. Moreover, the use of radiotherapy and chemotherapy is linked to improved survival, emphasizing the importance of multimodal treatment approaches in CL management.9,10

Second, by leveraging advanced machine learning techniques, we have not only identified independent prognostic factors specific to composite HL and DLBCL but also developed a scoring nomogram based on these factors. Future prospective studies can utilize this nomogram to validate and refine its application across different populations and treatment settings, thereby enhancing its predictive accuracy and clinical utility. The development of long-term outcome prediction models using ML techniques represents a significant advancement in this field.11,12 Incorporating such models into routine clinical practice may help optimize treatment strategies and improve patient outcomes in CL. The findings also provide valuable insights into the factors that influence short-term (1-year) and long-term (5-year and 10-year) survival outcomes. Factors such as primary tumor site, marital status, chemotherapy, and sex predominantly influence short-term survival, while Ann Arbor stage plays a significant role in long-term survival. Age, SEER stage, and median income emerge as consistent predictors of overall survival across all time intervals. These findings contribute to a better understanding of the factors affecting patient outcomes and can guide clinicians in prognostic assessment and treatment decision-making.

Comparisons between CL, HL, and DLBCL provide further insights into the unique characteristics of CL. CL exhibited an increased incidence with advancing age, while HL predominantly affected younger individuals. This age distribution difference suggests distinct pathogenic mechanisms and risk factors for these lymphomas. The predominance of male patients in all 3 subtypes aligns with previous observations in the literature.7,13,14 The temporal analysis highlighted a recent trend in CL diagnoses, with a scarcity of cases before 2010. In contrast, HL and DLBCL demonstrated a more consistent distribution across different years. This temporal pattern may reflect advancements in diagnostic techniques and the evolving classification and recognition of CL as a distinct entity.15 The varying primary tumor sites observed in CL, HL, and DLBCL further emphasize the heterogeneity and diverse origins of these lymphomas.5 Staging plays a crucial role in predicting prognosis and guiding treatment decisions. In this study, a notable proportion of CL cases had an unknown Ann Arbor stage compared to HL and DLBCL. This finding suggests that further efforts are needed to improve the accurate staging of CL, which may lead to better risk stratification and treatment planning.

The analysis of survival outcomes revealed distinct patterns among the lymphoma subtypes. HL exhibited the highest overall survival rates, followed by CL, while DLBCL had the lowest OS. These results align with the known prognosis of HL and DLBCL.7,16,17 The higher OS among HL patients could be attributed to several factors, including the generally favorable response of HL to treatment modalities. Regarding the rapid decline in OS among CL patients in the early period, this may be linked to the aggressive nature of CL. Furthermore, the incidence of cumulative lymphoma-specific mortalities was significantly lower in CL compared to DLBCL, indicating potentially different underlying mechanisms and biological behavior between these subtypes. However, there was no statistically significant difference in cumulative lymphoma-specific mortalities between CL and HL. After conducting case-control matching based on baseline characteristics, we found that DLBCL had better short-term survival compared to CL, but CL exhibited better long-term survival compared to DLBCL. The rapid decline in OS among CL patients in the early period may be attributed to the fact that CL originates from lymphocyte differentiation at an earlier stage, indicating a more aggressive nature. On the other hand, DLBCL patients experienced a more rapid decline in survival, potentially due to the presence of a significant number of double-expressor DLBCL cases that were misclassified as conventional DLBCL without undergoing confirmatory FISH testing. These patients may have a poor response to standard treatment, leading to shorter survival. These findings underscore the importance of accurate diagnosis and subtyping of different lymphoma subtypes.

We also observed distinct patterns in causes of death and long-term survival outcomes among different lymphoma subtypes. Non-Hodgkin lymphoma accounted for the majority of deaths among both DLBCL and CL patients. The highest proportion of these deaths occurred within the first year and 1-5 years of diagnosis, suggesting the aggressive nature of these subtypes during the early stages. However, it is noteworthy that the proportion of deaths due to non-Hodgkin lymphoma notably decreased in the 5-10 years and beyond 10 years, indicating a potential shift in the cause of mortality over time. Among HL patients, HL was the primary cause of death, highlighting the distinct nature of this subtype. The survival outcomes of HL patients differed from those of DLBCL and CL patients, emphasizing the importance of subtype-specific considerations in prognosis and treatment strategies. Interestingly, our findings revealed that patients who survived beyond 5 years were more susceptible to developing various comorbidities, such as heart diseases, cerebrovascular diseases, chronic obstructive pulmonary disease, and lung and bronchus diseases. This highlights the importance of long-term follow-up and comprehensive management strategies for lymphoma survivors. The overall survival probabilities at 1-year, 3-year, and 5-year intervals remained relatively stable over time for all 3 lymphoma subtypes. However, the lymphoma-specific survival probabilities demonstrated an increasing trend, indicating advancements in research and treatment modalities specific to lymphoma. These findings reflect the continuous efforts in improving survival rates and outcomes for lymphoma patients.

This study has several limitations worth considering. First, the retrospective nature of the study and its reliance on the SEER database may introduce inherent biases and incomplete data. Second, the analysis is limited to the variables available in the database, so other potential prognostic factors, including detailed information on treatment regimens, radiotherapy doses, and disease status prior to each treatment, could not be considered. A more comprehensive analysis including these variables would enhance the relevance and utility of the results. Lastly, the study did not explore the molecular characteristics and genetic profiles of CL, which could provide further insights into its distinct nature. Because CL is rare, comprehensive genomic data specific to this condition are not readily available, and established databases such as SEER do not include genomic data. Recognizing the importance of this dimension, our medical center has preserved valuable patient samples of CL. We plan to use these samples in future research to explore genomic data and to integrate the findings into subsequent studies.

Conclusions

Our study provides the first comprehensive analysis of composite HL and DLBCL, elucidating its unique epidemiological and clinical characteristics, identifying prognostic factors, and developing long-term outcome prediction models using ML techniques. The findings have important implications for clinical practice and may support personalized treatment approaches and improved patient outcomes in CL.

Acknowledgments

The authors thank the staff of the SEER program for providing open access to the SEER database.

Author contributions

Ailin Zhao, Xu Sun, and Weishi Cheng conceived the idea and designed the manuscript. All authors contributed to the conceptualization, writing of the original draft, and review and editing. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by the 1.3.5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (No. ZYJC21007); the 1.3.5 Project of High Altitude Medicine (No. GYYX24003) and the 1.3.5 Project for Artificial Intelligence (No. ZYAI24054, No. ZYAI24039), West China Hospital, Sichuan University; the Key Research and Development Program of Sichuan Province (No. 2023YFS0031, No. 2023YFS0306); and the National Key Research and Development Program of China (No. 2022YFC2502600, No. 2022YFC2502603).

Conflicts of interest

All authors declare no conflicts of interest in this study.

Data availability

The authors did not receive any third-party support in conducting this research, analyzing the data, or preparing the manuscript for submission. The full data acquisition methodology is described in the supplementary material.

Ethical considerations

We obtained permission to access the SEER database files pertinent to this study. Because the data contain no personally identifiable information, informed consent was not required. West China Hospital of Sichuan University waived the requirement for informed consent because the data were obtained from the SEER database (https://seer.cancer.gov/data/). All statistical analyses were performed in accordance with the relevant guidelines and regulations (Declaration of Helsinki).

References

1. Fend F, Quintanilla-Martínez L, Raffeld M. Composite lymphoma. Hum Pathol. 2000;31(5):626-627.

2. Goyal G, Nguyen AH, Kendric K, Caponetti GC. Composite lymphoma with diffuse large B-cell lymphoma and classical Hodgkin lymphoma components: a case report and review of the literature. Pathol Res Pract. 2016;212(12):1179-1190.

3. Küppers R, Dührsen U, Hansmann ML. Pathogenesis, diagnosis, and treatment of composite lymphomas. Lancet Oncol. 2014;15(10):e435-e446.

4. Hicks EB, Rappaport H, Winter WJ. Follicular lymphoma; a re-evaluation of its position in the scheme of malignant lymphoma, based on a survey of 253 cases. Cancer. 1956;9(4):792-821.

5. Kim H. Composite lymphoma and related disorders. Am J Clin Pathol. 1993;99(4):445-451.

6. Huang Q, Wilczynski SP, Chang KL, Weiss LM. Composite recurrent Hodgkin lymphoma and diffuse large B-cell lymphoma: one clone, two faces. Am J Clin Pathol. 2006;126(2):222-229.

7. Tao Y, Chen H, Liu D, Dai X. Survival among patients with composite and sequential lymphoma between primary mediastinal lymphoma/diffuse large B-cell lymphoma and classical Hodgkin lymphoma: a population-based study. Leuk Res. 2021;111(1):106669.

8. Gao J, Chen Y, Wu P, et al. Causes of death and effect of non-cancer-specific death on rates of overall survival in adult classic Hodgkin lymphoma: a populated-based competing risk analysis. BMC Cancer. 2021;21(1):955.

9. Kim TH, Kim JS, Suh YG, et al. The roles of radiotherapy and chemotherapy in the era of multimodal treatment for early-stage nasal-type extranodal natural killer/T-cell lymphoma. Yonsei Med J. 2016;57(4):846-854.

10. Kröger K, Siats J, Kerkhoff A, et al. Long-term survival of patients with mantle cell lymphoma after total body irradiation, high-dose chemotherapy and stem cell transplantation: a monocenter study. Cancers (Basel). 2023;15(3):983.

11. Yu Y, Xu Z, Shao T, et al. Epidemiology and a predictive model of prognosis index based on machine learning in primary breast lymphoma: population-based study. JMIR Public Health Surveill. 2023;9(1):e45455.

12. Kong H, Zhu H, Zheng X, et al. Machine learning models for the diagnosis and prognosis prediction of high-grade B-cell lymphoma. Front Immunol. 2022;13(1):919012.

13. Pilichowska M, Pittaluga S, Ferry JA, et al. Clinicopathologic consensus study of gray zone lymphoma with features intermediate between DLBCL and classical HL. Blood Adv. 2017;1(26):2600-2609.

14. Evens AM, Kanakry JA, Sehn LH, et al. Gray zone lymphoma with features intermediate between classical Hodgkin lymphoma and diffuse large B-cell lymphoma: characteristics, outcomes, and prognostication among a large multicenter cohort. Am J Hematol. 2015;90(9):778-783.

15. Kim M, Hwang HS, Yoon DH, Chun SM, Go H. Distinct genetic alterations in Burkitt-like lymphoma with 11q aberration and Burkitt lymphoma: a novel case report of composite lymphoma. Haematologica. 2022;107(8):1999-2003.

16. Shanbhag S, Ambinder RF. Hodgkin lymphoma: a review and update on recent progress. CA Cancer J Clin. 2018;68(2):116-132.

17. Sehn LH, Hertzberg M, Opat S, et al. Polatuzumab vedotin plus bendamustine and rituximab in relapsed/refractory DLBCL: survival update and new extension cohort data. Blood Adv. 2022;6(2):533-543.

Author notes

Ailin Zhao, Xu Sun, and Weishi Cheng contributed equally.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.