Near-term prediction of sustained ventricular arrhythmias applying artificial intelligence to single-lead ambulatory electrocardiogram

Abstract

Background and Aims

Accurate near-term prediction of life-threatening ventricular arrhythmias would enable pre-emptive actions to prevent sudden cardiac arrest/death. A deep learning–enabled single-lead ambulatory electrocardiogram (ECG) may identify an ECG profile of individuals at imminent risk of sustained ventricular tachycardia (VT).

Methods

This retrospective study included 247 254, 14 day ambulatory ECG recordings from six countries. The first 24 h were used to identify patients likely to experience sustained VT occurrence (primary outcome) in the subsequent 13 days using a deep learning–based model. The development set consisted of 183 177 recordings. Performance was evaluated using internal (n = 43 580) and external (n = 20 497) validation data sets. Saliency mapping visualized features influencing the model’s risk predictions.

Results

Among all recordings, 1104 (.5%) had sustained ventricular arrhythmias. On the internal and external validation sets, the model achieved an area under the receiver operating characteristic curve of .957 [95% confidence interval (CI) .943–.971] and .948 (95% CI .926–.967), respectively. For a specificity fixed at 97.0%, the sensitivity reached 70.6% and 66.1% in the internal and external validation sets, respectively. The model correctly predicted future VT occurrence in 80.7% and 81.1% of recordings with rapid sustained VT (≥180 b.p.m.), respectively, and in 90.0% of recordings with VT that degenerated into ventricular fibrillation. Saliency maps suggested the role of premature ventricular complex burden and early depolarization time as predictors for VT.

Conclusions

A novel deep learning model utilizing dynamic single-lead ambulatory ECGs accurately identifies patients at near-term risk of ventricular arrhythmias. It also uncovers an early depolarization pattern as a potential determinant of ventricular arrhythmia events.

Structured Graphical Abstract

Overview of the development and validation of the deep learning–based model. Prediction of near-term sustained ventricular tachycardia (VT) using a single-lead 24 h electrocardiogram (ECG) achieved an area under the receiver operating characteristic curve (AUROC) of .957 and .948 in the internal and external validation data sets, respectively. Model explainability confirmed the role of premature ventricular complex (PVC) burden and identified early depolarization time as potential determinants for VT. CNN, convolutional neural network.

Keywords: Ventricular tachycardia, Prediction, Machine learning, Sudden cardiac death, Holter-ECG

See the editorial comment for this article ‘Predicting imminent ventricular arrhythmias from ambulatory ECG signals: far-reaching or too far to reach?’, by K. C. Siontis and P. A. Friedman, https://doi.org/10.1093/eurheartj/ehaf008.

Translational perspective

The potential applications of this deep learning model extend beyond traditional clinical settings. It paves the way for new real-time monitoring tools, which could be integrated as artificial intelligence–based ‘smart-monitoring’ systems.
The performance of this model using a ubiquitous single-lead electrocardiogram suggests opportunities for integration with wearable devices like smartwatches and implantable loop recorders. These innovations hold promise for remote patient monitoring and pre-emptive interventions, transforming the landscape of sudden cardiac death risk management and potentially improving patient outcomes.

Introduction

More than 40 years after the first implantable cardioverter defibrillator (ICD) implantation, sudden cardiac death (SCD) still accounts for >5 million deaths worldwide every year, with a majority occurring in the general population, among subjects without known heart disease.^1,2 Ventricular tachycardia/fibrillation (VT/VF) represents one of the main mechanisms of SCD, with coronary artery disease being the aetiology in up to 80% of cases.³ The incidence of SCD has remained disappointingly stable over time, despite major efforts deployed in the field towards prevention.⁴

Sudden cardiac death is a result of a dynamic complex process acting on a specific ventricular substrate that still remains incompletely understood. The prevention of SCD is traditionally based on the mid- and long-term prediction of life-threatening arrhythmias, with left ventricular ejection fraction being the cornerstone parameter used in clinical practice.⁵ The limited accuracy of this approach reflects the problem of using a fixed and non-specific structural parameter at a given time for long-term risk stratification.^6,7 It also only assesses the substrate, neglecting dynamic aspects of arrhythmia pathophysiology, including the autonomic nervous system as well as triggers, such as premature ventricular complexes (PVCs).^8,9 Therefore, the rationale for an alternative dynamic approach that would identify vulnerable subjects at high risk of SCD at near-term (within minutes, hours, or days prior to the potentially fatal event) is particularly appealing.¹

In such a setting, artificial intelligence (AI), particularly deep learning,¹⁰ has shown the potential to detect subtle patterns indiscernible to the human eye and may thus help refine and improve the accuracy of risk assessment. Artificial intelligence applications have already demonstrated success, for instance, in predicting the risk of atrial fibrillation from sinus rhythm electrocardiograms (ECGs).^11,12 Furthermore, in contrast to the traditional black box perception surrounding AI, wherein it is considered that the logic behind AI-based predictions cannot be understood, the use of interpretability analysis methodology can possibly provide important insight into mechanisms of arrhythmogenesis.¹³ This study hypothesizes that a 24 h single-lead ECG recording contains key information that can be used by AI to identify subjects at imminent risk of life-threatening ventricular arrhythmias in the following days. This would enable prompt pre-emptive actions and enhance near-term SCD prevention.

Methods

Data sources and study setting

The study protocol was approved by the local Institutional Review Board, and the need for individual informed consent was waived.

In this retrospective international study, we developed and validated a deep learning–based model to predict the near-term risk of sustained VT from a single-lead ambulatory ECG. We used 14 day ambulatory ECG recordings to derive the model input and outputs. The first 24 h of each recording (which had no sustained VT) were used as input to a deep learning model. We then labelled each recording according to whether there was any sustained VT documented in the subsequent 13 days and used it as the output (Figure 1A).
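The input/output construction described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Recording` class, its field name, and the decision to raise an error when sustained VT falls inside the input window are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

INPUT_WINDOW_H = 24.0        # first 24 h are used as model input
RECORDING_H = 14 * 24.0      # 14 day ambulatory recording


@dataclass
class Recording:
    # Onset times of sustained VT episodes, in hours from recording start
    # (hypothetical field, for illustration only).
    sustained_vt_onsets_h: List[float]


def make_example(rec: Recording) -> Tuple[Tuple[float, float], int]:
    """Return the input window (start_h, end_h) and the binary label."""
    # The input window must be free of sustained VT; rejecting such
    # recordings with an error is an assumption of this sketch.
    if any(t < INPUT_WINDOW_H for t in rec.sustained_vt_onsets_h):
        raise ValueError("sustained VT inside the 24 h input window")
    # Label is 1 if any sustained VT starts in the subsequent 13 days.
    label = int(any(INPUT_WINDOW_H <= t <= RECORDING_H
                    for t in rec.sustained_vt_onsets_h))
    return (0.0, INPUT_WINDOW_H), label
```

For instance, a recording with a first sustained VT episode on Day 4 yields the window (0, 24) with label 1, and a recording without any episode yields label 0.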

Figure 1

Study design and deep learning–based model. (A) Example heart rate density plot of an ambulatory electrocardiogram recording with no ventricular tachycardia in the first 24 h and an episode of ventricular tachycardia degenerating into ventricular fibrillation on Day 4. The first 24 h of the recording are used to derive inputs to the deep learning–based model, while the remaining duration is used to label whether sustained ventricular tachycardia occurred in the following days. (B) Patient age, sex, and various electrocardiogram measurements extracted from the first 24 h are passed to an encoder to generate a measurement embedding. (C) A heart rate density plot is constructed from the first 24 h and passed to a convolutional neural network to extract spatial feature maps, which are passed to a transformer encoder to generate a heart rate density plot embedding. (D) A collection of 10 s electrocardiogram strips is sampled from the 24 h recording and passed to a convolutional neural network to extract features from each strip; these are then aggregated using a transformer encoder to generate an electrocardiogram waveform embedding. The embeddings generated from each input are fused and passed to a classifier to predict a near-term ventricular tachycardia risk score.

All data included in the study were acquired from individuals receiving routine continuous cardiac monitoring and uploaded to the Cardiologs Holter platform for analysis. The model was developed and internally validated using an internal data set consisting of ambulatory ECG recordings collected from various Independent Diagnostic Testing Facilities (IDTFs) and centres across five countries (USA, UK, France, South Africa, and India) between 1 January 2019 and 1 January 2024 (see Supplementary data online, Table S5). The internal data set was divided into a development and held-out validation set in the following way: all recordings collected before 1 July 2023 were used for model development (80%) and all collected thereafter were held out for internal validation (20%). The development data set was randomly split into a training (80%) and tuning set (20%). The tuning set was used to select hyperparameters and operating points from the training process.
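The partitioning of the internal data set can be sketched as below. This is an illustration under stated assumptions: the tuple layout, the function name, and the random seed are hypothetical, and the text does not say whether the random training/tuning split was made per recording or per patient (the sketch splits per recording).

```python
import random
from datetime import date

CUTOFF = date(2023, 7, 1)  # recordings before this date go to development


def split_recordings(recordings, seed=0, tuning_fraction=0.2):
    """Split (recording_id, acquisition_date) tuples into three sets.

    Temporal split first (development vs. held-out internal validation),
    then a random 80/20 split of the development set into training/tuning.
    """
    development = [r for r in recordings if r[1] < CUTOFF]
    internal_validation = [r for r in recordings if r[1] >= CUTOFF]

    rng = random.Random(seed)
    shuffled = development[:]
    rng.shuffle(shuffled)
    n_tuning = round(len(shuffled) * tuning_fraction)
    tuning, training = shuffled[:n_tuning], shuffled[n_tuning:]
    return training, tuning, internal_validation
```

A temporal hold-out of this kind keeps the validation recordings strictly later than anything seen during development, which is a stricter check than a purely random split.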

To assess the generalizability of the model across different sources with different patient populations and data collection strategies, we validated the model using a fully independent external validation data set. The external validation set consisted of recordings collected from two separate IDTFs (USA and Czech Republic) between 1 January 2019 and 1 January 2024. All individuals in the external validation set were excluded from model development.

Ambulatory recordings were collected from numerous manufacturers, which consisted of single and multi-lead patch-based and traditional Holter monitors (see Supplementary data online, Table S7). Due to the variability in electrode positioning across patch-based and multi-lead Holters, we did not select a specific lead derivation; instead, the first available lead was used. All ECG recordings were stored in digital format and resampled to 250 Hz.

Since the study was conducted retrospectively, the race or ethnicity of the patients was not consistently documented during the acquisition of the data. As a result, we do not possess specific statistics on these demographics. However, we made efforts to include individuals from various regions across four continents to ensure a diverse representation of the population.

Outcome definition

The primary outcome was defined as the occurrence of sustained ventricular arrhythmias during the 13 days immediately following a 24 h ambulatory ECG recording. All recordings included in the study were analysed by physicians or certified ECG technicians using the Cardiologs Holter analysis platform (see Supplementary data online, Table S6). Sustained VT was defined as a ventricular rhythm lasting ≥30 s with a rate of ≥100 b.p.m., in accordance with guidelines.^14,15 Two certified academic electrophysiologists, blinded to the model predictions, verified the data. All documented sustained VT episodes were reviewed and adjudicated centrally by two experts in ECG interpretation. The opinion of a third experienced cardiac electrophysiologist was requested in case of discrepancy to limit the impact of inter-rater variability. For the ‘non-VT’ recordings, a random verification of 3800 Holter recordings was conducted by two certified academic electrophysiologists, with no sustained VT identified.
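The rate and duration criteria can be written as a simple rule. The sketch below only encodes the thresholds stated in the text (≥30 s and ≥100 b.p.m. for sustained VT, ≥180 b.p.m. for the rapid VT subgroup reported in the Results); the function names are hypothetical, and the rule does not replace the expert adjudication described above.

```python
SUSTAINED_MIN_DURATION_S = 30.0   # sustained: lasting at least 30 s
VT_MIN_RATE_BPM = 100.0           # ventricular rate of at least 100 b.p.m.
RAPID_VT_MIN_RATE_BPM = 180.0     # rapid VT subgroup threshold


def is_sustained_vt(duration_s: float, rate_bpm: float) -> bool:
    """True if a ventricular episode meets the sustained VT criteria."""
    return (duration_s >= SUSTAINED_MIN_DURATION_S
            and rate_bpm >= VT_MIN_RATE_BPM)


def is_rapid_sustained_vt(duration_s: float, rate_bpm: float) -> bool:
    """True if the episode is sustained VT at a rate of >=180 b.p.m."""
    return (is_sustained_vt(duration_s, rate_bpm)
            and rate_bpm >= RAPID_VT_MIN_RATE_BPM)
```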

Deep learning–based model

We developed a deep learning–based model (Figure 1), which utilizes three different modalities derived from a single-lead ambulatory ECG to predict the risk of VT. The model consists of three separate branches, which use as input: (i) patient demographics and quantitative measurements calculated from the recording (Figure 1B); (ii) a heart rate density plot (HRDP) (Figure 1C); and (iii) the ECG waveform (Figure 1D). It is trained using a co-learning approach by extracting features from each modality and learning interactions between them. Embeddings (i.e. high-level features) are learnt from the quantitative measurements, HRDP, and raw ECG waveform in parallel (Figure 1B–D). The embeddings extracted from each input are then fused and passed to three fully connected layers to aggregate the features. The final layer consists of a sigmoid activation function and outputs a probability of VT occurring in the following days. The three branches of the model used for feature extraction are described below. It is important to note that the model prediction ultimately relies exclusively on the raw ECG signal, with the exception of age and sex, and therefore does not require a preliminary analysis by a human or any other software.

To capture information related to potential triggers, we utilized clinical data and quantitative measurements derived from the ECG recordings, which have been previously associated with VT risk (Figure 1B). Additionally, to obtain a more comprehensive view of the global rhythm profile over an extended period, we introduced a new representation of the 24 h recording in the form of an HRDP. The clinical data consisted of patient age and sex, and the measurements included 19 parameters: PVC burden (%), repetitive character (number of consecutive PVCs, mean, and standard deviation), coupling interval between sinus beats and PVCs (mean and standard deviation), QRS duration (mean and standard deviation), non-sustained VT characteristics [count, maximum heart rate (HR), and longest duration], HR variability (HRV, including SDNN, SDANN, SDNNI, pNN50, RMSSD, and HF power), premature atrial complex (PAC) burden (%), and counts of PVC couplets, bigeminy, and trigeminy. The identification of sinus beats, PACs, PVCs, and VT was performed by the Cardiologs algorithm, as described in previous work.¹⁶ The clinical data and measurements were normalized and provided to two fully connected layers to generate a feature representation.

An HRDP is a 3D representation of the instantaneous HR during the monitoring period (Figure 1C). The x-axis represents time, the y-axis represents HR, and the z-axis consists of three channels corresponding to each beat’s classification as either: sinus, PVC, or PAC. The HRDP backbone consisted of a convolutional neural network (CNN), a transformer encoder, and an attention-based pooling layer. The CNN is based on the ResNet-50 architecture,¹⁷ which takes as input an HRDP to extract spatial feature maps from the recording. An attention-pooling layer¹⁸ was used to aggregate spatial feature maps across the HR-axis (y-axis) resulting in temporal feature maps. A transformer encoder, composed of four multi-headed attention layers,¹⁹ is used to exploit relationships across the temporal features. A final attention-pooling layer is then used to generate a global feature representation of the HRDP.
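One way to picture the HRDP is as a three-channel 2D histogram of beats over time and instantaneous HR. The sketch below is an assumption-laden illustration: the bin widths, the HR range, the beat tuple layout, and the use of 60/RR as instantaneous HR are all choices made for the example and are not specified in the text.

```python
CHANNELS = {"sinus": 0, "PVC": 1, "PAC": 2}  # z-axis: beat classification


def build_hrdp(beats, duration_s=24 * 3600, time_bin_s=600,
               hr_min=0.0, hr_max=300.0, hr_bin_bpm=5.0):
    """Build a [channel][hr_bin][time_bin] count array from beats.

    beats: iterable of (time_s, rr_interval_s, label), where label is one
    of the keys of CHANNELS. Bin sizes are illustrative assumptions.
    """
    n_t = int(duration_s // time_bin_s)
    n_hr = int((hr_max - hr_min) // hr_bin_bpm)
    hrdp = [[[0] * n_t for _ in range(n_hr)] for _ in range(len(CHANNELS))]

    for time_s, rr_s, label in beats:
        if label not in CHANNELS or rr_s <= 0:
            continue  # skip unclassified beats and invalid intervals
        hr = 60.0 / rr_s  # instantaneous heart rate in b.p.m.
        if not (hr_min <= hr < hr_max) or not (0 <= time_s < duration_s):
            continue  # outside the plotted range
        t_idx = min(int(time_s // time_bin_s), n_t - 1)      # x-axis bin
        hr_idx = min(int((hr - hr_min) // hr_bin_bpm), n_hr - 1)  # y-axis bin
        hrdp[CHANNELS[label]][hr_idx][t_idx] += 1
    return hrdp
```

With these defaults the array has 3 channels, 60 HR bins, and 144 time bins; a CNN would then treat it like a three-channel image.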

To capture information related to the arrhythmogenic underlying substrate, we applied a deep neural network directly to the ECG waveform data (Figure 1D). The backbone consists of an ECG-strip encoder, a transformer encoder, and an attention-based pooling layer. The ECG-strip encoder is a CNN based on the ResNest-50 architecture,²⁰ which takes as input a 10 s single-lead ECG waveform strip to generate a single strip embedding. During training, 30 min of ECG signal is sampled from each recording by randomly selecting 180 strips with the temporal order preserved. Each strip is passed to the CNN backbone to generate 180 strip embeddings. The transformer encoder, composed of two multi-headed attention layers,¹⁹ is used to aggregate information from the strip embeddings. An attention-pooling layer is then used to assign an attention score to each strip and generate a feature representation of the ECG waveform.¹⁸ Training details, data augmentation methods, and oversampling strategies are described in the Supplementary data online, Appendix.
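The strip sampling used during training (180 strips of 10 s, i.e. 30 min of signal, in temporal order) can be sketched as follows. The non-overlapping 10 s grid, the sampling without replacement, and the function name are assumptions of this sketch; the text states only that strips are selected randomly with temporal order preserved.

```python
import random

STRIP_S = 10      # strip length in seconds
N_STRIPS = 180    # 180 strips x 10 s = 30 min of ECG signal


def sample_strip_starts(recording_s=24 * 3600, n_strips=N_STRIPS,
                        strip_s=STRIP_S, rng=None):
    """Return start times (s) of randomly chosen strips, in temporal order."""
    rng = rng or random.Random()
    n_available = recording_s // strip_s  # 8640 candidate strips in 24 h
    # Sample distinct strip indices, then sort to preserve temporal order.
    indices = sorted(rng.sample(range(n_available), n_strips))
    return [i * strip_s for i in indices]
```

Each returned start time would index a 10 s window of the waveform (2500 samples at 250 Hz) that is passed to the strip encoder.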

Secondary analyses

As a secondary analysis, we evaluated the performance of the model on shorter prediction horizons of 7, 3, 2, 1 day, and 1 h. At each prediction horizon, we took the preceding 24 h of all positive recordings that ended within the time horizon of the first sustained VT episode. For example, to evaluate the algorithm’s ability to predict 1 h VT risk, we used the 24 h of a recording that ended within 1 h of the first VT onset. The original predictions of all negative recordings remained fixed for each horizon. Additionally, we evaluated the effect that monitoring duration has on predicting the risk of VT by training models with the same architecture using shorter input durations of 1, 3, 6, and 12 h. Lastly, for comparison, we developed a baseline multivariable logistic regression model using patient age, sex, PVC burden, HR, QRS duration, QTc, and HRV SDNN. Since there is no widely recognized reference for calculating VT risk from Holter ECG data, these metrics were chosen based on their known potential for determining VT risk, although this list of metrics is not exhaustive.
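The selection of the input window for a shorter prediction horizon can be illustrated as below. The text requires only that the 24 h window end within the horizon of the first sustained VT onset; choosing the earliest such window is an assumption of this sketch, as are the function and parameter names.

```python
def horizon_window(first_vt_onset_h, horizon_h, window_h=24.0):
    """Return (start_h, end_h) of a 24 h input window ending within
    horizon_h of the first sustained VT onset.

    Picks the earliest valid window (an assumption): it ends exactly
    horizon_h before onset, unless that would start before the recording.
    """
    if first_vt_onset_h < window_h:
        raise ValueError("VT onset falls inside the first input window")
    end_h = max(window_h, first_vt_onset_h - horizon_h)
    return end_h - window_h, end_h
```

For a first VT onset at hour 100 and a 1 h horizon, this gives the window from hour 75 to hour 99; for an onset at hour 30 and a 7 day horizon, the original first 24 h already qualify.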

Interpretability

We generated saliency maps using gradient-weighted class activation mapping and integrated gradients on the HRDP and ECG waveform branches of the model,^21,22 respectively. Both are gradient-based methods which highlight regions of the input that have strong influence on the prediction of VT. For qualitative comparison, we randomly selected 20 samples from each classification group (true positive, true negative, false positive, and false negative) from the validation set. We then asked an experienced electrophysiologist to describe differences among the saliency maps of the classification groups.

Statistical analysis

The performance of the model was evaluated using the area under the receiver operating characteristic curve (AUROC), area under the precision–recall curve (AUPRC, also known as average precision), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). To estimate 95% confidence intervals (CIs), we used non-parametric bootstrapping with 1000 samples. Sensitivity, specificity, PPV, and NPV were calculated at binary decision thresholds. The optimal decision threshold for near-term VT risk was calibrated using the receiver operating characteristic curve and F₂ score on the tuning set. Subgroup analyses were performed on the validation sets to evaluate model fairness across patient demographics. We compared AUROCs across age and sex in a pairwise approach with the DeLong method.²³ We considered two-sided P-values <.05 statistically significant. All models and statistics were computed using Python (v3.8.12). Deep learning models were trained using PyTorch (v1.7.0). Data analysis was performed using numpy (1.19.5), pandas (1.2.0), scipy (1.6.0), and scikit-learn (0.24.0). For data visualization and scientific plotting, matplotlib (3.2.2) and seaborn (0.12.2) were used.
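A non-parametric bootstrap CI for the AUROC can be sketched in plain Python as follows. This is an illustration rather than the authors' code: the percentile interval, the resampling of individual recordings, and the skipping of resamples that contain a single class are assumptions, since the text specifies only bootstrapping with 1000 samples.

```python
import random


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney) formulation, ties averaged."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1  # group tied scores
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # need both classes to compute an AUROC
            stats.append(auroc(ys, [scores[k] for k in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

With an outcome prevalence below 1%, as in this study, resamples without any positive recording are rare at the reported sample sizes but become likely in small subgroups, which is why the sketch guards against them.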

Results

A total of 247 254, 14 day ambulatory ECG recordings collected between 2019 and 2024 from individuals aged ≥18 years were included in the study (Table 1 and Supplementary data online, Figure S1). Of these, 1104 (.5%) had at least one sustained VT episode documented during the monitoring period (beyond the 24 h used as input to the model), with 43 (3.9%) being polymorphic VT. Notably, 22 recordings presented VT degenerating into VF. Among the 917 (83.1%) patients with sustained VT documented during monitoring, the most common indications for monitoring were 289 (31.5%) palpitations, 153 (16.7%) PVC, 98 (10.7%) atrial fibrillation, 76 (8.3%) VT, 52 (5.7%) syncope, and 249 (27.1%) for other reasons. Additional VT characteristics from each data set are detailed in Table 1. The remaining recordings had no documented sustained VT during the 14 day monitoring period.

Table 1

Baseline and indication characteristics

	Training			Internal validation			External validation
	Control	VT	P-value	Control	VT	P-value	Control	VT	P-value
n	183 177	804		43 580	197		20 497	103
Demographics
Age (years)	61.1 ± 17.8	63.4 ± 15.3	<.001	62.2 ± 18.5	63.2 ± 16.4	.442	61.3 ± 15.3	66.5 ± 12.7	<.001
Female, n (%)	108 951 (59.5)	244 (30.3)	<.001	26 887 (61.7)	54 (27.4)	<.001	12 062 (58.8)	21 (20.4)	<.001
ECG measurements
QRS (ms)	100 ± 18	107 ± 22	<.001	98 ± 17	107 ± 20	<.001	97 ± 16	105 ± 21	<.001
QTc (ms)	401 ± 38	409 ± 41	<.001	400 ± 37	410 ± 40	<.001	397 ± 34	409 ± 41	<.001
PVC burden (%), median (IQR)	0.7 (0.2–3.1)	8.9 (2.1–17.8)	<.001	0.6 (0.2–2.9)	12.0 (3.4–23.5)	<.001	0.8 (0.2–3.5)	9.0 (2.4–21.5)	.297
PVC coupling interval (ms)	590 ± 119	521 ± 106	<.001	576 ± 113	511 ± 73	<.001	585 ± 108	525 ± 101	<.001
HR (b.p.m.)	74 ± 11	84 ± 26	<.001	75 ± 11	81 ± 16	<.001	77 ± 15	84 ± 50	<.001
HRV SDNN	153 ± 48	160 ± 51	.022	147 ± 49	152 ± 61	.257	144 ± 51	161 ± 59	.049
Indication, n (%)
Palpitations	71 648 (39.1)	216 (26.9)	<.001	16 981 (39.0)	48 (24.4)	<.001	7194 (35.1)	21 (20.4)	.003
Atrial fibrillation	32 499 (17.7)	86 (10.7)	<.001	6869 (15.8)	6 (3.0)	<.001	3239 (15.8)	17 (16.5)	.952
Syncope	23 089 (12.6)	59 (7.3)	<.001	5468 (12.5)	19 (9.6)	.263	3330 (16.2)	10 (9.7)	.097
Arrhythmia	8790 (4.8)	47 (5.8)	.193	1897 (4.4)	12 (6.1)	.309	1093 (5.3)	8 (7.8)	.381
Bradycardia	6461 (3.5)	30 (3.7)	.828	1721 (3.9)	11 (5.6)	.322	757 (3.7)	6 (5.8)	.378
PVC	3679 (2.0)	114 (14.2)	<.001	901 (2.1)	44 (22.3)	<.001	304 (1.5)	14 (13.6)	<.001
VT	1165 (0.6)	57 (7.1)	<.001	286 (0.7)	21 (10.7)	<.001	111 (0.5)	14 (13.6)	<.001
Other	35 846 (19.6)	195 (24.3)	.001	9457 (21.7)	36 (18.3)	.281	4469 (21.8)	13 (12.6)	.033

HR, heart rate; HRV, heart rate variability; IQR, interquartile range; PVC, premature ventricular complex; SDNN, standard deviation of NN intervals; VT, ventricular tachycardia.

The development set used to train the model consisted of 183 177 ambulatory ECG recordings. On the internal validation set (n = 43 580), the deep learning–based model achieved an AUROC of .957 (95% CI .943–.971) and AUPRC of .300 (95% CI .239–.376; Figure 2A and B). With a fixed operating point, the sensitivity, specificity, PPV, and NPV were 70.6% (95% CI 64.2%–77.2%), 97.7% (95% CI 97.6%–97.9%), 12.3% (95% CI 10.6%–14.4%), and 99.9% (95% CI 99.8%–99.9%), respectively. On the external validation data set (n = 20 497), the model yielded an AUROC of .948 (95% CI .926–.967) and AUPRC of .269 (95% CI .189–.362). The sensitivity, specificity, PPV, and NPV were 66.1% (95% CI 57.4%–75.2%), 97.0% (95% CI 96.8%–97.3%), 10.1% (95% CI 7.9%–12.6%), and 99.8% (95% CI 99.8%–99.9%), respectively. For comparison, we evaluated the performance of a multivariable logistic regression model using patient age, sex, QRS duration, QTc interval, PVC burden, HR, and HRV SDNN. On the internal and external validation sets, the reference model yielded an AUROC of .845 (95% CI .806–.879) and .847 (95% CI .817–.876), AUPRC of .05 (95% CI .026–.094) and .039 (95% CI .031–.056), sensitivity of 49.7% (95% CI 41.7%–57.6%) and 45.6% (95% CI 34.7%–56.4%), and specificity of 88.2% (95% CI 87.8%–88.6%) and 90% (95% CI 89.5%–90.6%), respectively. Performance when using different input combinations is provided in Supplementary data online, Table S1 and confusion matrices in Supplementary data online, Tables S2 and S3.
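The low PPV alongside high sensitivity and specificity follows from the low prevalence of sustained VT. The short calculation below is a sketch that recovers PPV and NPV from the reported sensitivity and specificity together with the control and VT counts in Table 1; the function name is hypothetical, and small differences from the reported values are expected because the inputs are rounded.

```python
def predictive_values(sensitivity, specificity, n_pos, n_neg):
    """Return (PPV, NPV) from sensitivity, specificity, and class counts."""
    tp = sensitivity * n_pos      # true positives
    fn = n_pos - tp               # false negatives
    tn = specificity * n_neg      # true negatives
    fp = n_neg - tn               # false positives
    return tp / (tp + fp), tn / (tn + fn)


# Internal validation: 197 VT and 43 580 control recordings (Table 1).
ppv_int, npv_int = predictive_values(0.706, 0.977, 197, 43580)
# External validation: 103 VT and 20 497 control recordings (Table 1).
ppv_ext, npv_ext = predictive_values(0.661, 0.970, 103, 20497)
```

Both calculations give a PPV close to the reported 12.3% and 10.1% and an NPV above 99.8%: with roughly one positive per 200 recordings, even a 2% to 3% false positive rate produces several false alarms per true alert.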

Figure 2

Model performance. (A) Receiver operating characteristic curves of the deep learning–based model on the internal and external validation sets. (B) Precision–recall curves of the deep learning–based model on the internal and external validation sets. The curves show the trade-off between sensitivity and positive predictive value. (C) Area under the receiver operating characteristic curve of the model when evaluated at various prediction horizons. Each model was compared with the respective 13 day prediction horizon model on the external validation set using the DeLong method. Significance levels are denoted as ns (not significant) for P ≥ .05; * for P < .05, and ** for P < .001. Using bootstrapping with 1000 samples, 95% confidence intervals were computed. Error bars indicate 95% confidence intervals. (D) Area under the receiver operating characteristic curve of the model when using shorter input durations to predict the risk of ventricular tachycardia. Each model was compared with the respective 24 h model on the external validation set using the DeLong method. Significance levels are denoted as ns (not significant) for P ≥ .05; * for P < .05, and ** for P < .001. Using bootstrapping with 1000 samples, 95% confidence intervals were computed. Error bars indicate 95% confidence intervals.

We observed consistent performance improvements when evaluating the model’s ability to predict VT risk at prediction horizons shorter than 2 days, compared with the 13 day prediction, with a significant difference in AUROCs (P < .005; Figure 2C). While using 24 h of monitoring to predict 3 day VT risk, the AUROC improved to .96 (95% CI .948–.972) and .952 (95% CI .933–.974) on the internal and external validation sets, respectively. To predict the occurrence of VT in the very next hour, the model achieved AUROCs of .970 (95% CI .960–.982) and .961 (95% CI .946–.966). Additionally, there was a significant drop in performance for both internal and external validation sets when using an input duration <6 h compared with the 24 h model (P < .001; Figure 2D).

We evaluated model performance across subgroups of patient age and sex (Table 2). The model showed consistent performance for both sexes (P = .84). Performance was numerically lower in older than in younger patients, but the difference was not statistically significant (P = .15). We also analysed performance according to VT rate: the internal and external validation data sets included 57 (28.9%) and 37 (35.9%) recordings with rapid VT (≥180 b.p.m.), respectively (Table 3). The model correctly predicted VT occurrence in 80.7% and 81.1% of these recordings on the internal and external validation sets, respectively (see Supplementary data online, Table S4). Notably, the model identified 9 of the 10 recordings where VT degenerated into VF among the validation sets.

Table 2

Internal and external validation performance by subgroups

	n	n VT	AUROC (95% CI)	AUPRC (95% CI)	Sens. (95% CI)	Spe. (95% CI)	PPV (95% CI)	NPV (95% CI)
Internal validation
Age
18–65	20 789	97	97.1 (95.3–98.6)	44.0 (34.0–54.8)	74.2 (64.8–82.5)	98.6 (98.4–98.7)	19.4 (15.3–23.3)	99.9 (99.8–99.9)
≥65	22 988	100	94.2 (91.7–96.3)	18.1 (12.7–27.4)	67.0 (57.3–75.8)	97.0 (96.8–97.2)	8.9 (7.0–11.0)	99.9 (99.8–99.9)
Sex
Male	16 836	143	95.0 (93.4–96.4)	32.7 (25.3–41.2)	71.3 (63.6–78.5)	96.1 (95.8–96.4)	13.6 (11.4–16.2)	99.7 (99.7–99.8)
Female	26 941	54	94.6 (90.8–97.8)	26.2 (15.3–39.0)	68.5 (55.3–80.9)	98.7 (98.6–98.9)	9.8 (7.0–13.1)	99.9 (99.9–100.0)
External validation
Age
18–65	10 634	41	95.1 (91.6–98.2)	33.1 (19.7–48.0)	63.4 (48.6–77.6)	98.8 (98.5–99.0)	16.5 (10.9–22.3)	99.9 (99.8–99.9)
≥65	9966	62	94.0 (91.4–96.2)	24.3 (14.7–37.0)	45.2 (32.3–57.6)	97.8 (97.5–98.1)	11.6 (7.8–15.8)	99.7 (99.5–99.8)
Sex
Male	8517	82	93.2 (90.7–95.5)	28.8 (19.5–39.4)	50.0 (39.4–60.5)	97.2 (96.8–97.5)	14.7 (10.7–19.2)	99.5 (99.3–99.7)
Female	12 083	21	94.5 (88.4–99.1)	20.9 (9.6–41.3)	61.9 (40.0–81.2)	99.1 (98.9–99.3)	10.7 (5.2–16.7)	99.9 (99.9–100.0)

AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision–recall curve; NPV, negative predictive value; PPV, positive predictive value; Sens., sensitivity; Spe., specificity.
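Table 2 reports sensitivity, specificity, PPV, and NPV at a fixed operating point. The sketch below shows one way such an operating point might be chosen: pick the score threshold that keeps specificity at or above a target on the negative recordings, then read off the remaining metrics. The data are synthetic, and the helper names `threshold_for_specificity` and `operating_point` are hypothetical. The rule "positive when score > threshold" is an assumption, not a description of the study's procedure.

```python
import math


def threshold_for_specificity(neg_scores, target_spec):
    """Return t such that calling 'positive' when score > t gives
    specificity >= target_spec on the supplied negative scores."""
    ranked = sorted(neg_scores, reverse=True)
    # Largest number of false positives compatible with the target specificity.
    allowed_fp = min(math.floor((1.0 - target_spec) * len(ranked)), len(ranked) - 1)
    return ranked[allowed_fp]


def operating_point(pos_scores, neg_scores, threshold):
    """Sensitivity, specificity, PPV and NPV at a given threshold."""
    tp = sum(1 for s in pos_scores if s > threshold)
    fn = len(pos_scores) - tp
    fp = sum(1 for s in neg_scores if s > threshold)
    tn = len(neg_scores) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


# Synthetic scores (not study data): 100 negatives spread over [0, 1), 5 positives.
neg = [i / 100 for i in range(100)]
pos = [0.50, 0.90, 0.98, 0.99, 0.995]

t = threshold_for_specificity(neg, target_spec=0.97)
metrics = operating_point(pos, neg, t)
print(f"threshold={t:.2f}", {k: round(v, 3) for k, v in metrics.items()})
```

Raising the target specificity moves the threshold up and trades sensitivity for fewer false alerts. Lowering it does the opposite.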

Table 3

Distribution of the longest ventricular tachycardia duration and the maximum ventricular tachycardia rate among positive recordings in each data set

	Number of recordings (%)
	Training	Internal validation	External validation
VT duration (s)
30–60	328 (40.8)	78 (39.6)	42 (40.8)
60–240	296 (36.8)	77 (39.1)	35 (34.0)
240–600	82 (10.2)	22 (11.2)	7 (6.8)
≥600	98 (12.2)	20 (10.2)	19 (18.4)
VT rate (b.p.m.)
100–150	355 (44.2)	82 (41.6)	46 (44.7)
150–180	210 (26.1)	58 (29.4)	20 (19.4)
≥180	239 (29.7)	57 (28.9)	37 (35.9)

VT, ventricular tachycardia.

To understand which features contributed to the model’s prediction of VT, we generated saliency maps to highlight regions of the HRDP and ECG signal with a strong impact on the prediction. In the HRDP, we confirmed that PVC burden is a key predictor of VT (Figure 3A).¹⁴ Among ECGs in sinus rhythm, three regions of the signal were commonly highlighted: the region before the onset of the QRS, the first slope of the QRS, and the ST segment (Figure 3B). Additional saliency maps are provided in Supplementary data online, Figures S2 and S3.

Figure 3

Saliency maps. (A) Gradient-weighted class activation mapping saliency map overlaid on a heart rate density plot of a true positive. The plot shows sinus (black) and ventricular (red) beats from a 12 h recording. Regions highlighted in red signify higher importance. (B) Saliency map computed using integrated gradients overlaid on an electrocardiogram signal of a true-positive recording in sinus rhythm
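To make the integrated gradients attribution named in the caption more concrete, the sketch below applies it to a toy model. A logistic function of six signal samples stands in for the deep network, which is an assumption made purely for illustration. The weights, input, all-zero baseline, and function names are hypothetical. The example checks the completeness property of integrated gradients: the attributions sum to the difference between the model output at the input and at the baseline.

```python
import math


def model(x, w, b):
    """Toy risk model: logistic function of a weighted sum (stand-in only)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x, w, b):
    """Analytic gradient of the toy model with respect to the input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Average the gradient along the straight path from baseline to x
    (midpoint rule), then scale by the input difference."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, g in enumerate(gradient(point, w, b)):
            avg_grad[i] += g / steps
    return [(xi - bi) * g for xi, bi, g in zip(x, baseline, avg_grad)]


# Hypothetical weights and a six-sample "signal"; samples 0 and 5 carry no weight.
w = [0.0, 0.5, 2.0, -1.0, 0.3, 0.0]
b = -0.5
x = [0.1, 0.2, 1.0, 0.4, -0.2, 0.3]
baseline = [0.0] * len(x)

attributions = integrated_gradients(x, baseline, w, b)
total = sum(attributions)
delta = model(x, w, b) - model(baseline, w, b)
print([round(a, 4) for a in attributions], round(total, 4), round(delta, 4))
```

Samples with zero weight receive zero attribution, and the largest attribution falls on the sample with the largest weighted contribution. For a deep network the gradients would come from automatic differentiation rather than a closed form.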

Discussion

In this study, we developed and validated a novel deep learning–based model to predict near-term VT using an ambulatory ECG. This model, trained using a large volume of ambulatory records, showed robust performance in both internal and external validation data sets. The performance on the external validation set indicates that this model may generalize to patient populations not encountered during training. Furthermore, using saliency mapping, in addition to the importance of PVC burden, we identified QRS fragmentation, the first slope of the QRS, and the region before the onset of the QRS as potential determinants of VT risk in the model (Structured Graphical Abstract). These findings have significant implications for developing a novel ‘near-term’ prevention approach for SCD.¹,³,²⁴

Given the limitations of the current strategy of mid- and long-term SCD prevention based on risk stratification of patients with underlying heart disease, it has become increasingly important to explore alternative approaches. However, no prediction tool is currently used in clinical practice, especially for short-term horizons. The management of patients flagged as high risk for VT remains uncertain, and an interventional approach based on VT prediction has yet to be established and validated through randomized trials. A recent randomized study²⁵ demonstrated a short-term mortality benefit from using an AI-ECG model capable of identifying patients at high risk of mortality, which led to more intensive surveillance, diagnostic examinations, and therapeutic actions. Moreover, in that study, cardiac mortality, including arrhythmias, was significantly lower in the intervention group. We acknowledge the relatively low PPV reported in Table 2 at a fixed sensitivity and specificity, which is partly due to the relatively low prevalence of VT in our study population; however, the model’s sensitivity and specificity can be adjusted based on its intended use, depending on whether a high PPV or high NPV is preferred (Figure 2). Our model may have numerous applications across different clinical settings. In the outpatient setting, patients monitored with mobile cardiac telemetry could benefit from a triggered alert preceding the onset of a life-threatening arrhythmia, allowing pre-emptive actions. During hospitalization, fatal events may be prevented hours to days before they occur with a new AI-based ‘smart-monitoring’ system. The performance of this model using a single-lead ECG also paves the way for its integration with smartwatches or implantable loop recorders, enabling remote patient monitoring and pre-emptive interventions.

Recent work has demonstrated the ability of deep learning to detect numerous cardiovascular diseases, including valvular heart disease, hypertrophic cardiomyopathy, future atrial fibrillation, and also cardiac arrest from the ECG.¹¹,¹²,²⁶⁻²⁸ The performance obtained in our study could be explained by the ambulatory ECG containing both structural and temporal information reflecting the complexity of the interactions between the autonomic nervous system, substrate, and triggers. Although this study focuses on near-term prediction, the elements of the ECG captured by the algorithm might also be used to improve longer-term risk stratification and prevention; this concept needs further testing.

Additionally, we analysed saliency maps of positive ambulatory recordings, acknowledging their hypothesis-generating nature, to explore whether certain features were more influential in the model’s predictions of VT risk. In the HRDP, we confirmed that salient regions were generally focused on dense regions of PVCs (Figure 3A and Supplementary data online, Figure S2). On ECG waveform analysis during sinus rhythm, our findings suggest that the beginning of the QRS may have a strong impact on the prediction. A QRS that has either a slow fragmented slope or an early depolarization pattern, identified as a low-voltage fragmented wave occurring during the 40 ms prior to the apparent onset of the QRS, was commonly encountered in ambulatory recordings with VT occurrence (Figure 3B and Supplementary data online, Figure S3). One hypothesis could be that this pattern reflects abnormal conduction in the myocardium, through scarring near the Purkinje system. While we recognize the importance of developing mechanistic insights from the AI-based prediction of VT, we must be cautious in interpreting these saliency maps. It is still early, and drawing premature conclusions about the underlying factors identified by the AI model should be avoided. Future studies should focus on carefully analysing the specific parameters that the model uses to make these predictions. Premature ventricular complex burden and traditional features are recognized as classical potential predictors of sustained VT.² However, no widely accepted quantitative VT risk score currently exists based on Holter ECG metrics. In our study, we demonstrated superior performance with an AI model that utilizes the entire raw ECG signal, compared with a logistic regression model built with classical variables.

Although this is one of the first evaluations of a deep learning–based approach to predict sustained VT using the largest database of ambulatory ECGs, we acknowledge several limitations. Firstly, we lacked associated clinical and race/ethnicity data; however, given the large amount of international data used here without filtering or selection, it is likely that a wide variety of pathologies and patients with and without heart disease were included. Our objective was to demonstrate the feasibility of an AI model in predicting ventricular arrhythmias purely based on electrical data (the intrinsic value of isolated electrical signal analysis, beyond the underlying cardiac substrate), without the added value of additional clinical information. It would be interesting for future studies to explore how different AI-based tools align as new models are developed. For now, this study remains focused on demonstrating proof of concept: that an AI-based tool can effectively predict VT in the short term. It is also important to note that the model’s predictions are based solely on the raw ECG signal, with the exception of age and sex, and do not require any preliminary analysis by humans or external software. However, further studies are needed to assess how the model performs across specific patient populations and clinical profiles. Secondly, the number of positive ambulatory recordings in our external validation set was limited (n = 103); however, to the best of our knowledge, it is one of the largest reported databases of sustained VT on ambulatory ECG recordings. Thirdly, this study is a retrospective analysis involving previously collected ambulatory recordings and will need to be further assessed in prospective studies before translation to the clinical arena. Retrospective analysis can introduce selection bias and may not fully capture temporal relationships, limiting the generalizability of our findings; thus, validation in well-designed prospective trials is also essential to confirm the feasibility of such a concept.

Additionally, we acknowledge that sustained VT does not necessarily lead to SCD, and we lack complete information on the final clinical outcomes of the patients in our cohort, except for 10 cases where VT progressed to VF. The haemodynamic tolerance of VT depends on various factors, such as heart rate and the presence of underlying heart disease. However, even slower VTs can result in fatal heart failure or escalate to VF. Given this potential deterioration into VF, sustained VT is recognized in clinical guidelines¹⁴,¹⁵ as a serious condition that often warrants ICD implantation to prevent SCD. Notably, the model accurately identified 9 of the 10 recordings where VT degenerated into VF within the validation set. Furthermore, excluding recordings that were shorter than 13 days and without VT from model testing increases disease prevalence, which may enhance certain metrics (PPV, AUPRC). However, sensitivity and specificity should theoretically remain unaffected. For prediction horizons of <13 days (Figure 2C), including these shorter recordings could potentially alter some performance metrics. Finally, despite the identification of well-known and novel determinants of VT risk on ECG through saliency mapping, we recognize that this method has its limitations. Notably, the locations revealed in the ECG waveform should not be considered exhaustive, and the link between the highlighted locations and the exact pathophysiology should be regarded as hypothesis generating.²⁹
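The limitation noted above, that prevalence affects PPV while sensitivity and specificity stay fixed, follows directly from Bayes' rule. The sketch below holds sensitivity and specificity constant and varies prevalence only. The inputs are illustrative: 70.6% sensitivity at 97.0% specificity is the internal validation operating point quoted in the abstract, and 197 of 43 777 is the count of VT-positive recordings implied by the internal validation rows of Table 2. The other prevalence values and the function names are hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


SENS, SPEC = 0.706, 0.970    # operating point quoted in the abstract
study_prev = 197 / 43777     # VT-positive share implied by Table 2 (internal set)

# Sensitivity and specificity are held fixed; only prevalence changes.
for prev in (study_prev, 0.01, 0.05, 0.20):
    print(
        f"prevalence {prev:6.2%}  "
        f"PPV {ppv(SENS, SPEC, prev):6.1%}  "
        f"NPV {npv(SENS, SPEC, prev):6.2%}"
    )
```

At the study prevalence, the computed PPV falls within the range of subgroup PPVs in Table 2. In a population where VT is more common, the same operating point would yield a substantially higher PPV.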

Conclusions

Using a large cohort of patients, we developed and validated a novel deep learning–based model to predict near-term risk of sustained ventricular arrhythmias from a single-lead ambulatory ECG. This tool could potentially lay the foundation for a new approach towards SCD risk management and improve patient outcomes.

Supplementary data

Supplementary data are available at European Heart Journal online.

Declarations

Disclosure of Interest

L.F. is a medical expert at Cardiologs, a Philips company. T.C., J.L., and C.H. are employed at Cardiologs. J.P.S. is a consultant for Abbott Inc., Boston Scientific, Biotronik, Biosense Webster, Cardiologs Inc., CVRx Inc., EBR Systems Inc., Implicity Inc., Impulse Dynamics, Rhythm Management Group, Medtronic Inc., Sanofi Inc., and WebMD. K.N. has no competing interests. E.M. is a consultant for Boston Scientific, Medtronic, Abbott, and Zoll, and received research grants from Abbott, Boston Scientific, Medtronic, Microport, and Biotronik.

Data Availability

All the data collected for the study cannot be publicly released due to European regulations and the requirement of permission from the original data owners for research purposes. However, if you have academic inquiries, you can reach out to the authors (E.M., Université Paris Cité: [email protected]) to obtain access to de-identified data through a Data Transfer Agreement procedure. Please note that the AI algorithm used in the study is patented, and we do not have the rights to share it. If you are interested in using the algorithm for an academic project, you can contact the authors (L.F.: [email protected] or T.C.: [email protected]) to discuss the possibility of applying for an agreement procedure with Cardiologs/Philips. The Python code used to generate the data and perform analyses is available at https://github.com/carbonati/predict-vt.

Funding

Cardiologs enabled the collection of the database and provided the human resources necessary for data management and for the development and validation of the algorithm.

Ethical Approval

The study protocol was approved by the local Institutional Review Board, and the need for individual informed consent was waived.

Pre-registered Clinical Trial Number

Not applicable. Since this is a retrospective and non-interventional study, there was no pre-registered clinical trial number.

References

1. Marijon, Narayanan, Smith, Barra, Basso, Blom, et al. The Lancet Commission to reduce the global burden of sudden cardiac death: a call for multidisciplinary action. Lancet 2023;402:883–936. doi:10.1016/S0140-6736(23)00875-9
2. Lown, Wolf. Approaches to sudden death from coronary heart disease. Circulation 1971;130–. doi:10.1161/01.CIR.44.1.130
3. Marijon, Garcia, Narayanan, Karam, Jouven. Fighting against sudden cardiac death: need for a paradigm shift-adding near-term prevention and pre-emptive action to long-term prevention. Eur Heart J 2022;1457–. doi:10.1093/eurheartj/ehab903
4. Sasson, Rogers MAM, Dahl, Kellermann. Predictors of survival from out-of-hospital cardiac arrest. Circ Cardiovasc Qual Outcomes 2010. doi:10.1161/CIRCOUTCOMES.109.889576
5. Goldberger, Cain, Hohnloser, Kadish, Knight, Lauer, et al. American Heart Association/American College of Cardiology Foundation/Heart Rhythm Society scientific statement on noninvasive risk stratification techniques for identifying patients at risk for sudden cardiac death. A scientific statement from the American Heart Association Council on Clinical Cardiology Committee on Electrocardiography and Arrhythmias and Council on Epidemiology and Prevention. J Am Coll Cardiol 2008;1179–. doi:10.1016/j.jacc.2008.05.003
6. Trayanova, Topol. Deep learning a person’s risk of sudden cardiac death. Lancet 2022;399:1933. doi:10.1016/S0140-6736(22)00881-9
7. Stecker, Vickers, Waltz, Socoteanu, John, Mariani, et al. Population-based analysis of sudden cardiac death with and without left ventricular systolic dysfunction: two-year findings from the Oregon Sudden Unexpected Death Study. J Am Coll Cardiol 2006;1161–. doi:10.1016/j.jacc.2005.11.045
8. Coumel. The management of clinical arrhythmias. An overview on invasive versus non-invasive electrophysiology. Eur Heart J 1987. doi:10.1093/oxfordjournals.eurheartj.a062259
9. Shen, Zipes. Role of the autonomic nervous system in modulating cardiac arrhythmias. Circ Res 2014;114:1004–. doi:10.1161/CIRCRESAHA.113.302549
10. LeCun, Bengio, Hinton. Deep learning. Nature 2015;521:436–.
11. Attia, Noseworthy, Lopez-Jimenez, Asirvatham, Deshmukh, Gersh, et al. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. Lancet 2019;394:861–. doi:10.1016/S0140-6736(19)31721-0
12. Singh, Fontanarava, de Massé, Carbonati, Henry, et al. Short-term prediction of atrial fibrillation from ambulatory monitoring ECG using a deep neural network. Eur Heart J Digit Health 2022;208–. doi:10.1093/ehjdh/ztac014
13. Savage. Breaking into the black box of artificial intelligence. Nature 2022. doi:10.1038/d41586-022-00858-1
14. Zeppenfeld, Tfelt-Hansen, de Riva, Winkel, Behr, Blom, et al. 2022 ESC guidelines for the management of patients with ventricular arrhythmias and the prevention of sudden cardiac death. Eur Heart J 2022;3997–4126. doi:10.1093/eurheartj/ehac262
15. Al-Khatib, Stevenson, Ackerman, Bryant, Callans, Curtis, et al. 2017 AHA/ACC/HRS guideline for management of patients with ventricular arrhythmias and the prevention of sudden cardiac death. Circulation 2018;138:e272–391. doi:10.1161/CIR.0000000000000549
16. Fiorina, Maupain, Gardella, Manenti, Salerno, Socie, et al. Evaluation of an ambulatory ECG analysis platform using deep neural networks in routine clinical practice. J Am Heart Assoc 2022;e026196. doi:10.1161/JAHA.122.026196
17. Zhang, Ren, Sun. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas: IEEE, 2016, 770–.
18. Ilse, Tomczak, Welling. Attention-based deep multiple instance learning. arXiv, doi:10.48550/arXiv.1802.04712, 13 February 2018, preprint: not peer reviewed.
19. Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, et al. Attention is all you need. In: Guyon, Luxburg, Bengio, Wallach, Fergus, Vishwanathan (eds.), Advances in Neural Information Processing Systems. Curran Associates, Inc., 2017, 5998–6008.
20. Zhang, Zhu, Zhang, Lin, et al. ResNeSt: split-attention networks. arXiv, doi:10.48550/arXiv.2004.08955, 19 April 2020, preprint: not peer reviewed.
21. Selvaraju, Cogswell, Das, Vedantam, Parikh, Batra. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017, 618–.
22. Sundararajan, Taly, Yan. Axiomatic attribution for deep networks. arXiv, doi:10.48550/arXiv.1703.01365, 4 March 2017, preprint: not peer reviewed.
23. DeLong, Clarke-Pearson. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;837–.
24. Marijon, Uy-Evanado, Dumas, Karam, Reinier, Teodorescu, et al. Warning symptoms are associated with survival from sudden cardiac arrest. Ann Intern Med 2016;164–.
25. Lin C-S, Liu W-T, Tsai D-J, Lou Y-S, Chang C-H, Lee C-C, et al. AI-enabled electrocardiography alert intervention and all-cause mortality: a pragmatic randomized clinical trial. Nat Med 2024;1461–. doi:10.1038/s41591-024-02961-4
26. Cohen-Shelly, Attia, Friedman, Ito, Essayagh, et al. Electrocardiogram screening for aortic valve stenosis using artificial intelligence. Eur Heart J 2021;2885–. doi:10.1093/eurheartj/ehab153
27. W-Y, Siontis, Attia, Carter, Kapa, Ommen, et al. Detection of hypertrophic cardiomyopathy using a convolutional neural network-enabled electrocardiogram. J Am Coll Cardiol 2020;722–. doi:10.1016/j.jacc.2019.12.030
28. Kwon J-M, Kim K-H, Jeon K-H, Lee, Park B-H. Artificial intelligence algorithm for predicting cardiac arrest using electrocardiography. Scand J Trauma Resusc Emerg Med 2020. doi:10.1186/s13049-020-00791-0
29. Arun, Gaw, Singh, Chang, Aggarwal, Chen, et al. Assessing the trustworthiness of saliency maps for localizing abnormalities in medical imaging. Radiol Artif Intell 2021;e200267. doi:10.1148/ryai.2021200267

Author notes

Laurent Fiorina and Tanner Carbonati contributed equally to this work.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].
