Yonathan Schwammenthal, Tom Rabinowitz, Lina Basel-Salmon, Reut Tomashov-Matar, Noam Shomron, Noninvasive fetal genotyping using deep neural networks, Briefings in Bioinformatics, Volume 26, Issue 1, January 2025, bbaf067, https://doi.org/10.1093/bib/bbaf067
Abstract
Circulating cell-free DNA (cfDNA) is a powerful diagnostic tool that is widely studied in the context of liquid biopsy in oncology and other fields. In obstetrics, maternal plasma cfDNA has already proven its utility, enabling noninvasive prenatal testing (NIPT), which has become a standard for detecting chromosomal aberrations. However, identification of point mutations responsible for monogenic diseases (NIPT-M) remains limited, even when accounting for fragment-specific characteristics (i.e. fragmentomics). While genotyping of individual genomes is performed today using deep learning (DL) algorithms, cfDNA-based noninvasive fetal genotyping is performed only using traditional statistical and machine-learning methods. This study introduces the first DL-based framework for cfDNA-based genotyping, heralding a significant stride toward genome-wide NIPT-M. Using unique ultra-deep whole genome sequencing (WGS) data, we were motivated to develop an efficient model, especially when compared with current DL methods for WGS. This facilitates the integration of previously overlooked levels of information, encompassing DNA nucleotides, fragments, mutation regions, samples, and familial traits. Employing this novel approach, we surpass the performance of existing methodologies, successfully detecting three deleterious mutations, and allowing for NIPT-M as early as the 7th week of gestation. Our proposed approach brings genome-wide NIPT for all mutation types closer to clinical feasibility, enabling families and healthcare providers to make well-informed decisions and alleviating the anxieties and uncertainties associated with pregnancy.
Introduction
Noninvasive prenatal testing (NIPT), utilizing cell-free fetal DNA (cfDNA), has transformed prenatal diagnostics by predicting genetic variants inherited by the fetus using maternal blood samples. NIPT is adept at detecting chromosomal and large sub-chromosomal disorders [1–3], and has been extended to detect monogenic disease mutations (NIPT-M) [4, 5]. Nonetheless, its application is currently limited to particular mutations or genes, as it relies on tailored assays that cannot provide genome-wide analysis [2, 6].
Several experimental methods for genome-wide NIPT-M have been developed [7–10], including notable recent efforts to utilize deep whole-exome sequencing (WES) to overcome this challenge [11, 12]. These methods are typically accurate in late pregnancy stages, with extremely high fetal cfDNA fractions (FF) (median 29%) [12], yet actionable results are more feasible at early stages. We previously introduced Hoobari, which was the first to utilize differences between maternal and fetal cfDNA fragments, enabling improved performance, particularly for maternal inheritance [6, 13, 14]. Based on a Bayesian algorithm, it adheres to the fundamental principles of genomic variant analysis, and it has remained the only bioinformatic tool for NIPT-M to date. Nevertheless, similar to prior methods, it is limited by low FF, as well as in cases of a shared parental mutation. The lower FF in the first trimester hinders accurate prediction due to maternal background noise, with increased sequencing depth offering limited improvement as it introduces amplification biases. Consequently, effective genome-wide NIPT-M solutions are still lacking.
The integration of artificial intelligence (AI), particularly deep learning (DL), into biosciences has yielded significant tools such as AlphaFold for protein structure prediction [15]. In genomic variant analysis, the common practice has shifted from Bayesian and subsequent ML techniques [16] to comprehensive DL methodologies, as demonstrated by Google’s DeepVariant [17]. Its effectiveness is most pronounced in addressing challenging scenarios, such as small insertions-deletions (indels), noisy WES data, and low coverage whole genome sequencing (WGS). Unlike traditional models that require manual customization for each data type, it can learn from the inherent noise and biases present in its training datasets. Previously, we suggested redefining genome-wide NIPT-M as a unique case of variant analysis, hoping this would open further exploratory directions. Motivated by the successful application of DL to variant analysis, we focus our effort on applying neural networks to noninvasive fetal genotyping, evolving our Hoobari method with DL advancements.
In this study, we introduce DeepHoobari, the first DL-based method for NIPT-M. DeepHoobari functions similarly to its predecessor, Hoobari, as a variant-calling software that produces a Variant Call Format (VCF) file, containing a list of genomic variants and predicted genotypes. However, instead of using a naïve Bayes model, it calculates genotype probabilities using a DL model. This methodology was developed using a labeled dataset from 10 family trios with WGS, a significantly larger dataset than those used in prior research. The dataset comprises genomic DNA from parents, maternal plasma, and a fetal sample obtained via invasive means, serving as the ground truth for the analysis. Maternal plasma cfDNA was sequenced to ultra-deep coverage. The volume of data necessitated the creation of more efficient tensors and models compared to previous AI genomics methods, since NIPT-M requires the integration of data from hundreds of DNA fragments. This model also pioneers the incorporation of fragmentomics research to discern cfDNA fragment sources. Hence, our method advances beyond prior DL genomics techniques by combining manual feature crafting with the pattern recognition capabilities of AI.
Our approach achieves state-of-the-art results for genome-wide NIPT-M, enabling noninvasive genotyping from the 7th week of gestation. It accurately identified three harmful mutations associated with monogenic disorders in tested samples, outperforming our prior technique that succeeded in only one instance. This encompassed the detection of both heterozygous and homozygous mutations related to Infantile Convulsions and Paroxysmal Choreoathetosis (ICCA) Syndrome and Glutaric Acidemia Type IIC, respectively. Our findings suggest that AI can bridge the gap between current sequencing technologies and the precision necessary for clinical application. This paper outlines our study and the key aspects of our methodology, underscoring its innovative contributions to NIPT-M.
Results
In order to train and test our algorithm, we collected and sequenced samples from a total of 10 family trios. For each trio, the parental genomic DNA samples and the plasma DNA samples were used for genotype prediction, while another sample that was invasively acquired was used as ground truth. Once genetic information was gathered, the next step was organizing it for training and evaluation. This involved splitting the data into training, validation, and test sets, as commonly practiced when developing deep learning models (Supplementary Table S1, Fig. 1). To prevent overfitting of the model to our data, multiple precautions were used (see Supplementary Material).

Study design. Upper panel: (A) family trios are recruited to the study. (B) Paternal (black) and maternal (red) genomic DNA is extracted from peripheral blood mononuclear cells (PBMCs); fetal (white) and maternal cfDNA is extracted from maternal plasma; fetal genomic DNA is collected through an invasive procedure (amniocentesis or chorionic villus sampling, CVS). (C) Parental and fetal genomic DNA as well as feto-maternal cfDNA are mapped to the human reference genome, and candidate variants are called by comparison of mapped DNA fragments to the human reference genome. (D) Sequenced cfDNA data is analyzed and genotyped using a neural network, and (E) deleterious mutations are reported. Lower panel: The complete family trio dataset was split into training (purple), validation (light blue), and test (green) sets. The model improves during iterative training, both through the learning process and by optimizations introduced following the validation results. Eventually, the test set variants are compared to the labels from the fetal genomic DNA.
Variants were represented as tensors consisting of various features from the maternal, paternal and plasma DNA fragments covering the relevant genomic location, as well as sample-level information. Features were calculated using a combination of in-house code and off-the-shelf software tools (see Materials and Methods). Each variant tensor was labeled based on the genotype called in the invasively acquired sample. A neural network was trained and validated on the data multiple times, each round followed by refinement of the data representation, hyper-parameters, and the model itself. This was performed iteratively until a final model was selected based on performance in both the training and validation sets. The model was then evaluated once using the test set.
The fetal variants were categorized into three potential labels: 0/0, 0/1, and 1/1, corresponding to the different allele combinations, i.e. homozygous for the human reference genome allele at the position, heterozygous, and homozygous for the alternate allele. This makes the variant prediction a classification task with three classes. To further simplify, we reduced the number of classes from three to two, resulting in binary classification (Supplementary Table S2).
Genome-wide NIPT-M performance
Our evaluation employed the area under the ROC curve (AUC-ROC), and the average precision (AP) of the precision-recall curves, across different parental genotype categories and families. To compare the proposed model and previous benchmarks, AUC was calculated in the same manner as in previous studies [9, 13], which is by binarizing the fetal genotypes (Supplementary Table S2).
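To make the two metrics concrete, the following is a minimal plain-Python sketch of how AUC-ROC and AP can be computed from binarized labels and predicted probabilities. The function names and the toy data are hypothetical and are not taken from the DeepHoobari code; the binarization itself follows Supplementary Table S2 and is not reproduced here.

```python
def roc_auc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """AP as the mean precision at the rank of each positive.

    Assumes distinct scores (no tie handling) to keep the sketch short.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one positive label")
    return sum(precisions) / len(precisions)


# Toy example: 1 = binarized positive genotype, 0 = negative (hypothetical data)
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

In practice both metrics would be computed separately for each family and parental genotype category, as in the comparison that follows.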
We compared the results of our model to the Bayesian method in three inheritance categories, according to the heterozygosity of the variants in the parental DNA: paternal-heterozygous mutations, in which the father is heterozygous and the mother is homozygous; maternal-heterozygous mutations, where the mother is heterozygous and the father is homozygous; and biparental, i.e. dual heterozygous mutations, where both parents are heterozygous (Table 1, Fig. 2). Remarkably high AUC values (0.996–0.999) were achieved for paternally transmitted variants. Maternal variants followed, exhibiting AUC values of 0.715–0.878. Particularly intriguing predictions were seen in biparental variants, with AUC values ranging between 0.656 and 0.842, highlighting the challenge in predicting such instances, yet showing promising results.
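The three inheritance categories can be derived mechanically from the parental genotypes. The sketch below is only an illustration under our own assumptions: the function name is hypothetical, and genotypes are assumed to be VCF-style strings such as `0/1`.

```python
def inheritance_category(maternal_gt, paternal_gt):
    """Classify a site by parental heterozygosity.

    Returns "paternal", "maternal", "biparental", or None when both
    parents are homozygous (such sites fall outside the three categories).
    """
    def is_het(genotype):
        # Accept both unphased ("0/1") and phased ("0|1") notation
        first, second = genotype.replace("|", "/").split("/")
        return first != second

    mother_het = is_het(maternal_gt)
    father_het = is_het(paternal_gt)
    if mother_het and father_het:
        return "biparental"
    if father_het:
        return "paternal"
    if mother_het:
        return "maternal"
    return None
```

For example, `inheritance_category("0/0", "0/1")` falls in the paternal-heterozygous category, while `inheritance_category("0/1", "0/1")` is biparental.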
Family | FF | Inheritance | AUC-ROC (Hoobari) | AUC-ROC (DeepHoobari) | AP (Hoobari) | AP (DeepHoobari)
---|---|---|---|---|---|---
TST01 (7 weeks gestation) | 8.90% | Paternal | 0.9760 | 0.9994 | 0.8461 | 0.9939
TST01 (7 weeks gestation) | 8.90% | Maternal | 0.7738 | 0.8025 | 0.6646 | 0.7358
TST01 (7 weeks gestation) | 8.90% | Both parents | 0.5749 | 0.7189 | 0.4299 | 0.6356
TST01 (19 weeks gestation) | 10.44% | Paternal | 0.9884 | 0.9997 | 0.8145 | 0.9882
TST01 (19 weeks gestation) | 10.44% | Maternal | 0.8750 | 0.8778 | 0.6266 | 0.6855
TST01 (19 weeks gestation) | 10.44% | Both parents | 0.6521 | 0.8424 | 0.4175 | 0.6153
TST02 | 7.02% | Paternal | 0.9674 | 0.9963 | 0.8616 | 0.9884
TST02 | 7.02% | Maternal | 0.6817 | 0.7153 | 0.6549 | 0.7153
TST02 | 7.02% | Both parents | 0.5102 | 0.6562 | 0.4176 | 0.5563
TST03 | 15.18% | Paternal | 0.9698 | 0.9976 | 0.8372 | 0.9973
TST03 | 15.18% | Maternal | 0.7299 | 0.7601 | 0.8480 | 0.8749
TST03 | 15.18% | Both parents | 0.5218 | 0.7105 | 0.5523 | 0.8066

ROC curves (A–C) and precision-recall (PR) curves (D–F) of the test families, comparing Hoobari (Bayesian model) with DeepHoobari (DL model). AUC is the area under the ROC curve, and AP is the average precision. Paternal, maternal and biparental refer to the heterozygosity of the parental DNA, i.e. whether the father, the mother or both parents are heterozygous. Curves show the average (bold line) and range (transparent color) across the four test samples, for Hoobari (red) and DeepHoobari (blue). The lower end of the range corresponds to families with low FF, and the higher end corresponds to the high FF families.
Figure 2D–F further illustrates these trends in performance. The difference in AP between panels D and E can be attributed to the model’s improved ability to distinguish fetal cfDNA from background maternal DNA in paternal-heterozygous mutations, where the fetal signal is more distinct and less confounded by maternal sequences. In contrast, biparental variants remain the most challenging due to overlapping signals from dual heterozygous mutations, which complicate precise predictions. Nevertheless, the DL model demonstrated clear advantages over the Bayesian method even in these difficult cases, underscoring its robustness across all inheritance scenarios.
Notably, the above comparison shows that the DL-based approach surpasses the previous method across all variant categories and test samples. The largest improvement is in the genotyping of biparental cases, where AUC increased by 0.14 to 0.19 points. Such biparental variants are typically excluded from previous studies because they are difficult to analyze.
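The size of the biparental improvement can be checked directly against the "Both parents" rows of Table 1. The short calculation below recomputes the per-sample AUC-ROC gain; the variable names are ours, and the values are copied from Table 1.

```python
# Biparental AUC-ROC values from Table 1, as (Hoobari, DeepHoobari) pairs
biparental_auc = {
    "TST01 (GW 7)": (0.5749, 0.7189),
    "TST01 (GW 19)": (0.6521, 0.8424),
    "TST02": (0.5102, 0.6562),
    "TST03": (0.5218, 0.7105),
}

# Gain of the DL model over the Bayesian model, rounded to four decimals
gains = {
    sample: round(deep - bayes, 4)
    for sample, (bayes, deep) in biparental_auc.items()
}

smallest_gain = min(gains.values())
largest_gain = max(gains.values())
print(gains)
print(f"gain range: {smallest_gain:.4f} to {largest_gain:.4f}")
```

The smallest gain is for TST01 at week 7 and the largest for TST01 at week 19, which rounds to the 0.14 to 0.19 range.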
A discrepancy was noticed within family TST01 when the week 19 sample was compared with the week 7 sample: although FF was lower at week 7 than at week 19 (8.90% versus 10.44%), AP was higher at week 7 in all three inheritance categories, whereas AUC-ROC was higher at week 19 (Table 1). This intriguing finding could mean that FF, while being the primary predictor of performance, is not the only one. Several factors may explain this result, including subtle dissimilarities in sample processing and sequencing procedures between the two time points. These distinctions might have rendered the TST01_7 sample more similar to the training data, facilitating enhanced model generalization and subsequently yielding improved results compared to TST01_19.
Model performance in low depth of coverage
One advantage of DL-based models over traditional models is their robustness to low quality data, e.g. samples with low sequencing depth [18]. This phenomenon can be attributed to hidden noise and bias patterns that are too complex to model manually but can be learned by AI. Our model demonstrates a distinctive robustness to variations in data quality, for instance, in scenarios with low depth of coverage (Fig. 3). While the Bayesian model exhibits a performance decrease under such conditions, the DL model maintains high accuracy. Hence, the advantage of our model becomes more pronounced as the sequencing depth decreases. As shown in Fig. 3, the advantage of the DL-based model lies primarily in its improved specificity. One possible explanation for this is the naïve Bayes independence assumption, which can lead to an overestimation of the importance of individual mutation-supporting evidence (e.g. read alignments). In contrast, DL-based models, particularly CNNs, evaluate all evidence simultaneously, allowing them to leverage contextual information, such as adjacent reads or neighboring nucleotides on the analyzed DNA sequence. This is indicative of their ability to extract and leverage information from sparse data, a scenario that is especially important in NIPT-M, as further detailed in the Discussion section. This robustness is particularly beneficial in early pregnancy, such as in the 7-week gestation sample analyzed in our study, where sequencing often faces challenges such as low input and low coverage, especially when using PCR-free methods to minimize amplification-related errors and allelic bias.

Performance of Hoobari (Bayes) versus DeepHoobari (DL) in varying depth of coverage. Paternal, maternal, Biparental and all correspond to the heterozygosity of the parental DNA; where either the father, mother or both parents are heterozygous, or all categories combined.
Clinical analysis
To demonstrate our approach in a clinical scenario, we sought to showcase the prediction of disease-causing single nucleotide mutations detected in the parents in two families (Table 2). In family TST01, the parents were known carriers of the same mutation in the MED17 gene, leading to autosomal recessive (AR) infantile convulsions and paroxysmal choreoathetosis (ICCA) syndrome. Appearance of the same mutation in both parents is a scenario that is common in consanguineous families and in homogeneous populations with a strong founder effect. Such mutations, called homozygous mutations, are extremely difficult to predict correctly, and are typically excluded from genome-wide NIPT-M studies. The fetus in this family was a healthy carrier. In week 7, both algorithms incorrectly predicted a healthy non-carrier fetus, which is a tolerable mistake, as it is still considered a negative result. However, in the sample from week 19, the Bayesian algorithm incorrectly predicted an affected fetus, a result that would require validation through an invasive procedure. The DL-based solution, however, correctly predicted the genotype in this sample, thus preventing an unnecessary procedure. Moreover, in the week 7 sample, the probability given by the DL solution to the non-carrier genotype was lower than the one given by the Bayesian approach. In clinical scenarios, a low-probability prediction is treated as a no-call, rather than as a wrong prediction.
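The no-call logic described above can be sketched as a simple confidence threshold on the predicted genotype probabilities. This is an illustration only: the function name, the example probabilities and the 0.9 cutoff are hypothetical and are not values used in this study.

```python
def call_genotype(probabilities, min_confidence=0.9):
    """Return the most probable genotype, or "no-call" if it is not confident enough.

    probabilities: dict mapping a genotype label (e.g. "0/1") to its probability.
    min_confidence: hypothetical cutoff; a real assay would calibrate this value.
    """
    genotype, probability = max(probabilities.items(), key=lambda item: item[1])
    if probability < min_confidence:
        return "no-call"
    return genotype


# Hypothetical outputs for one site: a confident call and an uncertain one
confident = call_genotype({"0/0": 0.02, "0/1": 0.95, "1/1": 0.03})
uncertain = call_genotype({"0/0": 0.55, "0/1": 0.40, "1/1": 0.05})
```

Under such a rule, a wrong genotype that receives a low probability is reported as a no-call instead of a false result, which is why a lower probability on an incorrect prediction is preferable.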
Family | TST01 | TST03
---|---|---
Phenotype | Infantile Convulsions and Paroxysmal Choreoathetosis (ICCA) Syndrome | Glutaric Acidemia Type IIC
Gene | MED17 | ETFDH
Mutation (a) | chr11:93796509 T>C | Maternal: chr4:158706356 A>C; Paternal: chr4:158708525 T>G
Mother | Carrier | Carrier
Father | Carrier | Carrier
Mode of Inheritance | AR (b) (homozygous mutation) | AR (compound heterozygous)
Offspring (ground truth) | Healthy (c) (carrier) | Affected
Bayesian prediction | GW (d) 7: Healthy (non-carrier); GW 19: Affected | Affected
DL prediction | GW 7: Healthy (non-carrier); GW 19: Healthy (carrier) | Affected

(a) Reference genome build: GRCh38.
(b) AR: autosomal recessive.
(c) In clinical terms, “healthy” and “affected” are reported as low-risk and high-risk results, respectively.
(d) GW: gestational week.
In family TST03, each parent carried a different mutation in the ETFDH gene, leading to AR Glutaric Acidemia Type IIC. This condition typically manifests with facial deformities, enlarged liver, hypoglycemia, acidosis, muscle weakness and heart conditions [19]. This mode of inheritance is termed compound heterozygosity, sometimes referred to as a heterozygous recessive mutation. Predicting inheritance of the paternal mutation is straightforward, which makes it possible to rule out disease when the paternal mutation is not inherited. However, predicting inheritance of the maternal mutation is bioinformatically challenging, which makes it difficult to predict a high risk of an affected fetus. In this case, both the DL and Bayesian approaches correctly predicted an affected fetus.
Discussion
This study introduced a DL-based model advancing NIPT-M, demonstrating noninvasive prediction of fetal inheritance with notable accuracy. The model outperformed existing standards across inheritance categories, especially in the most challenging scenario of dual heterozygosity. Utilizing ultra-deep WGS of cfDNA from a dataset of 10 family trios, each contributing millions of genetic variants, the approach processed extensive genetic variant data, highlighting its comprehensiveness. For comparison, the first attempts at DL variant calling and non-DL fetal genotyping involved a single genome [7]. The model accurately assessed mutation inheritance in all test cases involving parental carriers of deleterious SNVs, and was effective as early as the 7th week of gestation, the earliest week in which genome-wide NIPT-M has been performed so far.
Our method is the first instance of a DL model being applied to NIPT-M, and one of a handful that apply DL to NIPT in general. Its success indicates potential broader uses in variant calling, pushing genomic research boundaries. This is particularly relevant in tumor liquid biopsy, where DL models could offer notable advantages. DeepHoobari, unlike previous models such as Hoobari, incorporates complex relationships between DNA fragments within a unified tensor, enabling contextual analysis not possible with Bayesian models that treat fragments as independent variables. Another example of the potential of using a DL model was shown in cases of sparse data, e.g. low FF in early pregnancy stages, where it maintains robust performance. Such capability in both high and low FF scenarios distinguishes our model from recent methods, especially in early pregnancy stages where accurate diagnostics are crucial for actionable clinical outcomes.
Our model incorporates novel aspects in design and data representation, diverging from previous genomics DL methods due to unique data characteristics. For example, full nucleotide sequencing of each fragment, despite providing raw data which can contribute hidden features, offered limited benefits for its high computational demand, leading us to more efficient data integration strategies that minimized redundancy, utilizing diverse data sources. This approach, prompted by data complexity and resource limitations, may have broader implications for variant calling. For instance, our use of trio context data prefigured similar developments [20], affirming the validity of our method. Additionally, we integrated fragmentomics, studying cfDNA fragment distinctions, unlike prior DL variant calling models that overlooked such features due to their development using lab-induced, artificial DNA fragmentation. Incorporating fragment length and per-fragment FF, as initially proposed in Hoobari, our model uniquely applies these fragmentomics insights, enhancing its applicability and efficiency in genomics.
This study represents a notable leap in noninvasive fetal variant prediction, though it is crucial to note its limitations. The research primarily focused on SNVs, with small indels presenting a future research opportunity. Enhancing model explainability is vital for clinical adoption, given the challenges of using "black box" models in medical settings. While our dataset splitting strategy minimized data leakage risk, this could still occur if many rare variants are shared across training and test families. Using unrelated families, where low-frequency variants are generally not shared, and focusing on DNA-related features rather than genomic coordinates further reduces this risk. Future studies should aim to minimize shared rare variants to enhance model generalizability. Efforts to boost accuracy, especially for maternally inherited genes and at lower FFs, are ongoing and may benefit from larger training datasets and the integration of haplotype information. While the model demonstrates strong performance at low depths of coverage, further exploration of features related to false-positive and false-negative results would be beneficial. Such an investigation could contribute to the development of improved models. Overall, these enhancements promise to refine the precision and applicability of the model across genetic variations. Ultra-deep WGS limits the applicability of our method: it is costly, requires sufficient genetic material in early pregnancy stages, and increases computational demand, especially when combined with DL methods that require graphics processing units (GPUs). Despite relying on ultra-deep WGS, the flexibility of DL methods to adapt to various sequencing data, including WES and newer, more affordable technologies [21], suggests a broad applicability.
This study underscores a significant proof-of-concept toward achieving NIPT that is equivalent to invasive tests, heralding a future of risk-free prenatal assessment.
Methods
Familial DNA acquisition
Sample collection and DNA extraction
Ten families were recruited by the Raphael Recanati Genetic Institute at Rabin Medical Center (Supplementary Table S1). Samples from the families were collected during 7–33 (median 12) weeks of gestation with informed consent. Maternal and paternal WGS data were obtained from peripheral blood mononuclear cells (PBMCs), while cfDNA was sequenced from maternal plasma. For fetal genotype validation, DNA was extracted from pure fetal tissue obtained through amniocentesis or CVS. The DNA extraction process from CVS or amniocentesis samples utilized the magLEAD 12gC, MagDEA Dx kit (ExScale, Chiba, Japan). Maternal blood was collected using 2–4 Ethylene-diamine-tetra-acetic acid (EDTA) tubes, followed by plasma separation through centrifugation at room temperature for 10 min at 1600 × g. A subsequent centrifugation step at 16 000 × g for 10 min at room temperature removed any residual cells. The extraction of cfDNA was carried out using the QIAamp Circulating Nucleic Acid Kit (Qiagen). Parental genomic DNA was extracted from PBMCs following a routine protocol, which involved buffy coat separation and DNA purification using the magLEAD 12gC, MagDEA Dx kit (ExScale, Chiba, Japan), in accordance with the manufacturer’s instructions. The collection and purification of pure paternal DNA followed a similar procedure.
Library preparation and sequencing
WGS libraries for the aforementioned samples were prepared using the TruSeq DNA PCR-Free Library Prep Kit (Illumina) for genomic DNA and the Accel-NGS 2S PCR-free Library Prep Kit for cfDNA samples, following the manufacturers’ guidelines. Subsequently, NovaSeq (Illumina) sequencing was employed, generating 150 bp paired-end reads for all DNA samples within each family. The sequencing procedures adhered to a standard target depth of 30× for parental and fetal genomic DNA libraries, while cfDNA underwent sequencing with a target depth of 300×.
Variant candidate computational representation
Following the splitting of the datasets and the sampling process, variants were transformed into tensors. This process involved integrating information from various sources, including the maternal and paternal genomic DNA, and plasma cfDNA. It also involved combining different levels of data, from the coarsest to the finest resolution: (i) sample-level features, e.g. the fetal fraction; (ii) genomic site-level features, e.g. the parental genotype, the chromosome and position; (iii) fragment-level features, such as the quality of mapping to the reference genome; and (iv) nucleotide-level features, like the supported allele. All information was then embedded into a feature matrix for each fetal variant candidate, as illustrated in Supplementary Fig. S1. Features were calculated mainly using in-house code, and partly using off-the-shelf tools, such as SAMtools [22].
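The idea of embedding several levels of information into one matrix per variant candidate can be sketched as follows. This is a simplified illustration and not the layout of Supplementary Fig. S1: the function name, the dictionary keys, the choice of six columns and the fixed number of fragment rows are all our assumptions.

```python
def build_variant_matrix(fragments, site, sample, max_fragments=4):
    """Build a fixed-size feature matrix (list of rows) for one variant candidate.

    Each row describes one cfDNA fragment covering the site. Site-level and
    sample-level values are shared by all fragments, so they are repeated in
    every row. Rows beyond the available fragments are zero-padded.
    """
    shared = [
        float(site["maternal_alt_alleles"]),  # site-level: maternal genotype (0, 1 or 2)
        float(site["paternal_alt_alleles"]),  # site-level: paternal genotype (0, 1 or 2)
        float(sample["fetal_fraction"]),      # sample-level: fetal fraction
    ]
    rows = []
    for fragment in fragments[:max_fragments]:
        rows.append([
            float(fragment["supports_alt"]),  # nucleotide-level: supported allele
            float(fragment["mapq"]),          # fragment-level: mapping quality
            float(fragment["length"]),        # fragment-level: fragment length
        ] + shared)
    width = 3 + len(shared)
    while len(rows) < max_fragments:
        rows.append([0.0] * width)            # padding row for missing fragments
    return rows


# Hypothetical candidate covered by two plasma fragments
matrix = build_variant_matrix(
    fragments=[
        {"supports_alt": 1, "mapq": 60, "length": 143},
        {"supports_alt": 0, "mapq": 58, "length": 167},
    ],
    site={"maternal_alt_alleles": 0, "paternal_alt_alleles": 1},
    sample={"fetal_fraction": 0.089},
)
```

A fixed matrix shape lets candidates with different numbers of covering fragments be batched together, at the cost of padding or truncation.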
Model development
To develop an accurate, generalizable and robust model, we followed a systematic methodology (see Supplementary Material). We explored various model architectures and data representations, training on the designated training set, evaluating on the validation set, and iteratively refining our approach. Compared architectures included recurrent neural networks (RNNs) such as Long Short-Term Memory (LSTM), different convolutional neural networks (CNNs), various attention solutions and ensembles of several sub-networks. Our findings indicated that ResNet34 [23] performed best on this problem.
We then compared our architecture to Hoobari using the same data as in Rabinowitz et al., 2019 [13]. Model development initially relied on three families (G1, G2, G5) described in the same study. However, each of these families was sequenced using different machine types, protocols, and techniques (e.g. read lengths in G1–2 were 75 bp, compared with 151 bp in G5), which led to challenges in uniformity and comparability. Models trained on G1–2 performed better but failed to generalize to G5 due to technical discrepancies. Alternatively, training on single families also introduced bias and noise. Ultimately, the study focused on 10 new samples for consistency, with G1, G2, and G5 used solely during the early stages of model development, as presented in the supplementary material (Supplemental Table S4).
An important part of our training process was hyperparameter optimization, for which we used the widely used Optuna Hyperparameter Optimization Framework [24] to generate multiple trials. In a series of experiments, we tuned the number of training steps and the value ranges of the hyperparameters: from one experiment to the next, we increased the number of training steps while decreasing the number of trials and narrowing the value ranges, until the evaluation loss plateaued.
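The staged schedule described above can be sketched as follows. This is an assumption-laden illustration, not the study's code: plain random sampling stands in for Optuna's samplers so that the example has no dependencies, and the function names, the halving and doubling factors, and the stopping tolerance are all hypothetical.

```python
import random
from typing import Callable, Dict, Tuple

Range = Tuple[float, float]

def narrow(bounds: Range, best: float, shrink: float = 0.5) -> Range:
    """Shrink a range around the best value, staying inside the old bounds."""
    lo, hi = bounds
    half = (hi - lo) * shrink / 2
    return (max(lo, best - half), min(hi, best + half))

def staged_search(objective: Callable[[Dict[str, float], int], float],
                  ranges: Dict[str, Range],
                  trials: int = 32, steps: int = 100,
                  max_rounds: int = 5, tol: float = 1e-3, seed: int = 0):
    """Each round: more training steps, fewer trials, narrower ranges."""
    rnd = random.Random(seed)
    best_loss, best_params = float("inf"), {}
    for _ in range(max_rounds):
        round_loss, round_params = float("inf"), {}
        for _ in range(trials):
            # Random sampling here; Optuna would propose these values instead.
            params = {k: rnd.uniform(lo, hi) for k, (lo, hi) in ranges.items()}
            loss = objective(params, steps)
            if loss < round_loss:
                round_loss, round_params = loss, params
        improvement = best_loss - round_loss
        if round_loss < best_loss:
            best_loss, best_params = round_loss, round_params
        if improvement < tol:  # evaluation loss has plateaued
            break
        ranges = {k: narrow(r, best_params[k]) for k, r in ranges.items()}
        trials = max(2, trials // 2)
        steps *= 2
    return best_params, best_loss
```

Here `objective(params, steps)` is expected to train for `steps` steps with the given hyperparameters and return the evaluation loss.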
The ResNet34 model was then trained using our training set, until we observed a plateau in the training loss and a characteristic curvature in the validation loss. The final model was then tested once on the unseen test set.
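The text does not state how the plateau in the training loss was detected, so the following is only one simple possibility: compare the mean loss over the most recent evaluations with the mean over the evaluations just before them. The function name, window size, and tolerance are assumptions.

```python
from typing import Sequence

def has_plateaued(losses: Sequence[float], window: int = 3,
                  tol: float = 1e-3) -> bool:
    """True when the mean of the last `window` losses improves on the
    mean of the preceding `window` losses by less than `tol`."""
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return previous - recent < tol
```

Averaging over a window rather than comparing single values makes the check less sensitive to step-to-step noise in the loss.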
Datasets
To prevent overfitting of the model to our data, extra precautions were taken. In particular, information (or data) leakage was avoided through several strategies, such as using different families and even different chromosomes for each dataset: the training set contained variants from several families, consisting of all chromosomes excluding chromosomes 5 and 20, and the validation set consisted of variants from chromosome 5 of families that were not included in the training set (Supplementary Table S1). A third set, the test set, consisted of chromosome 20 variants from families excluded from both the training and validation sets (Supplementary Table S1). The test set was withheld until the study's conclusion and tested only once with the final model. Excluding entire chromosomes from the training data prevented overlapping reads or read pairs between adjacent variants from being shared across datasets, thus avoiding data leakage. Chromosome 20 was chosen for testing, as is commonly practiced in genomics, because it is small, less prone to structural variation than chromosomes 21 and 22, and has high gene and variant density, making it a robust testing choice.
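The split rule above, by family and by chromosome at the same time, can be written as a small function. This is a sketch under assumptions: the function name and the family identifiers in the usage example are hypothetical, and the actual family assignments are those given in Supplementary Table S1.

```python
from typing import Optional, Set

VALIDATION_CHROM = "5"
TEST_CHROM = "20"

def assign_split(family: str, chrom: str,
                 train_families: Set[str],
                 val_families: Set[str],
                 test_families: Set[str]) -> Optional[str]:
    """Return 'train', 'validation', 'test', or None if the variant is unused."""
    # Accept both "chr20" and "20" style chromosome names.
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    held_out = (VALIDATION_CHROM, TEST_CHROM)
    if family in train_families and chrom not in held_out:
        return "train"
    if family in val_families and chrom == VALIDATION_CHROM:
        return "validation"
    if family in test_families and chrom == TEST_CHROM:
        return "test"
    # e.g. chromosome 5 of a training family: deliberately left out.
    return None
```

For example, with hypothetical families `{"fam_a", "fam_b"}` for training, `{"fam_c"}` for validation and `{"fam_d"}` for testing, a chromosome 5 variant from `fam_a` is assigned to no set, so no chromosome and no family is shared between training and evaluation.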
After splitting the data, sets of millions of genetic variants per family were created. This was performed by randomly selecting candidate variants from the relevant chromosomes within each family, i.e. genomic loci in which either the father or the mother has a genetic variant that the fetus might inherit. Datasets were then filtered to keep only variants with sufficient data to predict upon, using previously described criteria [4], thus making sure that our results could be directly compared with existing standards, and guaranteeing a consistent evaluation process.
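A minimal sketch of the candidate selection step, assuming parental genotypes are available as alternate-allele counts per site. The type aliases and function name are hypothetical, and the filtering criteria from [4] are not reproduced here: they are represented by a caller-supplied `passes_filter` function.

```python
import random
from typing import Callable, Dict, List, Tuple

Site = Tuple[str, int]                   # (chromosome, position)
Genotypes = Dict[Site, Tuple[int, int]]  # site -> (maternal, paternal) alt-allele counts

def sample_candidates(genotypes: Genotypes, n: int,
                      passes_filter: Callable[[Site], bool],
                      seed: int = 0) -> List[Site]:
    """Randomly pick up to n candidate sites, then keep those passing the filter."""
    # A site is a candidate when at least one parent carries a
    # non-reference allele that the fetus might inherit.
    candidates = sorted(site for site, (mat, pat) in genotypes.items()
                        if mat > 0 or pat > 0)
    rnd = random.Random(seed)  # fixed seed keeps the sample reproducible
    chosen = rnd.sample(candidates, min(n, len(candidates)))
    # Filtering comes after sampling, following the order in the text.
    return [site for site in chosen if passes_filter(site)]
```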
Key Points
Noninvasive prenatal testing for point mutations causing monogenic diseases (NIPT-M) remains an unsolved challenge, restricted to late pregnancy stages where clinical value is reduced.
Previously, we framed this challenge as a specialized variant calling problem, addressing it with a Bayesian-based algorithm integrated with machine learning (ML) methods. Other methods published since then have followed this approach, even though standard genotyping of individual genomes has advanced through deep learning (DL) algorithms, such as Google's DeepVariant.
This study presents the first DL-based framework for cfDNA-based genotyping, effectively leveraging the ultra-deep whole genome sequencing (WGS) data that is typical to NIPT-M. It also accounts for fragmentomics information by learning the nuanced differences between fetal and maternal DNA fragments. These innovations can also enhance liquid biopsy techniques and advance other DL algorithms in genomics.
Using this novel approach, we outperform current methods, detecting three deleterious mutations and enabling NIPT-M as early as the 7th week of gestation.
Our method brings genome-wide NIPT for all mutation types closer to clinical implementation, enabling families and healthcare providers to make more informed decisions while reducing the uncertainties and anxieties of pregnancy.
Acknowledgments
We would like to thank Identifai-Genetics Ltd. for the access to their advanced cloud infrastructure that enabled us to train and test our model over large amounts of data. We also thank Prof. Lior Wolf who aided with the initial stages of the model development process; Meitar Grad for her assistance with sample processing; Dr. Noa Liscovitch-Brauer, Ravit Mesika and Itamar Tsayag for their valuable input; and Psomagen, Inc. for the sequencing process. This study was supported in part by a fellowship from the Edmond J. Safra Center for Bioinformatics at Tel-Aviv University. Some icons in Fig. 1 (pregnant couple; clinical report) were created using ChatGPT.
Conflict of interest: T.R. is a shareholder, and N.S. is an employee and shareholder in Identifai-Genetics Ltd.
Funding
None declared.
Data availability
The data, source code and model weights that support the findings of this study are available upon reasonable request.
Author contributions
T.R. and N.S. designed the research study; R.T.M. and L.B.S. recruited relevant participants and collected samples; T.R. defined the datasets and experimental design; Y.S. and T.R. developed the algorithm; Y.S. developed the software, conducted the training and test processes, and optimized the method; T.R. performed the clinical and depth analyses; T.R., Y.S. and N.S. wrote the manuscript; All authors read and approved the final manuscript.
Ethics declaration
Human genetic investigations were conducted following the guidelines of the primary institutional ethics committee at Rabin Medical Center, Beilinson Hospital, Petah Tikva, Israel (approval number 0825–15-RMC, granted on 12 July 2016), and were approved by The Israel Ministry of Health's Clinical Trials Department, under the national reference number 920160014. Informed consent was obtained and archived from all participants, and all clinical data have been de-identified.
References
Chan KCA, Jiang P, Sun K. et al.
Rabinowitz T, Polsky A, Golan D. et al.
Poplin R, Chang PC, Alexander D. et al.
Almogy G, Pratt M, Oberstrass F. et al.
Akiba T, Sano S, Yanase T.
Author notes
Yonathan Schwammenthal and Tom Rabinowitz contributed equally.