Noninvasive fetal genotyping using deep neural networks

Table 1

AUC-ROC and AP of the test families, Hoobari versus DeepHoobari comparison.

			AUC-ROC		AP
Family	FF	Inheritance	Hoobari	DeepHoobari	Hoobari	DeepHoobari
TST01 (7 weeks gestation)	8.90%	Paternal	0.9760	0.9994	0.8461	0.9939
		Maternal	0.7738	0.8025	0.6646	0.7358
		Both parents	0.5749	0.7189	0.4299	0.6356
TST01 (19 weeks gestation)	10.44%	Paternal	0.9884	0.9997	0.8145	0.9882
		Maternal	0.8750	0.8778	0.6266	0.6855
		Both parents	0.6521	0.8424	0.4175	0.6153
TST02	7.02%	Paternal	0.9674	0.9963	0.8616	0.9884
		Maternal	0.6817	0.7153	0.6549	0.7153
		Both parents	0.5102	0.6562	0.4176	0.5563
TST03	15.18%	Paternal	0.9698	0.9976	0.8372	0.9973
		Maternal	0.7299	0.7601	0.8480	0.8749
		Both parents	0.5218	0.7105	0.5523	0.8066

Figure 2

ROC curves (A–C) and precision-recall (PR) curves (D–F) of the test families, Hoobari (Bayesian model) versus DeepHoobari (DL model) comparison. AUC corresponds to the area under the curve, and AP is the average precision. Paternal, maternal and biparental correspond to the heterozygosity of the parental DNA, where either the father, the mother or both parents are heterozygous. Curves shown describe the average (bold line) and range (transparent color) of the curves of the four test samples, for Hoobari (red) and DeepHoobari (blue). The lower end of the range corresponds to families with low FF, and the higher end corresponds to the high FF families.

Figure 2D–F further illustrates these trends in performance. The difference in AP between panels D and E can be attributed to the model’s improved ability to distinguish fetal cfDNA from background maternal DNA in paternal-heterozygous mutations, where the fetal signal is more distinct and less confounded by maternal sequences. In contrast, biparental variants remain the most challenging due to overlapping signals from dual heterozygous mutations, which complicate precise predictions. Nevertheless, the DL model demonstrated clear advantages over the Bayesian method even in these difficult cases, underscoring its robustness across all inheritance scenarios.

Notably, the above comparison shows that the DL-based approach surpasses the Bayesian method across all variant categories. The improvement is largest in biparental cases, where the gain in AUC-ROC ranges from 0.14 to 0.19 (Table 1). Biparental variants are typically overlooked in previous studies because they are difficult to analyze.
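The reported range can be reproduced directly from Table 1. The snippet below is a minimal sketch that copies the AUC-ROC columns of the table and subtracts the Hoobari value from the DeepHoobari value for each test sample; the dictionary layout and the function name `auc_gains` are illustrative choices, not part of the published software.

```python
# AUC-ROC values copied from Table 1 as (Hoobari, DeepHoobari) pairs.
AUC_ROC = {
    "TST01 (7 weeks)": {
        "Paternal": (0.9760, 0.9994),
        "Maternal": (0.7738, 0.8025),
        "Both parents": (0.5749, 0.7189),
    },
    "TST01 (19 weeks)": {
        "Paternal": (0.9884, 0.9997),
        "Maternal": (0.8750, 0.8778),
        "Both parents": (0.6521, 0.8424),
    },
    "TST02": {
        "Paternal": (0.9674, 0.9963),
        "Maternal": (0.6817, 0.7153),
        "Both parents": (0.5102, 0.6562),
    },
    "TST03": {
        "Paternal": (0.9698, 0.9976),
        "Maternal": (0.7299, 0.7601),
        "Both parents": (0.5218, 0.7105),
    },
}


def auc_gains(table, inheritance):
    """Return the DeepHoobari minus Hoobari AUC-ROC difference per sample."""
    gains = {}
    for sample, rows in table.items():
        hoobari, deep_hoobari = rows[inheritance]
        gains[sample] = round(deep_hoobari - hoobari, 4)
    return gains


if __name__ == "__main__":
    for category in ("Paternal", "Maternal", "Both parents"):
        gains = auc_gains(AUC_ROC, category)
        print(category, min(gains.values()), max(gains.values()))
```

For the biparental category the smallest and largest differences are 0.1440 and 0.1903, which matches the 0.14 to 0.19 range quoted above.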

A discrepancy was noticed within family TST01 when the week 19 sample was compared with the week 7 sample: despite its higher FF (10.44% versus 8.90%), the week 19 sample yielded lower AP values in all three inheritance categories (Table 1). This finding suggests that FF, while being the primary predictor of performance, is not the only one. Several factors may explain this result, including subtle differences in sample processing and sequencing procedures between the two time points. These differences might have rendered the TST01_7 sample more similar to the training data, improving model generalization and yielding better results than for TST01_19.

Model performance in low depth of coverage

One advantage of DL-based models over traditional models is their robustness to low-quality data, e.g. samples with low sequencing depth [18]. This robustness can be attributed to hidden noise and bias patterns that are too complex to model manually but can be learned by AI. Our model demonstrates a distinctive robustness to variations in data quality, for instance, in scenarios with low depth of coverage (Fig. 3). While the Bayesian model exhibits a performance decrease under such conditions, the DL model maintains high accuracy. Hence, the advantage of our model becomes more pronounced as the sequencing depth decreases. As shown in Fig. 3, the advantage of the DL-based model lies primarily in its improved specificity. One possible explanation for this is the naïve Bayes independence assumption, which can lead to an overestimation of the importance of individual pieces of mutation-supporting evidence (e.g. read alignments). In contrast, DL-based models, particularly CNNs, evaluate all evidence simultaneously, allowing them to leverage contextual information, such as adjacent reads or neighboring nucleotides on the analyzed DNA sequence. This indicates the model's ability to extract and leverage information from sparse data, a scenario that is especially important in NIPT-M, as further detailed in the discussion section. This robustness is particularly beneficial in early pregnancy, as in the 7-week gestation sample analyzed in our study, where sequencing often faces challenges such as low input and low coverage, especially when using PCR-free methods to minimize amplification-related errors and allelic bias.
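The effect of the independence assumption on specificity can be illustrated with a toy calculation. The sketch below is not the Hoobari model; it is a two-hypothesis naïve Bayes posterior for a paternal-only site, under the assumptions that an inherited paternal allele appears in about FF/2 of cfDNA fragments and that a non-inherited allele appears only at a hypothetical error rate of 0.001. All numbers and names are illustrative.

```python
import math


def naive_bayes_posterior(n_alt, n_ref, p_alt_inherited, p_alt_not_inherited,
                          prior_inherited=0.5):
    """Posterior P(allele inherited | reads), treating every read as independent."""
    log_inherited = (math.log(prior_inherited)
                     + n_alt * math.log(p_alt_inherited)
                     + n_ref * math.log(1.0 - p_alt_inherited))
    log_not = (math.log(1.0 - prior_inherited)
               + n_alt * math.log(p_alt_not_inherited)
               + n_ref * math.log(1.0 - p_alt_not_inherited))
    # Subtract the larger log-score before exponentiating for numerical stability.
    peak = max(log_inherited, log_not)
    w_inherited = math.exp(log_inherited - peak)
    w_not = math.exp(log_not - peak)
    return w_inherited / (w_inherited + w_not)


FETAL_FRACTION = 0.10                    # hypothetical sample
P_ALT_INHERITED = FETAL_FRACTION / 2     # expected alt fraction if inherited
P_ALT_NOT_INHERITED = 0.001              # hypothetical sequencing error rate

# Three alt-supporting reads counted as three independent observations.
as_independent = naive_bayes_posterior(3, 27, P_ALT_INHERITED, P_ALT_NOT_INHERITED)

# The same three reads if they stem from one shared artifact, i.e. they
# carry only one observation's worth of evidence.
as_one_observation = naive_bayes_posterior(1, 27, P_ALT_INHERITED, P_ALT_NOT_INHERITED)

if __name__ == "__main__":
    print(f"independent reads: {as_independent:.5f}")
    print(f"correlated reads:  {as_one_observation:.5f}")
```

When the three reads are counted independently the posterior exceeds 0.999, whereas counting them as one observation gives roughly 0.93. If the reads are in fact correlated, the independent treatment is overconfident, which is one way a false-positive call can arise.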

Figure 3

Performance of Hoobari (Bayes) versus DeepHoobari (DL) at varying depths of coverage. Paternal, maternal, biparental and all correspond to the heterozygosity of the parental DNA, where either the father, the mother or both parents are heterozygous, or to all categories combined.

Clinical analysis

To demonstrate our approach in a clinical scenario, we predicted the inheritance of disease-causing single-nucleotide mutations detected in the parents of two families (Table 2). In family TST01, both parents were known carriers of the same mutation in the MED17 gene, which causes autosomal recessive (AR) infantile convulsions and paroxysmal choreoathetosis (ICCA) syndrome. The same mutation appearing in both parents is common in consanguineous families and in homogeneous populations with a strong founder effect. Such mutations, called homozygous mutations, are extremely difficult to predict correctly and are typically excluded from genome-wide NIPT-M studies. The fetus in this family was a healthy carrier. In the week 7 sample, both algorithms incorrectly predicted a healthy non-carrier fetus, a tolerable mistake because it is still a negative result. In the week 19 sample, however, the Bayesian algorithm incorrectly predicted an affected fetus, a result that would require validation through an invasive procedure. The DL-based solution correctly predicted the genotype in this sample, which would have prevented an unnecessary procedure. Moreover, in the week 7 sample, the probability assigned by the DL solution to the non-carrier genotype was lower than the one assigned by the Bayesian approach. In clinical scenarios, a low-probability prediction is treated as a no-call rather than as a wrong prediction.
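The no-call convention mentioned above can be sketched as a simple threshold on the predicted genotype probabilities. The threshold value, the genotype labels and the example probabilities below are hypothetical; the study does not state the cut-off used in practice.

```python
def call_genotype(probabilities, min_confidence=0.9):
    """Return the most probable genotype, or "no-call" if it is not confident enough.

    probabilities: dict mapping genotype label -> predicted probability.
    min_confidence: hypothetical clinical threshold, not taken from the study.
    """
    genotype, probability = max(probabilities.items(), key=lambda item: item[1])
    if probability < min_confidence:
        return "no-call"
    return genotype


# Hypothetical model outputs for illustration only.
confident = {"non-carrier": 0.03, "carrier": 0.95, "affected": 0.02}
uncertain = {"non-carrier": 0.55, "carrier": 0.40, "affected": 0.05}

if __name__ == "__main__":
    print(call_genotype(confident))
    print(call_genotype(uncertain))
```

Under this convention, a wrong genotype predicted with low probability is reported as a no-call, so a model that is less confident when it is wrong produces fewer erroneous clinical reports.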

Table 2

Clinical findings in test families.

Family	Family TST01	Family TST03
Phenotype	Infantile Convulsions and Paroxysmal Choreoathetosis (ICCA) Syndrome	Glutaric Acidemia Type IIC
Gene	MED17	ETFDH
Mutation^a	chr11:93796509 T > C	Maternal – chr4:158706356 A > C; Paternal – chr4:158708525 T > G
Mother	Carrier	Carrier
Father	Carrier	Carrier
Mode of Inheritance	AR^b (homozygous mutation)	AR (compound heterozygous)
Offspring (ground truth)	Healthy^c (carrier)	Affected
Bayesian prediction	GW^d 7 – Healthy (non-carrier); GW 19 – Affected	Affected
DL prediction	GW 7 – Healthy (non-carrier); GW 19 – Healthy (carrier)	Affected

^aReference genome build – GRCh38.

^bAR – autosomal recessive.

^cIn clinical terms, “healthy” and “affected” are reported as low-risk and high-risk results, respectively.

^dGW – gestational week.

In family TST03, the parents were carriers of two different mutations in the ETFDH gene, which cause AR glutaric acidemia type IIC. This condition typically manifests with facial deformities, enlarged liver, hypoglycemia, acidosis, muscle weakness and heart conditions [19]. This inheritance mode is termed compound heterozygosity, sometimes referred to as heterozygous recessive mutation. Predicting inheritance of the paternal mutation is straightforward, which makes it possible to rule out the disease when the paternal mutation is absent. Predicting inheritance of the maternal mutation, however, is bioinformatically challenging, which makes it difficult to predict a high risk for an affected fetus. In this case, both the DL and Bayesian approaches correctly predicted an affected fetus.
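The logic of the compound heterozygous case can be summarized in a few lines: the fetus is affected only if it inherits both the maternal and the paternal mutation, so a confident negative call on the paternal mutation alone rules out the disease. The function name and the result labels below are illustrative and follow the wording of Table 2.

```python
def classify_compound_heterozygous(inherits_maternal, inherits_paternal):
    """Combine two per-mutation inheritance calls into a clinical status.

    Assumes an autosomal recessive disease in which each parent carries a
    different mutation in the same gene, as in family TST03.
    """
    if inherits_maternal and inherits_paternal:
        return "affected"
    if inherits_maternal or inherits_paternal:
        return "healthy (carrier)"
    return "healthy (non-carrier)"


if __name__ == "__main__":
    for maternal in (False, True):
        for paternal in (False, True):
            status = classify_compound_heterozygous(maternal, paternal)
            print(f"maternal={maternal}, paternal={paternal}: {status}")
```

Whenever the paternal mutation is not inherited, the outcome is healthy regardless of the harder maternal call; the maternal call only becomes decisive once the paternal mutation is present.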

Discussion

This study introduced a DL-based model advancing NIPT-M, demonstrating noninvasive prediction of fetal inheritance with notable accuracy. The model outperformed existing standards across inheritance categories, especially in the most challenging scenario of dual heterozygosity. Utilizing ultra-deep WGS of cfDNA from a dataset of 10 family trios, each contributing millions of genetic variants, the approach processed extensive genetic variant data, highlighting its comprehensiveness. For comparison, the first attempts at DL variant calling and non-DL fetal genotyping involved a single genome [7]. The model accurately assessed mutation inheritance in all test cases involving parental carriers of deleterious SNVs, and was effective as early as the 7th week of gestation, the earliest week in which genome-wide NIPT-M has been performed so far.

Our method is the first instance of a DL model being applied to NIPT-M, and one of a handful that apply DL to NIPT in general. Its success indicates potential broader uses in variant calling, pushing genomic research boundaries. This is particularly relevant in tumor liquid biopsy, where DL models could offer notable advantages. DeepHoobari, unlike previous models such as Hoobari, incorporates complex relationships between DNA fragments within a unified tensor, enabling contextual analysis not possible with Bayesian models that treat fragments as independent variables. Another example of the potential of a DL model was shown in cases of sparse data, e.g. low FF in early pregnancy stages, where it maintains robust performance. Such capability in both high and low FF scenarios distinguishes our model from recent methods, especially in early pregnancy stages where accurate diagnostics are crucial for actionable clinical outcomes.

Our model incorporates novel aspects in design and data representation, diverging from previous genomics DL methods due to unique data characteristics. For example, full nucleotide sequencing of each fragment, despite providing raw data which can contribute hidden features, offered limited benefits for its high computational demand, leading us to more efficient data integration strategies that minimized redundancy, utilizing diverse data sources. This approach, prompted by data complexity and resource limitations, may have broader implications for variant calling. For instance, our use of trio context data prefigured similar developments [20], affirming the validity of our method. Additionally, we integrated fragmentomics, studying cfDNA fragment distinctions, unlike prior DL variant calling models that overlooked such features due to their development using lab-induced, artificial DNA fragmentation. Incorporating fragment length and per-fragment FF, as initially proposed in Hoobari, our model uniquely applies these fragmentomics insights, enhancing its applicability and efficiency in genomics.

This study represents a notable leap in non-invasive fetal variant prediction, though it is crucial to note its limitations. The research primarily focused on SNVs, with small indels presenting a future research opportunity. Enhancing model explainability is vital for clinical adoption, given the challenges of using "black box" models in medical settings. While our dataset splitting strategy minimized data leakage risk, leakage could still occur if many rare variants are shared across training and test families. Using unrelated families, where low-frequency variants are generally not shared, and focusing on DNA-related features rather than genomic coordinates further reduces this risk. Future studies should aim to minimize shared rare variants to enhance model generalizability. Efforts to boost accuracy, especially for maternally inherited genes and at lower FFs, are ongoing and may benefit from larger training datasets and the integration of haplotype information. While the model demonstrates strong performance at low depths of coverage, further exploration of features related to false-positive and false-negative results would be beneficial. Such an investigation could contribute to the development of improved models. Overall, these enhancements promise to refine the precision and applicability of the model across genetic variations. Ultra-deep WGS limits the applicability of our method: it is costly, requires sufficient genetic material in early pregnancy stages, and increases computational demand, especially when combined with DL methods that require graphics processing units (GPUs). Despite this reliance on ultra-deep WGS, the flexibility of DL methods to adapt to various sequencing data, including WES and newer, more affordable technologies [21], suggests broad applicability.

This study underscores a significant proof-of-concept toward achieving NIPT that is equivalent to invasive tests, heralding a future of risk-free prenatal assessment.

Methods

Familial DNA acquisition

Sample collection and DNA extraction

Ten families were recruited by the Raphael Recanati Genetic Institute at Rabin Medical Center (Supplementary Table S1). Samples from the families were collected during 7–33 (median 12) weeks of gestation with informed consent. Maternal and paternal WGS data were obtained from peripheral blood mononuclear cells (PBMCs), while cfDNA was sequenced from maternal plasma. For fetal genotype validation, DNA was extracted from pure fetal tissue obtained through amniocentesis or CVS. The DNA extraction process from CVS or amniocentesis samples utilized the magLEAD 12gC, MagDEA Dx kit (ExScale, Chiba, Japan). Maternal blood was collected using 2–4 Ethylene-diamine-tetra-acetic acid (EDTA) tubes, followed by plasma separation through centrifugation at room temperature for 10 min at 1600 × g. A subsequent centrifugation step at 16 000 × g for 10 min at room temperature removed any residual cells. The extraction of cfDNA was carried out using the QIAamp Circulating Nucleic Acid Kit (Qiagen). Parental genomic DNA was extracted from PBMCs following a routine protocol, which involved buffy coat separation and DNA purification using the magLEAD 12gC, MagDEA Dx kit (ExScale, Chiba, Japan), in accordance with the manufacturer’s instructions. The collection and purification of pure paternal DNA followed a similar procedure.

Library preparation and sequencing

WGS libraries for the aforementioned samples were prepared using the TruSeq DNA PCR-Free Library Prep Kit (Illumina) for genomic DNA and the Accel-NGS 2S PCR-free Library Prep Kit for cfDNA samples, following the manufacturers' guidelines. Subsequently, NovaSeq (Illumina) sequencing was employed, generating 150 bp paired-end reads for all DNA samples within each family. Sequencing targeted a depth of 30× for parental and fetal genomic DNA libraries and 300× for cfDNA.
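A back-of-the-envelope calculation shows why the cfDNA target depth is so much higher than the genomic one. The sketch below assumes a site where the fetus is heterozygous for a paternal allele that the mother lacks, so that roughly FF/2 of the cfDNA fragments covering the site carry that allele; the function name is illustrative, and the FF values are taken from Table 1.

```python
def expected_paternal_allele_reads(depth, fetal_fraction):
    """Expected number of reads carrying a paternally inherited allele.

    Assumes the fetus is heterozygous and the mother is homozygous for the
    other allele, so the allele appears in about fetal_fraction / 2 of reads.
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must be between 0 and 1")
    return depth * fetal_fraction / 2.0


if __name__ == "__main__":
    # Fetal fractions of the test samples as listed in Table 1.
    for label, ff in [("TST02", 0.0702), ("TST01, week 7", 0.0890),
                      ("TST01, week 19", 0.1044), ("TST03", 0.1518)]:
        deep = expected_paternal_allele_reads(300, ff)
        shallow = expected_paternal_allele_reads(30, ff)
        print(f"{label}: about {deep:.1f} reads at 300x, {shallow:.1f} at 30x")
```

At 300× and an FF of 8.90%, about 13 reads are expected to carry the paternal allele, whereas at 30× the expectation is close to a single read, which is hard to distinguish from a sequencing error.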

Variant candidate computational representation

Following the splitting of the datasets and the sampling process, variants were transformed into tensors. This process involved integrating information from various sources, including the maternal and paternal genomic DNA, and plasma cfDNA. It also involved combining different levels of data, from sparse to high-resolution: (i) sample-level features, e.g. the fetal fraction; (ii) genomic site-level features, e.g. the parental genotype, the chromosome and position; (iii) fragment-level features, such as the quality of mapping to the reference genome; and (iv) nucleotide-level features, like the supported allele. All information was then embedded into a feature matrix for each fetal variant candidate, as illustrated in Supplementary Fig. S1. Features were calculated mainly using in-house code, and partly using off-the-shelf tools, such as SAMTools [22].
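As an illustration of how the four feature levels can share one matrix, the sketch below stacks one row per cfDNA fragment and repeats the site-level and sample-level values on every row. The actual layout is the one shown in Supplementary Fig. S1, which is not reproduced here; the column order, the genotype encoding (number of alternative alleles), the padding scheme and all names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    supports_alt: bool    # nucleotide-level: allele supported at the site
    mapping_quality: int  # fragment-level
    length: int           # fragment-level (fragmentomics)


N_FEATURES = 6  # hypothetical column count for this sketch


def build_feature_matrix(fragments: List[Fragment], maternal_genotype: int,
                         paternal_genotype: int, fetal_fraction: float,
                         max_fragments: int = 4) -> List[List[float]]:
    """Return a max_fragments x N_FEATURES matrix for one variant candidate."""
    rows = []
    for fragment in fragments[:max_fragments]:
        rows.append([
            float(fragment.supports_alt),
            float(fragment.mapping_quality),
            float(fragment.length),
            float(maternal_genotype),   # site-level, repeated per row
            float(paternal_genotype),   # site-level, repeated per row
            float(fetal_fraction),      # sample-level, repeated per row
        ])
    # Pad with zero rows so every candidate has the same shape.
    while len(rows) < max_fragments:
        rows.append([0.0] * N_FEATURES)
    return rows


if __name__ == "__main__":
    example = [Fragment(True, 60, 143), Fragment(False, 60, 166)]
    for row in build_feature_matrix(example, maternal_genotype=0,
                                    paternal_genotype=1, fetal_fraction=0.089):
        print(row)
```

A fixed shape per candidate is what allows image-style architectures such as CNNs to consume the matrices in batches.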

Model development

To develop an accurate, generalizable and robust model, we followed a systematic methodology (see Supplementary Material). We explored various model architectures and data representations, training on the designated training set, evaluating on the validation set, and iteratively refining our approach. Compared architectures included recurrent neural networks (RNNs) such as Long Short-Term Memory (LSTM), different convolutional neural networks (CNNs), various attention solutions and ensembles of several sub-networks. ResNet34 [23] performed best on this problem.

We then compared our architecture to Hoobari using the same data as in Rabinowitz et al., 2019 [13]. Model development initially relied on three families (G1, G2, G5) described in the same study. However, each of these families was sequenced using different machine types, protocols, and techniques (e.g. read lengths in G1–2 were 75 bp, compared with 151 bp in G5), which led to challenges in uniformity and comparability. Models trained on G1–2 performed better but failed to generalize to G5 due to technical discrepancies. Alternatively, training on single families also introduced bias and noise. Ultimately, the study focused on 10 new samples for consistency, with G1, G2, and G5 used solely during the early stages of model development, as presented in the supplementary material (Supplemental Table S4).

An important part of our training process included optimizing hyperparameters, with the widely used Optuna Hyperparameter Optimization Framework [24] for the generation of multiple trials. In a series of experiments, we fine-tuned the number of training steps and value ranges for the hyperparameters. In these experiments we iteratively increased the number of training steps while decreasing the number of trials and value ranges, until the evaluation loss plateaued.

The ResNet34 model was then trained using our training set, until we observed a plateau in the training loss and a characteristic curvature in the validation loss. The final model was then tested once on the unseen test set.

Datasets

To prevent overfitting of the model to our data, extra precautions were taken. Information (or data) leakage was avoided through several strategies, such as using different families and even different chromosomes for each dataset: the training set contained variants from several families, consisting of all chromosomes excluding chromosomes 5 and 20, and the validation set consisted of variants from chromosome 5 of families that were not included in the training set (Supplementary Table S1). A third set, the test set, consisted of chromosome 20 variants from families excluded from both the training and validation sets (Supplementary Table S1). The test set was withheld until the study's conclusion and tested only once with the final model. Excluding entire chromosomes from the training data prevented overlapping reads or read pairs between adjacent variants, thus avoiding data leakage. Chromosome 20 was chosen for testing, as is commonly practiced in genomics, because it is small, less prone to structural variation than chromosomes 21 and 22, and has high gene and variant density, making it a robust testing choice.
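The split described above amounts to a rule that looks at both the family and the chromosome of each variant. A minimal sketch follows; the test family identifiers are those of Table 1, while the training and validation identifiers (`TRN01`, `VAL01`, and so on) are placeholders, since the real assignment is given in Supplementary Table S1.

```python
VALIDATION_CHROMOSOME = "chr5"
TEST_CHROMOSOME = "chr20"

# Placeholder family IDs for illustration; only the test IDs appear in Table 1.
TRAIN_FAMILIES = {"TRN01", "TRN02", "TRN03"}
VALIDATION_FAMILIES = {"VAL01"}
TEST_FAMILIES = {"TST01", "TST02", "TST03"}


def assign_split(family, chromosome):
    """Return "train", "validation", "test", or None if the variant is unused."""
    held_out = (VALIDATION_CHROMOSOME, TEST_CHROMOSOME)
    if family in TRAIN_FAMILIES and chromosome not in held_out:
        return "train"
    if family in VALIDATION_FAMILIES and chromosome == VALIDATION_CHROMOSOME:
        return "validation"
    if family in TEST_FAMILIES and chromosome == TEST_CHROMOSOME:
        return "test"
    return None


if __name__ == "__main__":
    for family, chromosome in [("TRN01", "chr1"), ("TRN01", "chr20"),
                               ("VAL01", "chr5"), ("TST02", "chr20"),
                               ("TST02", "chr1")]:
        print(family, chromosome, assign_split(family, chromosome))
```

Because a variant must match on both family and chromosome, no family and no chromosome contributes to more than one set, which is the property the leakage argument relies on.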

After splitting the data, sets of millions of genetic variants per family were created. This was performed by randomly selecting candidate variants from the relevant chromosomes within each family, i.e. genomic loci in which either the father or the mother has a genetic variant that the fetus might inherit. Datasets were then filtered to keep only variants with sufficient data to predict upon, using previously described criteria [4], thus making sure that our results could be directly compared with existing standards, and guaranteeing a consistent evaluation process.

Key Points

Noninvasive prenatal testing for point mutations causing monogenic diseases (NIPT-M) remains an unsolved challenge, restricted to late pregnancy stages where clinical value is reduced.
Previously, we framed this challenge as a specialized variant calling problem, addressing it with a Bayesian-based algorithm integrated with machine learning (ML) methods. Other methods published since then have followed this approach, even though standard genotyping of individual genomes has advanced through deep learning (DL) algorithms, such as Google's DeepVariant.
This study presents the first DL-based framework for cfDNA-based genotyping, effectively leveraging the ultra-deep whole genome sequencing (WGS) data that is typical to NIPT-M. It also accounts for fragmentomics information by learning the nuanced differences between fetal and maternal DNA fragments. These innovations can also enhance liquid biopsy techniques and advance other DL algorithms in genomics.
Using this novel approach, we outperform current methods, detecting three deleterious mutations and enabling NIPT-M as early as the 7th week of gestation.
Our method brings genome-wide NIPT for all mutation types closer to clinical implementation, enabling families and healthcare providers to make more informed decisions while reducing the uncertainties and anxieties of pregnancy.

Acknowledgments

We would like to thank Identifai-Genetics Ltd. for the access to their advanced cloud infrastructure that enabled us to train and test our model over large amounts of data. We also thank Prof. Lior Wolf who aided with the initial stages of the model development process; Meitar Grad for her assistance with sample processing; Dr. Noa Liscovitch-Brauer, Ravit Mesika and Itamar Tsayag for their valuable input; and Psomagen, Inc. for the sequencing process. This study was supported in part by a fellowship from the Edmond J. Safra Center for Bioinformatics at Tel-Aviv University. Some icons in Fig. 1 (pregnant couple; clinical report) were created using ChatGPT.

Conflict of interest: T.R. is a shareholder, and N.S. is an employee and shareholder in Identifai-Genetics Ltd.

Funding

None declared.

Data availability

The data, source code and model weights that support the findings of this study are available upon reasonable request.

Author contributions

T.R. and N.S. designed the research study; R.T.M. and L.B.S. recruited relevant participants and collected samples; T.R. defined the datasets and experimental design; Y.S. and T.R. developed the algorithm; Y.S. developed the software, conducted the training and test processes, and optimized the method; T.R. performed the clinical and depth analyses; T.R., Y.S. and N.S. wrote the manuscript; All authors read and approved the final manuscript.

Ethics declaration

Human genetic investigations were conducted following the guidelines of the primary institutional ethics committee at Rabin Medical Center, Beilinson Hospital, Petah Tikva, Israel (approval number 0825–15-RMC, granted on 12 July 2016), and was approved by The Israel Ministry of Health's Clinical Trials Department, under the national reference number 920160014. Informed consent was obtained and archived from all participants, and any clinical data has been de-identified.

References

1. Fan, Blumenfeld, Chitkara et al. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood. Proc Natl Acad Sci USA 2008;105:16266–. doi:10.1073/pnas.0808319105

2. Lo YMD, Lun FMF, Chan KCA et al. Digital PCR for the molecular detection of fetal chromosomal aneuploidy. PNAS 2007;104:13116–. doi:10.1073/pnas.0705765104

3. Jensen, Dzakula, Deciu et al. Detection of microdeletion 22q11.2 in a fetus by next-generation sequencing of maternal plasma. Clin Chem 2012:1148–. doi:10.1373/clinchem.2011.180794

4. Barrett, McDonnell TCR, Chan KCA et al. Digital PCR analysis of maternal plasma for noninvasive detection of sickle cell anemia. Clin Chem 2012:1026–. doi:10.1373/clinchem.2011.178939

5. Saito, Sekizawa, Morimoto et al. Prenatal DNA diagnosis of a single-gene disorder from maternal plasma. The Lancet 2000;356:1170. doi:10.1016/S0140-6736(00)02767-7

6. Rabinowitz, Shomron. Genome-wide noninvasive prenatal diagnosis of monogenic disorders: current and future trends. Comput Struct Biotechnol J 2020:2463–. doi:10.1016/j.csbj.2020.09.003

7. Lo YMD, Chan KCA, Sun et al. Maternal plasma DNA sequencing reveals the genome-wide genetic and mutational profile of the fetus. Sci Transl Med 2010:61ra91–. doi:10.1126/scitranslmed.3001720

8. Fan, Wang et al. Noninvasive prenatal measurement of the fetal genome. Nature 2012;487:320–.

9. Kitzman, Snyder, Ventura et al. Non-invasive whole genome sequencing of a human fetus. Sci Transl Med 2012:137ra76. doi:10.1126/scitranslmed.3004323

10. Chan KCA, Jiang P, Sun K et al. Second generation noninvasive fetal genome analysis reveals de novo mutations, single-base parental inheritance, and preferred DNA ends. Proc Natl Acad Sci USA 2016;113:E8159–68. doi:10.1073/pnas.1615800113

11. Brand, Whelan, Duyzend et al. High-resolution and noninvasive fetal exome screening. N Engl J Med 2023;389:2014–.

12. Miceikaitė, Hao, Brasch-Andersen et al. Comprehensive noninvasive fetal screening by deep trio-exome sequencing. N Engl J Med 2023;389:2017–.

13. Rabinowitz T, Polsky A, Golan D et al. Bayesian-based noninvasive prenatal diagnosis of single-gene disorders. Genome Res 2019:428–38. doi:10.1101/gr.235796.118

14. Rabinowitz, Deri-Rozov, Shomron. Improved noninvasive fetal variant calling using standardized benchmarking approaches. Comput Struct Biotechnol J 2021:509–. doi:10.1016/j.csbj.2020.12.032

15. Jumper, Evans, Pritzel et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021;596:583–. doi:10.1038/s41586-021-03819-2

16. McKenna, Hanna, Banks et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010:1297–303. doi:10.1101/gr.107524.110

17. Poplin R, Chang PC, Alexander D et al. A universal SNP and small-indel variant caller using deep neural networks. Nat Biotechnol 2018:983–87. doi:10.1038/nbt.4235

18. Chang P-C. Twenty is the new thirty - comparing current and historical WGS accuracy across coverage. DeepVariant Blog 2019. https://google.github.io/deepvariant/posts/2019-09-10-twenty-is-the-new-thirty-comparing-current-and-historical-wgs-accuracy-across-coverage/

19. Angle, Burton. Risk of sudden death and acute life-threatening events in patients with glutaric acidemia type II. Mol Genet Metab 2008. doi:10.1016/j.ymgme.2007.09.015

20. DeepTrio: variant calling in families using deep learning. bioRxiv. 2021 Apr 5;2021.04.05.438434v1. doi:10.1101/2021.04.05.438434