Abstract

Motivation

Recommendations on the use of genomics for pathogen surveillance show that high-throughput genomic sequencing plays a key role in fighting global health threats. Coupled with bioinformatics and other data types (e.g., epidemiological information), genomics is used to gain knowledge about pathogenic health threats and insights into their evolution, to monitor pathogen spread, and to evaluate the effectiveness of countermeasures. From a policy decision-making perspective, it is essential to ensure the quality of the entire process before relying on analysis results as evidence. Available workflows usually offer quality assessment tools that focus primarily on the quality of raw NGS reads, but they often struggle to keep pace with new technologies and threats and fail to provide a robust consensus on results, necessitating manual evaluation of multiple tool outputs.

Results

We present PathoSeq-QC, a bioinformatics decision support workflow developed to improve the trustworthiness of genomic surveillance analyses and conclusions. Designed for SARS-CoV-2, it is suitable for any viral threat. In the specific case of SARS-CoV-2, PathoSeq-QC: (i) evaluates the quality of the raw data; (ii) assesses whether the analysed sample is composed of single or multiple lineages; (iii) produces robust variant calling results via multi-tool comparison; (iv) reports whether the produced data support a recombinant virus, a novel lineage or an already known lineage. The tool is modular, which allows its functionality to be extended easily.

Availability and implementation

PathoSeq-QC is a command-line tool written in Python and R. The code is available at https://code.europa.eu/dighealth/pathoseq-qc.

1 Introduction

Technology development in recent decades has enabled the large-scale production of omics data (Dai and Shen 2022). This has played a crucial role in the rapid advancement of comprehensive analysis and identification of complex biological patterns. The importance of next-generation sequencing (NGS) and genomics became particularly evident during the coronavirus disease 2019 (COVID-19) pandemic, when they provided evidence in support of decision-making and of vaccine and treatment development (Cen et al. 2023). NGS was instrumental in quickly obtaining raw genomic sequences from SARS-CoV-2, the virus that causes the disease, which were in turn processed by bioinformatics tools to reconstruct (i.e., assemble) the final sequence of the virus (Wu et al. 2020, Zhou et al. 2020, Mercer and Salit 2021). Eventually, the global sharing of these genomic data marked a crucial milestone in combating the pandemic (Harrison et al. 2021, Khare et al. 2021).

One of the most concerning risks when genomic data must be produced and shared urgently is that data quality might not always be guaranteed (WHO 2022), or that rigorous quality assessments are not consistently implemented or, when they are, are not made publicly available (Jacot et al. 2021). This lack of assurance hinders the trustworthiness and comparison of downstream analyses (Zufan et al. 2023). For example, in the case of SARS-CoV-2, the consensus genomic sequences of lineages and variants of concern (VOCs) are defined according to the nucleotide mutations identified, via a variant calling process, with respect to the Wuhan reference sequence. In addition, identified mutations and VOCs can be used in epidemiological modelling to understand the spread of infectious disease through populations. Moreover, this variant calling process depends on the chosen bioinformatics software and corresponding parameters (Garcia-Prieto et al. 2022), and on the quality of the input raw genomic data. Consequently, in decision-making contexts based on scientific evidence, such as policymaking and public health, it is of utmost importance to accurately assess and share the quality of the genomic data used for genomic surveillance of pathogens with pandemic and epidemic potential (WHO 2023).

A wide array of tools exists that are specifically designed to assess the quality of raw genomic reads [for a representative list see (Expósito et al. 2020)]. Additionally, many attempts have been made to create SARS-CoV-2 analysis pipelines that incorporate raw read quality control (QC) evaluations, such as the widely used per-base sequence quality, per-sequence quality score and sequence length distribution (Oliveira et al. 2022, Jalal et al. 2023). However, while these tools efficiently determine the quality of raw read data, only a subset of workflows investigate the quality of the pipeline’s results and evaluate the sample’s genomic homogeneity, which might strongly influence critical analysis steps, such as variant calling and, therefore, lineage assignment. Moreover, pipelines usually focus on lineage assignment only and do not extend their functionality to additional analyses, such as the identification of subgenomic RNAs (sgRNAs), which are hypothesized to provide indications of active viral replication (Chen et al. 2022). On the other hand, tools that perform specific analyses often do not implement sufficient raw data QC evaluation, as in the case of the Periscope tool for sgRNA identification (Parker et al. 2021). Raw data QC is ultimately the responsibility of the researcher and should be conducted at the initial stages, before analysis with these specific bioinformatics tools. Beyond these limitations, there is a pressing need for multifunctional, integrated tools that can accommodate emerging technology requirements and novel threats, rather than relying on the current practice of manually evaluating multiple tool outputs to reach a consensus on results.

To address the limitations and needs outlined above, we introduce PathoSeq-QC, a command-line bioinformatics pipeline developed in Python and R and structured as a sequential workflow that comprehensively analyses raw NGS data in FASTQ format. PathoSeq-QC is highly customizable and leverages up to 15 different tools covering, among other steps, raw read QC evaluation, read filtering, read mapping, BAM preprocessing, variant calling, variant annotation and lineage assignment, thereby facilitating the analysis of NGS data from pathogenic viruses. Although conceived and developed during the pandemic for SARS-CoV-2, the pipeline is versatile and adaptable to other viral pathogens, and it can evolve as new knowledge becomes available. It has been successfully tested on A(H5N1) and Oropouche viruses. Additionally, it is particularly suitable for large-scale data analysis thanks to its multithreaded architecture and optimized use of computational resources. PathoSeq-QC is accessible at https://code.europa.eu/dighealth/pathoseq-qc.

2 Methods

2.1 PathoSeq-QC strategy

The PathoSeq-QC workflow takes as input raw NGS data from samples in which viral pathogens are expected to be present, such as clinical samples. Supplementary Figure S1 provides the list of steps with the corresponding tools implemented in PathoSeq-QC. Each analysis is a separate module, so new functionality can be added simply by adding further modules. Originally developed for SARS-CoV-2, PathoSeq-QC can also be used with other viral pathogens. For example, we successfully tested it on A(H5N1) (run id: SRR29851702) and Oropouche (run id: SRR14711849) viruses (see Supplementary Fig. S6).

PathoSeq-QC is designed to perform three main tasks on NGS raw data from viral pathogens sequenced with Illumina paired-end technology (either via shotgun or via amplicon-based methods). These three tasks are:

  1. Evaluation of raw data quality. PathoSeq-QC assesses the overall quality of raw sequencing data, including sequence coverage, depth, and quality issues (Petrackova et al. 2019) that may affect downstream analyses. It uses tools like fastp (Chen et al. 2018), samtools depth, and samtools coverage (Danecek et al. 2021) for this purpose. Additionally, PathoSeq-QC corrects for sequencing artefacts using the GATK best practice workflow, including mapping data recalibration and deduplication (DePristo et al. 2011).

  2. Analysis of genomic homogeneity. PathoSeq-QC evaluates the genomic homogeneity of the analysed sample. This is crucial, as any detected heterogeneity can be indicative of contamination and/or co-infections in human samples or, in the case of wastewater samples, a deliberate mixture.

  3. Designation and classification. Pathogen lineage nomenclature, or the designation of epidemiologically distinct groups below the level of species, is essential for effective research, treatment, and communication about diseases. To ensure robust lineage designation, PathoSeq-QC implements three variant callers—GATK HaplotypeCaller (Poplin et al. 2017), LoFreq (Wilm et al. 2012) and iVar (Grubaugh et al. 2019) embedded in Freyja (Karthikeyan et al. 2022)—to increase the confidence of identified mutations.
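As an illustration of the coverage and depth checks in task 1, the following minimal Python sketch summarises per-position depth from text in the three-column, tab-separated layout written by `samtools depth -a` (reference name, position, depth). The function name `summarise_depth`, the toy input and the 20x threshold used for the breadth calculation are assumptions made for this example; it is not the PathoSeq-QC implementation.

```python
import io
from statistics import mean


def summarise_depth(lines, min_depth=20):
    """Summarise per-position depth records.

    Each non-empty line must hold three tab-separated fields:
    reference name, 1-based position, depth.
    `min_depth` is an illustrative threshold, not a PathoSeq-QC default.
    """
    depths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _ref, _pos, depth = line.split("\t")
        depths.append(int(depth))
    if not depths:
        raise ValueError("no depth records found")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "positions": len(depths),
        "mean_depth": mean(depths),
        # Fraction of positions reaching at least `min_depth` reads.
        "breadth_at_min_depth": covered / len(depths),
    }


# Toy input standing in for a real depth file (hypothetical values).
example = "ref\t1\t0\nref\t2\t10\nref\t3\t30\nref\t4\t40\n"
summary = summarise_depth(io.StringIO(example), min_depth=20)
print(summary)
```

In practice the same function could be fed an open file handle containing depth records for a mapped sample; samples whose mean depth or breadth falls below a chosen threshold would then be flagged for review.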

In the case of SARS-CoV-2, the Freyja tool is used in both steps 2 and 3, while Pangolin (O’Toole et al. 2021) and VirStrain (Liao et al. 2022) are implemented to assess whether the viral pathogen can be confidently assigned to a single known subtype (lineage or clade) in step 3. Additionally, a custom procedure is implemented to identify novel (i.e. not yet classified) and recombinant SARS-CoV-2 lineages generated in-silico (described in Supplementary Section S2). For A(H5N1) virus, LABEL (https://wonder.cdc.gov/amd/flu/label/) and GenoFlu (Youk et al. 2023) are used in step 3. PathoSeq-QC also offers a suite of additional analyses, including subgenomic RNA (sgRNA) discovery, variant calling and variant annotation.
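The idea of increasing confidence by comparing several variant callers can be sketched as a simple support count. The snippet below is a minimal illustration under stated assumptions: variants are represented as (position, reference base, alternative base) tuples, the tool names are only dictionary keys, the calls are invented toy data, and the "at least two callers" rule is a hypothetical choice rather than the rule PathoSeq-QC applies.

```python
from collections import Counter


def consensus_variants(calls_by_tool, min_tools=2):
    """Return variants reported by at least `min_tools` callers.

    `calls_by_tool` maps a caller name to an iterable of
    (position, ref, alt) tuples. The result maps each retained
    variant to the number of callers that reported it.
    """
    support = Counter()
    for variants in calls_by_tool.values():
        # Use a set so a caller cannot count twice for the same variant.
        support.update(set(variants))
    return {variant: n for variant, n in support.items() if n >= min_tools}


# Invented toy calls; positions and alleles are not taken from real data.
toy_calls = {
    "gatk": {(101, "A", "G"), (202, "C", "T")},
    "lofreq": {(101, "A", "G")},
    "ivar": {(101, "A", "G"), (202, "C", "T"), (303, "G", "A")},
}

consensus = consensus_variants(toy_calls, min_tools=2)
for variant, n_tools in sorted(consensus.items()):
    print(variant, "supported by", n_tools, "callers")
```

Variants seen by a single caller (here the toy variant at position 303) are not discarded outright in a careful analysis; they are the natural candidates for manual inspection.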

2.2 Installation and dependencies

PathoSeq-QC is written in Python v3.7 and R v4.2 and is designed for Unix-like operating systems (OS) (tested on Ubuntu-20.04.4 LTS). It leverages multithreading and is compatible with High Performance Computing (HPC) infrastructures [tested on the Joint Research Centre Big Data Analytics Platform—BDAP—infrastructure based on HTCondor (Erickson et al. 2018)].

Dependencies and software are managed with conda and installed in separate environments to ensure precise version tracking and reproducibility and to prevent dependency issues or software incompatibilities. Additionally, this approach allows for high modularization of the tool, as presented in Supplementary Fig. S1. Installation is automated via a script, and full instructions on how to install and run PathoSeq-QC, an explanation of input and output formats, and examples are provided in the README file included in the repository.

2.3 Performance and flexibility

Details of PathoSeq-QC's flexibility, performance, and computational resource use and efficiency are provided as Supplementary Material (Supplementary Sections S3–S6).

2.4 Use case dataset

Raw NGS read data for the representative SARS-CoV-2 sequences were obtained from the COVID-19 Data Portal (https://www.covid19dataportal.org/search/sequences?crossReferencesOption=all&overrideDefaultDomain=true&db=representative-sequences&size=15).

3 Results

3.1 Data flexibility

PathoSeq-QC was tested on shotgun and amplicon in-silico datasets to assess its performance. The details of testing are provided as Supplementary Material (Supplementary Section S4). Overall, our tool showed good performance scores (mean values of precision: 0.91, accuracy: 0.85, recall: 0.78, F-score: 0.86 for shotgun in-silico NGS data). We also benchmarked PathoSeq-QC against V-pipe 3.0 (Fuhrmann et al. 2024), one of the most recently published computational pipelines designed for analysing NGS data of short viral genomes, using high-coverage (>1000x) NGS data. Our tool showed slightly higher performance than V-pipe 3.0, though the difference was not statistically significant (Supplementary Fig. S5). Importantly, PathoSeq-QC includes downstream lineage designation steps, a feature not available in V-pipe 3.0, making it a more comprehensive alternative for SARS-CoV-2 analyses. Versatility was verified by running PathoSeq-QC on raw data from viral isolate genomes of A(H5N1) and Oropouche viruses. Examples of output files for these viral species are provided in Supplementary Section S6. A detailed guide on how to interpret the output files can be found in the README file in the PathoSeq-QC repository. The datasets used to evaluate performance are all freely available (DOIs in the Supplementary Material).
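For readers less familiar with these scores, the sketch below shows how precision, recall, accuracy and an F-score are conventionally derived from counts of true and false positive and negative calls. It assumes the standard definitions and the balanced F1 form of the F-score; the exact evaluation procedure used for PathoSeq-QC is the one described in Supplementary Section S4, and the counts below are invented for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives (e.g. variant calls compared
    with a known in-silico truth set).
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one observation is required")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total
    # Balanced F-score (F1): harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, not results from the PathoSeq-QC benchmark.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
print(metrics)
```

Note that a mean of per-sample F-scores, as reported for a benchmark over several datasets, is generally not the same number as the F-score computed from mean precision and mean recall.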

3.2 Computational performance evaluation

We tested the computational performance of PathoSeq-QC in two scenarios: a relatively small dataset and a larger one. For the first test, we generated a small in-silico dataset consisting of 40 000 paired-end reads (in FASTQ format) from a true recombinant sequence (SARS-CoV-2 XD lineage). In a desktop Linux environment with 4 CPUs (see Supplementary Section S3), PathoSeq-QC required a maximum of 3 GB of RAM, ran all non-optional modules (e.g. QC, variant calling and lineage assignment steps) and correctly assigned the test dataset to the XD lineage in 5 min (Supplementary Fig. S3). For the second test, which also involved the optional bbmap and Freyja steps, we used a larger dataset consisting of 472 109 paired-end reads. This time, PathoSeq-QC took approximately 14 min on the same Linux machine with a total of 10 CPUs.

3.3 Use case

We evaluated PathoSeq-QC’s ability to identify quality deficiencies in publicly available SARS-CoV-2 datasets, specifically those in the COVID-19 Data Portal (Harrison et al. 2021). We analysed the representative lineage sequences dataset (see Section 2). Out of 224 samples, only 86 had raw data available (in FASTQ format). Although read quality was not an issue, 38 samples (44%) had low mean sequence coverage (<20x) and 39 samples (45%) showed signs of high genomic heterogeneity (Supplementary Figs S7 and S8). Since genomic heterogeneity might strongly affect variant calling results, we manually inspected PathoSeq-QC’s variant calls for three randomly selected samples (run ids: ERR6136423, ERR6187498, ERR7541889) and compared them with their corresponding assembled sequences on the COVID-19 Data Portal. We observed nine variants with low allele frequencies (AF; Supplementary Table S1), indicating high genomic heterogeneity (Boscolo Bielo et al. 2023). Additionally, five variants were identified by iVar but not by GATK or LoFreq (Supplementary Table S2). To determine whether this was due to sensitivity limitations of the tools or to potential errors in the assembled sequence, we manually inspected the five variants in question. Variant 21987: G-A, in the samples with run ids ERR6136423 and ERR6187498, failed iVar’s internal QCs, suggesting that it may be a technical and/or assembly error. The other four variants (8835: T-C and 25350: C-T for run id: ERR7541889, 24410: G-A for run id: ERR6187498 and 28461: A-G for run id: ERR618798) had low AF values (≤0.5) according to iVar results, supporting the hypothesis of high heterogeneity in these samples. Moreover, iVar and GATK detected variant 29742: G-T, which is present in many VOCs but was poorly supported by read evidence (i.e. low read count). Overall, these results highlight the importance of robust quality assessments in variant calling and lineage assignment.
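The kind of manual inspection described above can be partly mechanised by flagging calls with a low allele frequency or few supporting reads. The following sketch is illustrative only: the `VariantCall` class, the `flag_variant` function and the minimum of 10 supporting reads are assumptions introduced here, the read counts are invented, and only the AF cut-off of 0.5 mirrors the value mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    position: int
    ref: str
    alt: str
    alt_reads: int  # reads supporting the alternative allele
    depth: int      # total reads covering the position

    @property
    def allele_frequency(self):
        # Guard against positions with no coverage.
        return self.alt_reads / self.depth if self.depth else 0.0


def flag_variant(call, max_low_af=0.5, min_alt_reads=10):
    """Return a list of quality flags for one variant call."""
    flags = []
    if call.allele_frequency <= max_low_af:
        flags.append("low_allele_frequency")
    if call.alt_reads < min_alt_reads:
        flags.append("low_read_support")
    return flags


# Invented calls with generic positions; not taken from the use case data.
calls = [
    VariantCall(100, "A", "G", alt_reads=90, depth=100),
    VariantCall(200, "C", "T", alt_reads=30, depth=100),
    VariantCall(300, "G", "T", alt_reads=2, depth=2),
]
report = {call.position: flag_variant(call) for call in calls}
print(report)
```

The third toy call shows why both criteria are useful: a variant can have a high allele frequency and still rest on too few reads to be trusted.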

3.4 Considerations on deduplication

PathoSeq-QC includes a deduplication step, a bioinformatics procedure that removes reads likely originating from the same genetic fragment during sequencing (e.g., due to PCR amplification artefacts). Deduplication can significantly influence variant calling by changing read support counts, ultimately affecting the final set of variants in lineage sequences. To illustrate this, we manually inspected the mapping data of a single sample (run id: ERR7541889) before and after deduplication. We focused on variant 29742: G-T, which was previously detected with insufficient read support. Without deduplication, we observed 171 supporting reads, but only two (1.17%) remained after deduplication, impairing the quality of the called variant (Supplementary Fig. S9).
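To make the effect concrete, the sketch below collapses reads that share identical mapping coordinates and strand, then reports the percentage of reads retained. This is a deliberately simplified stand-in: the dictionary-based read representation and both function names are assumptions for illustration, and dedicated duplicate-marking tools use more refined criteria. The final line only reproduces the arithmetic behind the 171 and 2 supporting reads quoted in the text.

```python
def deduplicate(reads):
    """Keep the first read seen for each (ref, start, end, strand) key.

    `reads` is an iterable of dicts with the keys
    "ref", "start", "end" and "strand" (a simplified read model).
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read["ref"], read["start"], read["end"], read["strand"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


def retained_percent(before, after):
    """Percentage of reads remaining after deduplication."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100.0 * after / before


# Toy reads: three share the same coordinates and strand.
toy_reads = [
    {"ref": "ref", "start": 10, "end": 160, "strand": "+"},
    {"ref": "ref", "start": 10, "end": 160, "strand": "+"},
    {"ref": "ref", "start": 10, "end": 160, "strand": "+"},
    {"ref": "ref", "start": 10, "end": 160, "strand": "-"},
    {"ref": "ref", "start": 42, "end": 192, "strand": "+"},
]
unique_reads = deduplicate(toy_reads)
print(len(toy_reads), "->", len(unique_reads))

# Arithmetic for the supporting-read counts quoted in the text.
print(round(retained_percent(171, 2), 2))
```

When most reads supporting a variant collapse into a handful of unique fragments, the apparent support before deduplication largely reflects amplification rather than independent evidence.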

4 Conclusions

Genomics can provide new insights into pathogenic health threats such as viruses, their evolution (Markov et al. 2023), and their spread. Additionally, genomics can inform decision-making on countermeasures to control outbreaks. However, there is a current need for a unified, best-practices approach to bioinformatics among laboratories working on a common outbreak problem (Foster et al. 2022). This approach also requires publicly available and standardized benchmark data to support accurate and timely outbreak investigation and surveillance (Xiaoli et al. 2022). Recently, Connor and colleagues issued a series of relevant recommendations on this topic (Connor et al. 2024). PathoSeq-QC represents a first attempt to implement these recommendations, strengthening the role of high-throughput genomic sequencing as an integral component of routine pathogen surveillance. By establishing quality criteria for the global sharing of data and results, PathoSeq-QC helps maintain data integrity and mitigate the risk of erroneous conclusions that could influence decision-making in critical scenarios.

Our assessment of PathoSeq-QC performance highlights the importance of quality evaluation at different analysis steps to prevent the discovery of erroneous nucleotide mutations and ensure accurate final genomic sequences (Supplementary Fig. S9). For example, we identified samples with low genomic homogeneity, which might strongly impact downstream analyses. These samples may have been contaminated during sampling or preprocessing of clinical samples, making our findings particularly relevant given the dataset’s intended use as a reference for SARS-CoV-2 lineages.

PathoSeq-QC's adaptable architecture represents a step forward in promoting the adoption of reliable and robust bioinformatics pipelines for effective pathogen genomic surveillance on a global scale. An added value of PathoSeq-QC is its ability to handle NGS raw data from complex matrices, such as wastewater samples (manuscript in preparation). Thanks to the modular design, we envision a smooth future implementation of this new feature in PathoSeq-QC’s workflow. Furthermore, our next steps will focus on integrating raw data from emerging sequencing technologies (e.g., nanopore) and from various pathogens, including novel health threats, for which we will leverage our work with the A(H5N1) and Oropouche viruses.

Acknowledgements

We would like to express our gratitude to the reviewers for their constructive feedback, suggestions, and comments, which helped refine our manuscript. We also acknowledge the support of the European Commission's Joint Research Centre (JRC), where this research was conducted. We are grateful to the originating laboratories where the clinical specimens or virus isolates were first obtained, as well as the submitting laboratories where sequence data were generated and submitted to GISAID and to the International Nucleotide Sequence Database Collaboration (INSDC) platforms.

Author contributions

Gabriele Leoni (Conceptualization [lead], Data curation [lead], Formal analysis [lead], Methodology [lead], Software [lead], Validation [lead]), Mauro Petrillo (Conceptualization [equal], Data curation [supporting], Formal analysis [supporting], Investigation [equal], Methodology [equal], Resources [equal], Validation [supporting]), Victoria Ruiz-Serra (Software [supporting]), Maddalena Querci (Conceptualization [supporting], Funding acquisition [supporting], Project administration [supporting], Supervision [supporting]), Sandra Coecke (Funding acquisition [supporting], Project administration [lead], Supervision [lead]), and Tobias Wiesenthal (Funding acquisition [lead], Project administration [lead], Supervision [lead])

Supplementary data

Supplementary data are available at Bioinformatics online.

Conflict of interest: None declared.

Funding

This research received no external funding.

Data availability

The data underlying this article are available in the article and in its online supplementary material.

References

Boscolo Bielo L, Trapani D, Repetto M et al. Variant allele frequency: a decision-making tool in precision oncology? Trends Cancer 2023;9:1058–68.

Cen X, Wang F, Huang X et al. Towards precision medicine: omics approach for COVID-19. Biosaf Health 2023;5:78–88.

Chen S, Zhou Y, Chen Y et al. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 2018;34:i884–90.

Chen Z, Ng RWY, Lui G et al. Profiling of SARS-CoV-2 subgenomic RNAs in clinical specimens. Microbiol Spectr 2022;10:e00182-22.

Connor R, Shakya M, Yarmosh DA et al. Recommendations for uniform variant calling of SARS-CoV-2 genome sequence across bioinformatic workflows. Viruses 2024;16:430.

Dai X, Shen L. Advances and trends in omics technology development. Front Med 2022;9:911861.

Danecek P, Bonfield JK, Liddle J et al. Twelve years of SAMtools and BCFtools. Gigascience 2021;10:giab008.

DePristo MA, Banks E, Poplin R et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 2011;43:491–8.

Erickson RA, Fienen MN, McCalla SG et al. Wrangling distributed computing for high-throughput environmental science: an introduction to HTCondor. PLoS Comput Biol 2018;14:e1006468.

Expósito RR et al. SeQual: big data tool to perform quality control and data preprocessing of large NGS datasets. IEEE Access 2020;8:146075–84.

Foster CSP, Stelzer-Braid S, Deveson IW et al. Assessment of inter-laboratory differences in SARS-CoV-2 consensus genome assemblies between public health laboratories in Australia. Viruses 2022;14:185.

Fuhrmann L, Jablonski KP, Topolsky I et al. V-pipe 3.0: a sustainable pipeline for within-sample viral genetic diversity estimation. Gigascience 2024;13:giae065.

Garcia-Prieto CA, Martínez-Jiménez F, Valencia A et al. Detection of oncogenic and clinically actionable mutations in cancer genomes critically depends on variant calling tools. Bioinformatics 2022;38:3181–91.

Grubaugh ND, Gangavarapu K, Quick J et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol 2019;20:8.

Harrison PW, Lopez R, Rahman N et al. The COVID-19 data portal: accelerating SARS-CoV-2 and COVID-19 research through rapid open access data sharing. Nucleic Acids Res 2021;49:W619–23.

Jacot D, Pillonel T, Greub G et al. Assessment of SARS-CoV-2 genome sequencing: quality criteria and low-frequency variants. J Clin Microbiol 2021;59:e0094421.

Jalal D, Samir O, Elzayat MG et al. Genomic characterization of SARS-CoV-2 in Egypt: insights into spike protein thermodynamic stability. Front Microbiol 2023;14:1190133.

Karthikeyan S, Levy JI, De Hoff P et al. Wastewater sequencing reveals early cryptic SARS-CoV-2 variant transmission. Nature 2022;609:101–8.

Khare S, Gurry C, Freitas L et al. GISAID’s role in pandemic response. China CDC Wkly 2021;3:1049–51.

Liao H, Cai D, Sun Y et al. VirStrain: a strain identification tool for RNA viruses. Genome Biol 2022;23:38.

Markov PV, Ghafari M, Beer M et al. The evolution of SARS-CoV-2. Nat Rev Microbiol 2023;21:361–79.

Mercer TR, Salit M. Testing at scale during the COVID-19 pandemic. Nat Rev Genet 2021;22:415–26.

Oliveira RRM, Costa Negri T, Nunes G et al. PipeCoV: a pipeline for SARS-CoV-2 genome assembly, annotation and variant identification. PeerJ 2022;10:e13300.

O’Toole Á et al. Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evol 2021;7:veab064.

Parker MD, Lindsey BB, Leary S et al.; COVID-19 Genomics UK (COG-UK) Consortium. Subgenomic RNA identification in SARS-CoV-2 genomic sequencing data. Genome Res 2021;31:645–58.

Petrackova A, Vasinek M, Sedlarikova L et al. Standardization of sequencing coverage depth in NGS: recommendation for detection of clonal and subclonal mutations in cancer diagnostics. Front Oncol 2019;9:851.

Poplin R, Ruano-Rubio V, DePristo MA et al. Scaling accurate genetic variant discovery to tens of thousands of samples. BioRxiv, 2017, 201178.

WHO. Guiding Principles for Pathogen Genome Data Sharing. Geneva: World Health Organization; 2022.

WHO. Global Genomic Surveillance Strategy for Pathogens with Pandemic and Epidemic Potential 2022–2032: Progress Report on the First Year of Implementation. Geneva: World Health Organization; 2023.

Wilm A, Aw PPK, Bertrand D et al. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res 2012;40:11189–201.

Wu F, Zhao S, Yu B et al. A new coronavirus associated with human respiratory disease in China. Nature 2020;579:265–9.

Xiaoli L, Hagey JV, Park DJ et al. Benchmark datasets for SARS-CoV-2 surveillance bioinformatics. PeerJ 2022;10:e13821.

Youk S, Torchetti MK, Lantz K et al. H5N1 highly pathogenic avian influenza clade 2.3.4.4b in wild and domestic birds: introductions into the United States and reassortments, December 2021–April 2022. Virology 2023;587:109860.

Zhou P, Yang X-L, Wang X-G et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 2020;579:270–3.

Zufan SE, Lau KA, Donald A et al. Bioinformatic investigation of discordant sequence data for SARS-CoV-2: insights for robust genomic analysis during pandemic surveillance. Microb Genom 2023;9:001146.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Can Alkan