An amplicon-based protocol for whole-genome sequencing of human respiratory syncytial virus subgroup A

Abstract

It is convenient to study complete genome sequences of human respiratory syncytial virus (hRSV) for ongoing genomic characterization and identification of highly transmissible or pathogenic variants. Whole genome sequencing of hRSV has been challenging from respiratory tract specimens with low viral loads. Herein, we describe an amplicon-based protocol for whole genome sequencing of hRSV subgroup A validated with 24 isolates from nasopharyngeal swabs and infected cell cultures, which showed cycle threshold (Ct) values ranging from 10 to 31, as determined by quantitative reverse-transcription polymerase chain reaction. MinION nanopore generated 3200 to 5400 reads per sample to sequence over 93% of the hRSV-A genome. Coverage of each contig ranged from 130× to 200×. Samples with Ct values of 20.9, 25.2, 27.1, 27.7, 28.2, 28.8, and 29.6 led to the sequencing of over 99.0% of the virus genome, indicating high genome coverage even at high Ct values. This protocol enables the identification of hRSV subgroup A genotypes, as primers were designed to target highly conserved regions. Consequently, it holds potential for application in molecular epidemiology and surveillance of this hRSV subgroup.

Introduction

Human respiratory syncytial virus (hRSV) is an enveloped single-stranded negative sense RNA virus that belongs to the Pneumoviridae family, genus Orthopneumovirus [1]. Since its isolation from human infants with respiratory illness in 1956 [2, 3], it has been recognized as a relevant pathogen for young children. In 2019, the global burden of acute lower respiratory infections in infants aged 0–60 months was estimated to be 33 million, with 3.6 million hospital admissions and 26 300 in-hospital deaths. Notably, 50% of deaths occurred in the group aged 0–6 months [4]. hRSV infection in the first year of life increases significatively the risk to develop asthma-like symptoms; however, the underlying mechanisms are still not clear but probably include virological and host-related factors [5]. Immunocompromised subjects and the elderly are also highly affected by hRSV, although a vaccine has been recently approved for the latter group [6–9].

Virus genome size ranges from 15 191 to 15 227 nucleotides [10] and includes 10 genes (NS1, NS2, N, P, M, SH, G, F, M2, and L) encoding 11 proteins, since the M2 gene has two open reading frames. Antigenic variability initially evaluated with monoclonal antibodies allowed the classification of hRSV in subgroups A and B [11, 12]. Both subgroups normally co-circulate, although hRSV subgroup A (hRSV-A) has been found to be prevalent in many regions [13–17], whereas hRSV-B was found predominant in the elderly in one study [16]. Some studies have identified a link between severe disease and hRSV-A, while others have indicated that the viral subgroup is not associated with severity [14, 18–20]. Further studies are warranted to determine if there exist differences in pathogenicity between hRSV-A and hRSV-B.

Genome sequencing has identified a distinctive higher variability in the gene encoding the attachment G glycoprotein. According to this variability, hRSV has been categorized into genotypes, with over 20 for hRSV-A and up to 36 for hRSV-B [21–23], although alternative classifications have been proposed [24]. However, there is still no consensus on criteria for defining hRSV genotypes, which preferably should consider complete genomes, as some variants with identical G gene sequence have shown significant changes in other genes [22, 25]. Characterizing hRSV variants is relevant for understanding the molecular basis underlying genetic diversity, implicating natural immunity, prophylaxis with monoclonal antibodies, and antiviral therapies. Additionally, it may contribute to the intelligent design of drugs and vaccines [25].

Whole genome sequencing of hRSV has been challenging, given that viral loads in respiratory samples are not always sufficient to achieve complete genome amplification [26, 27]. There are protocols designed for covering the hRSV genome with 6–10 amplicons, which show high-performance sequencing of samples with viral loads associated with cycle threshold (Ct) values ≤27 [26, 28, 29].

Herein, we describe an amplicon-based protocol for whole hRSV-A genome sequencing isolated from nasopharyngeal swabs and infected cell cultures with Ct values from 10 to 31.

Materials and methods

Clinical samples

Nasopharyngeal swabs were collected from patients with moderate to severe respiratory illness attending to the Instituto Nacional de Enfermedades Respiratorias Ismael Cosio Villegas (INER), Mexico City, from October 2022 to March 2023 and were evaluated by routine molecular testing to identify respiratory pathogens, by using the FilmArray Respiratory Panel kit (Biomérieux). Samples positive for hRSV-A were identified using one-step quantitative reverse-transcription–polymerase chain reaction (RT-qPCR) and a TaqMan assay, as outlined in the subsequent sections. For this study, we selected a batch of 22 nasopharyngeal swabs positive for hRSV-A preserved at −80°C. The study was approved by the Committees of Science, Biosecurity, and Bioethics of the INER (protocol number B10-20) and of the Faculty of Medicine of the National Autonomous University of Mexico (protocol number FM/DI/064/2020).

Cell culture and hRSV infection

The mouse macrophage cell line (P388D1) has remained infected by the hRSV subgroup A strain Long (VR-26, ATCC), for over 150 passages and it is maintained in RPMI-1640 supplemented with 5% fetal bovine serum (Biowest), 1% penicillin-streptomycin (Invitro) and 1 µM 2-mercaptoethanol (Sigma), at 37°C and 5% CO₂. This persistently hRSV-infected macrophage cell line (piMø) has been previously characterized [30, 31]. The original P388D1 macrophages are maintained under the same conditions and are used as mock-infected cells.

The Vero cell line was grown in Dulbecco’s Modified Eagle Medium, supplemented with 5% fetal bovine serum, 10 mM HEPES (SIGMA), and 1% penicillin-streptomycin (Invitro). Vero cells were seeded in a Petri dish (1 × 10⁶/ml) and infected with hRSV-A strain Long at multiplicity of infection (moi) of 1 for 24 h. Total RNA was isolated from infected piMø and Vero cells as described in the following section.

RNA extraction

Viral RNA was extracted from 280 µl of nasal swabs, using QIAamp Viral RNA mini kit (QIAGEN) according to the manufacturer’s instructions, excepting that the sample was divided in two aliquots of 140 µl and treated as independent samples before they were added consecutively to the spin-column. Duplicating the sample input volume resulted in a 2.4-fold increase in the yield of total RNA (Supplementary Material S1).

As positive controls of hRSV-A infection, total RNA was extracted from piMø and Vero cells infected at moi of 1 for 24 h, by using the RNA purification kit GeneAll Hybrid-R, with spin-column format (GeneAll Biotechnology), according to the manufacturer’s instructions. Another control included was total RNA from non-infected macrophages. RNA was quantified using a NanoDrop spectrophotometer ND-2000 (Thermo Scientific).

One-step quantitative RT-PCR

To estimate hRSV-A viral loads in nasopharyngeal swabs and infected cell cultures, Ct values were determined by one-step RT-qPCR. Reactions were performed with 50 ng of RNA and the SuperScript III Platinum One-Step qRT-PCR kit (Invitrogen), according to the manufacturer’s instructions. This assay was performed with the TaqMan probe against the nucleoprotein gene (N), with sequence 5′-GGCAATGCTGCTGGCCTAGGCATAATG-3′ at a final concentration of 100 nM; also the primers, Fw 5′-AATTTCCTCATTTTCCAGTGTAG-3′ and Rv 5′-TGATTCCTCGGTGTACCTCTG-3′, at final concentration of 400 nM. The following cycling conditions were programed in the Thermocycler Step-One Plus Real Time PCR system (Applied Biosystems): one cycle of 50°C for 15 min, one cycle of 95°C for 2 min, followed by 40 cycles of 95° C for 15 s and 60°C for 30 s.

Additionally, absolute quantification of virus genome copies was performed with a plasmid DNA standard curve that includes as a template the same 85 bp region from the hRSV N gene targeted by the described probe and primers. Standard curve included 10-fold serial dilutions starting with 1 × 10⁷ N gene copies. Mean Ct values were plotted versus the log10 of standard concentrations and equation was: $y = - 2.986 \log (x) + 42.265; R^{2} = 0.98$ ⁠.

Primer design

A total of 2050 complete or partial hRSV-A sequences were downloaded from GenBank; alignment was performed considering four reference strains (AY911262.1, KF530268.1, KM042390.1, and MT492011.1) in MAFFT (v.7.475) and visualized in Unipro UGENE. Primers were designed targeting 34 regions broadly conserved. The average amplicon size was 1261 bp, with a length of 705 bp for the shortest and 1612 bp for the longest (Table 1). The overlap between adjacent amplicons was on average 281 bp (Fig. 1A). These 17 amplicons allow amplification and complete sequencing of the hRSV-A genome.

Figure 1.

Schematic protocol for amplification and sequencing of the hRSV genome. (A) hRSV-A genome was amplified from nucleotide 1 to 15 181 by using the 17-primer panel from Table 2. (B) Schematic diagram to show sample processing and virus genome amplification for sequencing on Nanopore platform. Figure was created in Biorender.

Open in new tab Download slide

Table 1.

Open in new tab

hRSV-A genome copies in the analyzed nasopharyngeal swabs and infected cells.

Sample	Ct	Genome copies (log₁₀)	Sample	Ct	Genome copies (log₁₀)
1	27.7	4.9	13	28.7	4.5
2	28.8	4.5	14	29.6	4.2
3	30.1	4.1	15	28.2	4.7
4	27.3	5.0	16	25.2	5.7
5	28.2	4.7	17	27.6	4.9
6	20.9	7.1	18	28.9	4.5
7	31.2	3.7	19	27.1	5.1
8	28.8	4.5	20	22.5	6.6
9	30.0	4.1	21	23.9	6.1
10	29.9	4.1	22	20.9	7.1
11	25.5	5.6	23*	12.0	10.1
12	25.8	5.5	24*	12.3	10.0

Sample	Ct	Genome copies (log₁₀)	Sample	Ct	Genome copies (log₁₀)
1	27.7	4.9	13	28.7	4.5
2	28.8	4.5	14	29.6	4.2
3	30.1	4.1	15	28.2	4.7
4	27.3	5.0	16	25.2	5.7
5	28.2	4.7	17	27.6	4.9
6	20.9	7.1	18	28.9	4.5
7	31.2	3.7	19	27.1	5.1
8	28.8	4.5	20	22.5	6.6
9	30.0	4.1	21	23.9	6.1
10	29.9	4.1	22	20.9	7.1
11	25.5	5.6	23*	12.0	10.1
12	25.8	5.5	24*	12.3	10.0

Samples 23 and 24 were derived from piMø and Vero cells infected for 24 h, respectively.

Table 1.

Open in new tab

hRSV-A genome copies in the analyzed nasopharyngeal swabs and infected cells.

Sample	Ct	Genome copies (log₁₀)	Sample	Ct	Genome copies (log₁₀)
1	27.7	4.9	13	28.7	4.5
2	28.8	4.5	14	29.6	4.2
3	30.1	4.1	15	28.2	4.7
4	27.3	5.0	16	25.2	5.7
5	28.2	4.7	17	27.6	4.9
6	20.9	7.1	18	28.9	4.5
7	31.2	3.7	19	27.1	5.1
8	28.8	4.5	20	22.5	6.6
9	30.0	4.1	21	23.9	6.1
10	29.9	4.1	22	20.9	7.1
11	25.5	5.6	23*	12.0	10.1
12	25.8	5.5	24*	12.3	10.0

Sample	Ct	Genome copies (log₁₀)	Sample	Ct	Genome copies (log₁₀)
1	27.7	4.9	13	28.7	4.5
2	28.8	4.5	14	29.6	4.2
3	30.1	4.1	15	28.2	4.7
4	27.3	5.0	16	25.2	5.7
5	28.2	4.7	17	27.6	4.9
6	20.9	7.1	18	28.9	4.5
7	31.2	3.7	19	27.1	5.1
8	28.8	4.5	20	22.5	6.6
9	30.0	4.1	21	23.9	6.1
10	29.9	4.1	22	20.9	7.1
11	25.5	5.6	23*	12.0	10.1
12	25.8	5.5	24*	12.3	10.0

Samples 23 and 24 were derived from piMø and Vero cells infected for 24 h, respectively.

hRSV-A genome amplification by two-step RT-PCR

cDNA was synthesized from 1 µg of total RNA by using the Superscript III first-strand synthesis system for RT-PCR (Invitrogen cat. 18080-051). Twenty microliter of RT reactions included: 1 µl of random hexamers, 1 µl of dNTP mix (10 mM), RNA (1 µg), 2 µl of 10× RT buffer, 4 µl of 25 mM MgCl₂, 2 µl of 0.1 M DTT, 1 µl of RNase out, 1 µl of Superscript III and DEPC-treated water if necessary to complete 20 µl. Synthesis was run at 50°C for 50 min and enzyme was subsequently inactivated at 85°C for 5 min. An additional step was performed to eliminate residual RNA by adding 1 µl of RNase H and incubating at 37°C for 20 min.

Genome amplification by conventional PCR was performed in a 50 µl reaction system, containing 2 µl of cDNA, 1 µl of dNTP mix (10 mM), 5 µl of 10× PCR buffer, 1.5 µl of MgCl₂ (50 mM), 1 µl of primer forward (10 µM), 1 µl of primer reverse (10 µM), 0.2 µl (2 U) of Platinum Taq DNA Polymerase (Invitrogen, cat. 10966-026) and DEPC-treated water up to 50 µl. A positive control prepared with cDNA from piMø was always included.

The PCR cycle condition was an initial denaturation at 95°C for 3 min; 40 cycles of 95°C for 30 s, annealing to specific temperature for 30 s and extension at 72°C for 40 s; finally, an extension at 72°C for 10 min.

Primer specificity was evaluated by using cDNA prepared from total RNA of non-infected macrophages. Duplex PCRs were tested by using primer pairs with the same annealing temperature but producing amplicons of different length.

Agarose gel electrophoresis

Amplicons were visualized by agarose gel electrophoresis in 0.8% agarose gels prepared with tris-acetate-EDTA buffer and GelRed (Gold Biotechnology). Both amplicons and DNA ladder (Sigma, cat. D3937) were mixed with 6× loading buffer (30 mM EDTA, 40% glycerol, and 0.2% bromophenol blue) and gels were run at 100 V. DNA bands were visualized in a Gel Documentation System (Major Sciences).

Library preparation and viral genome sequencing

The amplicons were individually purified using the QIAquick PCR Purification Kit (28104, QIAGEN, Hilden, Germany) following the supplier's instructions. Concentrations were determined using a NanoDrop 2000 (Thermo Fisher), dilutions were prepared to pool all amplicons per genome in equal concentration (20 ng/µl). Sequencing libraries were barcoded and multiplexed using the Ligation Sequencing Kit and Native Barcoding Expansion pack (SQK-LSL109 ligation kit and NDB-104, NBD-114 nanopore). A total of 15 ng (24 samples) was loaded onto a MinION R9.4.1 flow cell and sequenced for a total of 16 hours and generated 4.2 million reads.

Sequence data analysis

Following completion of the sequencing runs, fast5 files were basecalled with Guppy (v3.5.1, ONT) using the high accuracy module. Consensus genomes were generated using Canu and Nanopolish per the bioinformatic pipeline [32]. hRSV-A contigs were assembled using as reference the GeneBank sequence AY911262. The pipeline used for the nanopore sequencing is presented in the Supplementary Material S2.

Phylogenetic analysis

To perform phylogenetic analysis, we analyzed complete sequences (n = 17) and selected 307 sequences (185 hRSV-A and 122 hRSV-B) available on the GenBank platform. The selected sequences are from different states in the USA reported in 2022. Sequence alignments were created using MAFFT V7 [33] and edited with MEGA 10.0 [34]. A maximum likelihood tree was constructed for the whole genome sequence using MEGA 10.0. The General Time-Reversible model was selected with five-parameter gamma-distributed rates and 1000 bootstrap replicates. Edition of the trees was made using FigTree [35]. hRSV-A clades were assigned by using the real-time phylogenetic analysis in Nextclade [36] (nextclade) (https://clades.nextstrain.org).

Results

To validate our sequencing protocol, we selected 22 nasopharyngeal swabs that tested positive for hRSV-A, displaying Ct values ranging from 20.9 to 31.2. These values correspond to a spectrum of 7.1 log₁₀ to 3.7 log₁₀ virus genome copies. We also included two samples from piMø and acutely infected Vero cells, which showed Ct values of 12.0 and 12.3, equivalent to ∼10.0 log₁₀ virus genome copies (Table 1).

Viral genome amplification was performed with our 17-primer panel (pair of primers were named with letters A to Q) to amplify the hRSV-A genome from nucleotide 1 to 15 181 in individual reactions to generate amplicons with sizes between 705 and 1612 bp (Table 2). The amplicons generated by the primer pairs P and Q include 3′ and 5′ promoter regions, respectively.

Table 2.

Open in new tab

Characteristics of the primer panel to amplify the hRSV-A genome.

Primer pair (name)	Primer (position Fw, Rv)	Sequence (5′–3′)	T (°C) annealing	Amplicon size
A	43F	TGGGGCAAATAAGAATTTGATAAG	57	1101
A	1144R	CAACTTGACTTTGCTAAGAGC	57	1101
B	827F	RTGAARCTATTGCACAAAGTRGG	58	1409
B	2236R	ATCTTGRTTYCTCGGTGTACCTC	58	1409
C	2018F	ARTGGGWGGAGAAGCWGG	60	1237
C	3255R	CACGTATGTTTCCATATTTGCCC	60	1237
D	2951F	GAAAGYGAAAARATGGCAAAAGACAC	58	1036
D	3987R	CGTGTAGCTGTRTGCTTCC	58	1036
E	3568F	RGCATATGATGTAACYACACC	51	1167
E	4735R	GATTRAGAGTRTCCCAGGTC	51	1167
F	4329F	AATTCWCAAGCAAATTYTGGCC	58	1512
F	5841R	CTYTGCTAACTGCACTRCATG	58	1512
G	5620F	CTGGGGCAAATAACMATGGAG	61	1579
G	7199R	CATTGACTTGAGATATTGATGCATC	61	1579
H	6966F	GAGCYATWGTGTCATGCTATGG	60	705
H	7671R	CTTCGYGACATATTTGCCCC	60	705
I	7384F	TAARGCCARAARCACACCAG	61	1180
I	8564R	AATAATGGGATCCATTTTGTCCC	61	1180
J	8186F	RGATATCCACAARAGCATAACC	59	1612
J	9798R	AAGRTTATTRTCACCTGCAAGC	59	1612
K	9465F	CTTAGGSTTRAGATGYGGATTC	58	1000
K	10465R	GTCTRAACATWCCCGGTTGC	58	1000
L	10212F	YTRATATGGACTAGTTTYCCTAG	54	1546
L	11758R	TTGTCTYTCAGAYCCTAAAGC	54	1546
M	11518F	ATGYTRTTTGGTGGTGGTGATCC	59	1305
M	12823R	GGAGGTTTCATCAAATGTATCTC	59	1305
N	12589F	YTWACAGTCAGTAGTAGACCATG	56	1414
N	14003R	YATGATGCCAAGGAAGCATGC	56	1414
O	13904F	RRATTATAGATCATTCAGGTAATACAG	53	1118
O	15022R	AMACTACCTGTRATTTTAATCAG	53	1118
P	1F	ACGGGAAAAAAWGCGTACWACAAAC	59	1144
P	1144R*	CAACTTGACTTTGCTAAGAGC	59	1144
Q	13904F*	RRATTATAGATCATTCAGGTAATACAG	55	1277
Q	15181R	ATCAAARTGRTAATTTTAGGATTAGTTCAC	55	1277

Primer pair (name)	Primer (position Fw, Rv)	Sequence (5′–3′)	T (°C) annealing	Amplicon size
A	43F	TGGGGCAAATAAGAATTTGATAAG	57	1101
A	1144R	CAACTTGACTTTGCTAAGAGC	57	1101
B	827F	RTGAARCTATTGCACAAAGTRGG	58	1409
B	2236R	ATCTTGRTTYCTCGGTGTACCTC	58	1409
C	2018F	ARTGGGWGGAGAAGCWGG	60	1237
C	3255R	CACGTATGTTTCCATATTTGCCC	60	1237
D	2951F	GAAAGYGAAAARATGGCAAAAGACAC	58	1036
D	3987R	CGTGTAGCTGTRTGCTTCC	58	1036
E	3568F	RGCATATGATGTAACYACACC	51	1167
E	4735R	GATTRAGAGTRTCCCAGGTC	51	1167
F	4329F	AATTCWCAAGCAAATTYTGGCC	58	1512
F	5841R	CTYTGCTAACTGCACTRCATG	58	1512
G	5620F	CTGGGGCAAATAACMATGGAG	61	1579
G	7199R	CATTGACTTGAGATATTGATGCATC	61	1579
H	6966F	GAGCYATWGTGTCATGCTATGG	60	705
H	7671R	CTTCGYGACATATTTGCCCC	60	705
I	7384F	TAARGCCARAARCACACCAG	61	1180
I	8564R	AATAATGGGATCCATTTTGTCCC	61	1180
J	8186F	RGATATCCACAARAGCATAACC	59	1612
J	9798R	AAGRTTATTRTCACCTGCAAGC	59	1612
K	9465F	CTTAGGSTTRAGATGYGGATTC	58	1000
K	10465R	GTCTRAACATWCCCGGTTGC	58	1000
L	10212F	YTRATATGGACTAGTTTYCCTAG	54	1546
L	11758R	TTGTCTYTCAGAYCCTAAAGC	54	1546
M	11518F	ATGYTRTTTGGTGGTGGTGATCC	59	1305
M	12823R	GGAGGTTTCATCAAATGTATCTC	59	1305
N	12589F	YTWACAGTCAGTAGTAGACCATG	56	1414
N	14003R	YATGATGCCAAGGAAGCATGC	56	1414
O	13904F	RRATTATAGATCATTCAGGTAATACAG	53	1118
O	15022R	AMACTACCTGTRATTTTAATCAG	53	1118
P	1F	ACGGGAAAAAAWGCGTACWACAAAC	59	1144
P	1144R*	CAACTTGACTTTGCTAAGAGC	59	1144
Q	13904F*	RRATTATAGATCATTCAGGTAATACAG	55	1277
Q	15181R	ATCAAARTGRTAATTTTAGGATTAGTTCAC	55	1277

These primers are also included in pairs A and O.

Table 2.

Open in new tab

Characteristics of the primer panel to amplify the hRSV-A genome.

Primer pair (name)	Primer (position Fw, Rv)	Sequence (5′–3′)	T (°C) annealing	Amplicon size
A	43F	TGGGGCAAATAAGAATTTGATAAG	57	1101
A	1144R	CAACTTGACTTTGCTAAGAGC	57	1101
B	827F	RTGAARCTATTGCACAAAGTRGG	58	1409
B	2236R	ATCTTGRTTYCTCGGTGTACCTC	58	1409
C	2018F	ARTGGGWGGAGAAGCWGG	60	1237
C	3255R	CACGTATGTTTCCATATTTGCCC	60	1237
D	2951F	GAAAGYGAAAARATGGCAAAAGACAC	58	1036
D	3987R	CGTGTAGCTGTRTGCTTCC	58	1036
E	3568F	RGCATATGATGTAACYACACC	51	1167
E	4735R	GATTRAGAGTRTCCCAGGTC	51	1167
F	4329F	AATTCWCAAGCAAATTYTGGCC	58	1512
F	5841R	CTYTGCTAACTGCACTRCATG	58	1512
G	5620F	CTGGGGCAAATAACMATGGAG	61	1579
G	7199R	CATTGACTTGAGATATTGATGCATC	61	1579
H	6966F	GAGCYATWGTGTCATGCTATGG	60	705
H	7671R	CTTCGYGACATATTTGCCCC	60	705
I	7384F	TAARGCCARAARCACACCAG	61	1180
I	8564R	AATAATGGGATCCATTTTGTCCC	61	1180
J	8186F	RGATATCCACAARAGCATAACC	59	1612
J	9798R	AAGRTTATTRTCACCTGCAAGC	59	1612
K	9465F	CTTAGGSTTRAGATGYGGATTC	58	1000
K	10465R	GTCTRAACATWCCCGGTTGC	58	1000
L	10212F	YTRATATGGACTAGTTTYCCTAG	54	1546
L	11758R	TTGTCTYTCAGAYCCTAAAGC	54	1546
M	11518F	ATGYTRTTTGGTGGTGGTGATCC	59	1305
M	12823R	GGAGGTTTCATCAAATGTATCTC	59	1305
N	12589F	YTWACAGTCAGTAGTAGACCATG	56	1414
N	14003R	YATGATGCCAAGGAAGCATGC	56	1414
O	13904F	RRATTATAGATCATTCAGGTAATACAG	53	1118
O	15022R	AMACTACCTGTRATTTTAATCAG	53	1118
P	1F	ACGGGAAAAAAWGCGTACWACAAAC	59	1144
P	1144R*	CAACTTGACTTTGCTAAGAGC	59	1144
Q	13904F*	RRATTATAGATCATTCAGGTAATACAG	55	1277
Q	15181R	ATCAAARTGRTAATTTTAGGATTAGTTCAC	55	1277

Primer pair (name)	Primer (position Fw, Rv)	Sequence (5′–3′)	T (°C) annealing	Amplicon size
A	43F	TGGGGCAAATAAGAATTTGATAAG	57	1101
A	1144R	CAACTTGACTTTGCTAAGAGC	57	1101
B	827F	RTGAARCTATTGCACAAAGTRGG	58	1409
B	2236R	ATCTTGRTTYCTCGGTGTACCTC	58	1409
C	2018F	ARTGGGWGGAGAAGCWGG	60	1237
C	3255R	CACGTATGTTTCCATATTTGCCC	60	1237
D	2951F	GAAAGYGAAAARATGGCAAAAGACAC	58	1036
D	3987R	CGTGTAGCTGTRTGCTTCC	58	1036
E	3568F	RGCATATGATGTAACYACACC	51	1167
E	4735R	GATTRAGAGTRTCCCAGGTC	51	1167
F	4329F	AATTCWCAAGCAAATTYTGGCC	58	1512
F	5841R	CTYTGCTAACTGCACTRCATG	58	1512
G	5620F	CTGGGGCAAATAACMATGGAG	61	1579
G	7199R	CATTGACTTGAGATATTGATGCATC	61	1579
H	6966F	GAGCYATWGTGTCATGCTATGG	60	705
H	7671R	CTTCGYGACATATTTGCCCC	60	705
I	7384F	TAARGCCARAARCACACCAG	61	1180
I	8564R	AATAATGGGATCCATTTTGTCCC	61	1180
J	8186F	RGATATCCACAARAGCATAACC	59	1612
J	9798R	AAGRTTATTRTCACCTGCAAGC	59	1612
K	9465F	CTTAGGSTTRAGATGYGGATTC	58	1000
K	10465R	GTCTRAACATWCCCGGTTGC	58	1000
L	10212F	YTRATATGGACTAGTTTYCCTAG	54	1546
L	11758R	TTGTCTYTCAGAYCCTAAAGC	54	1546
M	11518F	ATGYTRTTTGGTGGTGGTGATCC	59	1305
M	12823R	GGAGGTTTCATCAAATGTATCTC	59	1305
N	12589F	YTWACAGTCAGTAGTAGACCATG	56	1414
N	14003R	YATGATGCCAAGGAAGCATGC	56	1414
O	13904F	RRATTATAGATCATTCAGGTAATACAG	53	1118
O	15022R	AMACTACCTGTRATTTTAATCAG	53	1118
P	1F	ACGGGAAAAAAWGCGTACWACAAAC	59	1144
P	1144R*	CAACTTGACTTTGCTAAGAGC	59	1144
Q	13904F*	RRATTATAGATCATTCAGGTAATACAG	55	1277
Q	15181R	ATCAAARTGRTAATTTTAGGATTAGTTCAC	55	1277

These primers are also included in pairs A and O.

A diagram illustrating amplicon distribution and overlap is presented in Fig. 1A, while the sequencing protocol is summarized in Fig. 1B.

Primer specificity was confirmed by conventional PCR with cDNA from piMø, hRSV positive nasopharyngeal swabs, and mock-infected macrophages. Seventeen amplicons of the expected size were synthesized when cDNA from the hRSV-infected sources was used as a template (Fig. 2A–C), while cDNA from mock-infected cells showed negative amplification (Fig. 2A–B), confirming primers’ specificity. Three duplex PCRs were achieved with primer pairs D/M, C/H and G/I, to amplify the hRSV genome from nasal swabs in 14 reactions (Fig. 2D). However, some duplex reactions produced lower yields of the shortest amplicons; therefore, we recommend performing singleplex PCRs.

Figure 2.

hRSV-A genome amplification. (A) and (B) PCRs were prepared with cDNA from hRSV-persistently infected macrophages (piMø) or mock-infected macrophages to validate the 17-primer panel. Amplicons were produced only from the hRSV-infected sample (I), in contrast to the mock-infected sample (M). (C) Representative gels with 17 amplicons synthesized from a hRSV-A positive nasopharyngeal swab. (D) Three duplex reactions accomplished by combining primers D/M, C/H and G/I.

Open in new tab Download slide

The Nanopore platform was employed for sequencing between 3200 and 5400 reads per sample. We obtained a sequence length of 97.7% ± 1.8% for samples with Ct values of 23.2 ± 1.9 and 96.8% ± 3.3% for hRSV-A samples with Ct values of 28.7 ± 1.1; nevertheless, five samples exhibited values <90%. The coverage of each contig ranged from 130× to 200× (Table 3). This protocol enabled successful sequencing of nasopharyngeal swabs with Ct values up to 30.

Table 3.

Open in new tab

Nanopore sequencing results.

Sample	Assembled reads	Sequence length (%)	Mean depth coverage*
1	4513	15034 (99.5)	137
2	5302	14838 (99.5)	198
3	3296	14936 (93.4)	138
4	4704	15031 (91.6)	148
5	4688	15102 (99.9)	139
6	4571	15018 (99.4)	148
7	4734	12510 (82.8)	159
8	4617	13530 (89.5)	145
9	4644	15056 (93.1)	137
10	4579	15028 (94.4)	145
11	4555	12955 (85.7)	132
12	4645	15029 (94.4)	121
13	4795	15078 (94.8)	151
14	4669	15031 (99.4)	143
15	4558	15034 (99.5)	136
16	4587	15026 (99.5)	125
17	4602	13530 (89.5)	130
18	4801	11510 (76.2)	144
19	4725	15078 (99.5)	128
20	4807	15089 (97.8)	138
21	4537	14829 (98.5)	199
22	4819	14876 (96.6)	145
23	5430	15036 (98.9)	200
24	5336	15024 (98.8)	200

Sample	Assembled reads	Sequence length (%)	Mean depth coverage*
1	4513	15034 (99.5)	137
2	5302	14838 (99.5)	198
3	3296	14936 (93.4)	138
4	4704	15031 (91.6)	148
5	4688	15102 (99.9)	139
6	4571	15018 (99.4)	148
7	4734	12510 (82.8)	159
8	4617	13530 (89.5)	145
9	4644	15056 (93.1)	137
10	4579	15028 (94.4)	145
11	4555	12955 (85.7)	132
12	4645	15029 (94.4)	121
13	4795	15078 (94.8)	151
14	4669	15031 (99.4)	143
15	4558	15034 (99.5)	136
16	4587	15026 (99.5)	125
17	4602	13530 (89.5)	130
18	4801	11510 (76.2)	144
19	4725	15078 (99.5)	128
20	4807	15089 (97.8)	138
21	4537	14829 (98.5)	199
22	4819	14876 (96.6)	145
23	5430	15036 (98.9)	200
24	5336	15024 (98.8)	200

Depth coverages were calculated considering only the successfully sequenced amplicons.

Table 3.

Open in new tab

Nanopore sequencing results.

Sample	Assembled reads	Sequence length (%)	Mean depth coverage*
1	4513	15034 (99.5)	137
2	5302	14838 (99.5)	198
3	3296	14936 (93.4)	138
4	4704	15031 (91.6)	148
5	4688	15102 (99.9)	139
6	4571	15018 (99.4)	148
7	4734	12510 (82.8)	159
8	4617	13530 (89.5)	145
9	4644	15056 (93.1)	137
10	4579	15028 (94.4)	145
11	4555	12955 (85.7)	132
12	4645	15029 (94.4)	121
13	4795	15078 (94.8)	151
14	4669	15031 (99.4)	143
15	4558	15034 (99.5)	136
16	4587	15026 (99.5)	125
17	4602	13530 (89.5)	130
18	4801	11510 (76.2)	144
19	4725	15078 (99.5)	128
20	4807	15089 (97.8)	138
21	4537	14829 (98.5)	199
22	4819	14876 (96.6)	145
23	5430	15036 (98.9)	200
24	5336	15024 (98.8)	200

Sample	Assembled reads	Sequence length (%)	Mean depth coverage*
1	4513	15034 (99.5)	137
2	5302	14838 (99.5)	198
3	3296	14936 (93.4)	138
4	4704	15031 (91.6)	148
5	4688	15102 (99.9)	139
6	4571	15018 (99.4)	148
7	4734	12510 (82.8)	159
8	4617	13530 (89.5)	145
9	4644	15056 (93.1)	137
10	4579	15028 (94.4)	145
11	4555	12955 (85.7)	132
12	4645	15029 (94.4)	121
13	4795	15078 (94.8)	151
14	4669	15031 (99.4)	143
15	4558	15034 (99.5)	136
16	4587	15026 (99.5)	125
17	4602	13530 (89.5)	130
18	4801	11510 (76.2)	144
19	4725	15078 (99.5)	128
20	4807	15089 (97.8)	138
21	4537	14829 (98.5)	199
22	4819	14876 (96.6)	145
23	5430	15036 (98.9)	200
24	5336	15024 (98.8)	200

Depth coverages were calculated considering only the successfully sequenced amplicons.

Phylogenetic analyses were performed with the consensus sequences, in vitro isolates, and sequences from USA during 2022. These analyses confirmed the identity of the virus from cell cultures and hRSV-A Mexican sequences from winter season 2022–2023 (Fig. 3). We detected the most recent clades of hRSV-A, A.D.1, A.D.3, A.D.5.2, and A.D.5.3.

Figure 3.

Maximum likelihood phylogenetic tree of whole genome consensus sequences for 17 samples from this study (red dots) and 307 other sequences of hRSV-A from the USA from 2022. Viral isolates from cell culture and the reference strain Long (AY911262.1) are marked with green dots. piM0: persistent hRSV isolate from piMø. In vitro adapted: hRSV-A infecting Vero cells for 24 h.

Open in new tab Download slide

Discussion

Complete hRSV genome sequence data are of relevance for molecular epidemiology and surveillance of circulating viral variants. Furthermore, they are useful for the identification of molecular targets, thereby facilitating the design of vaccines, antiviral drugs, and immunotherapy [37, 38].

In this study, we present a protocol for the comprehensive sequencing of the hRSV-A genome. This protocol is suitable for virus isolates from respiratory specimens and cell culture-adapted viruses, including a persistently infected macrophage cell line with over 150 passages [30, 31]. The RNA extraction process was optimized by using 280 µl of nasopharyngeal swab specimen. This approach resulted in the successful isolation of total RNA from 22 samples, exhibiting concentrations ranging from 119.7 to 233.8 ng/µl, allowing cDNA preparation from a standardized quantity of 1 µg of RNA. The preparation of aliquots from respiratory samples to prevent repeated freeze-thaw cycles, coupled with the performance of reverse transcription reactions in the presence of a ribonuclease inhibitor, effectively preserved RNA quality. This procedure was important in optimizing amplicon synthesis in subsequent PCRs. We selected the long-read sequencer MinION, obtaining an accurate full-length consensus sequence, adapting the correction tool Nanopolish in the bioinformatic pipeline [32]. hRSV from nasopharyngeal swabs as expected, belonged to subgroup A and were grouped together with virus circulating in the USA during 2022. Interestingly, viruses from nasopharyngeal swabs clustered into four different clades, indicating that this protocol is successful for identification of genotypes, as primers were designed to target highly conserved regions of the hRSV-A. The G gene has historically been used to classify hRSV-A and hRSV-B, but this classification has limitations when relying solely on one gene. Analyzing the complete viral genome provides a more comprehensive genetic information and could serve as an alternative for establishing a more accurate genomic epidemiology. By using the complete genome, we successfully classified 17 samples from Mexico collected during the post-pandemic winter season 2022–2023 and identify them within the most recent clades A3 [22, 39, 40]. The persistent hRSV-A isolate from piMø and the strain Long (AY911262.1) were grouped together in a distinct clade. This result was anticipated, considering that the persistent virus originates from the prototype strain Long, which was isolated in 1956. Since then, hRSV-A has undergone numerous genomic changes [3].

Complete genome amplification was accomplished not only from samples with low Ct values but also from samples with a Ct of 30, enabling sequencing of >93% of the hRSV-A genome. A previous protocol devised by Wang et al., showed genome coverage >99% for samples with Ct ≤ 24 amplified with 10 primer pairs in a one-step RT-PCR. Furthermore, genome coverage >99.6% was achieved for samples with Ct values ranging from 24 to 33 using nested PCRs, which involved 40 reactions per sample, thereby rendering it a high-cost method [28]. Our approach showed genome coverage exceeding 98.5% for 11/20 samples, with Ct values ranging from 10 to 29.6, when utilizing the amplification protocol with 17 primer pairs in individual PCRs. There is a potential to optimize and diminish the number of reactions to 14, thereby reducing amplification costs.

A limitation of our protocol involves a small sample size during validation and its focus on the hRSV subgroup A. However, as the hRSV season is currently underway, we have access to additional samples to assess and verify the performance of our method. Additionally, it is important to corroborate well-visible bands for each amplicon in agarose electrophoresis to secure whole-genome coverage, since low yield of certain amplicons reduces sequence lengths.

This study enabled the sequencing of complete hRSV-A genomes with depth coverages exceeding 100× and genome coverages ranging from 93% to 99% from nasopharyngeal swabs with Ct values up to 30. High deep coverage is necessary to detect minor variants in viral genomes, as well as polymorphisms frequently observed in RNA virus (quasispecies). For complete genome analysis of severe acute respiratory syndrome coronavirus 2, 60× nanopore deep coverage ensures suitability [41]. Based on these findings, our results are of significant value for molecular epidemiology, the identification of determinants of pathogenicity, and evolutionary studies of hRSV [42].

Acknowledgements

We thank Ana Flisser for providing infrastructure and comments to this study.

Author contributions

Joel Armando Vázquez-Pérez (Conceptualization [supporting], Formal analysis [lead], Methodology [equal], Writing—original draft [equal]), Eber Martínez-Alvarado (Investigation [supporting], Methodology [equal]), Alberto Antony Venancio-Landeros (Investigation [supporting], Methodology [supporting]), Carlos Santiago-Olivares (Methodology [supporting]), Fidencio Mejía-Nepomuceno (Methodology [supporting]), Enrique Mendoza-Ramírez (Methodology [supporting]), and Evelyn Rivera-Toledo (Conceptualization [lead], Formal analysis [equal], Funding acquisition [lead], Investigation [lead], Methodology [lead], Project administration [lead], Supervision [lead], Writing—original draft [lead])

Supplementary data

Supplementary data is available at Biology Methods and Protocols online.

Conflict of interest statement. None declared.

Funding

This work was supported by Grant IA205521, Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT), Direccion General de Asuntos Personal Academico (DGAPA), and Universidad Nacional Autonoma de Mexico (UNAM), Mexico.

Data availability

The data underlying this article will be shared on request to the corresponding author.

References

Afonso

Amarasinghe

Bányai

et al.

Taxonomy of the order mononegavirales: update 2016

Arch Virol

2016

;

161

2351

–

Month:	Total Views:
February 2024	387
March 2024	209
April 2024	114
May 2024	131
June 2024	81
July 2024	81
August 2024	146
September 2024	93
October 2024	65
November 2024	79
December 2024	96
January 2025	121
February 2025	73
March 2025	125
April 2025	58

Article Contents

An amplicon-based protocol for whole-genome sequencing of human respiratory syncytial virus subgroup A

Abstract

Introduction

Materials and methods

Clinical samples

Cell culture and hRSV infection

RNA extraction

One-step quantitative RT-PCR

Primer design

hRSV-A genome amplification by two-step RT-PCR

Agarose gel electrophoresis

Library preparation and viral genome sequencing

Sequence data analysis

Phylogenetic analysis

Results

Discussion

Acknowledgements

Author contributions

Supplementary data

Funding

Data availability

References

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Most Read

Most Cited

Latest

This Feature Is Available To Subscribers Only