Agnieszka Razim, Katarzyna Pacyga-Prus, Wioletta Kazana-Płuszka, Agnieszka Zabłocka, Józefa Macała, Hubert Ciepłucha, Andrzej Gamian, Sabina Górska, Differential patterns of antibody response against SARS-CoV-2 nucleocapsid epitopes detected in sera from patients in the acute phase of COVID-19, convalescents, and pre-pandemic individuals, Pathogens and Disease, Volume 82, 2024, ftae025, https://doi.org/10.1093/femspd/ftae025
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has already infected more than 0.7 billion people and caused over 7 million deaths worldwide. At the same time, our knowledge about this virus is still incipient. In some cases, there is pre-pandemic immunity; however, its source is unknown. The analysis of patients’ humoral responses might shed light on this puzzle. In this paper, we evaluated the antibody recognition of nucleocapsid protein, one of the structural proteins of SARS-CoV-2. For this purpose, we used pre-pandemic, acute COVID-19, and convalescent patients’ sera to identify and map nucleocapsid protein epitopes. We identified a common epitope, KKSAAEASKKPRQKRTATKA, recognized by serum antibodies from all three groups. Some motifs of this sequence are widespread among various coronaviruses, plants, or human proteins, indicating that there might be more sources of nucleocapsid-reactive antibodies than previous infections with seasonal coronavirus. The two sequences MSDNGPQNQRNAPRITFGGP and KADETQALPQRQKKQQTVTL were detected as specific for sera from patients in the acute phase of infection and convalescents, making them suitable for future development of vaccines against SARS-CoV-2. Knowledge of the humoral response to SARS-CoV-2 infection is essential for the design of appropriate diagnostic tools and vaccine antigens.
Introduction
SARS-CoV-2 is a member of the Betacoronavirus genus in the Coronaviridae family (Gorbalenya et al. 2020). There are two seasonal coronaviruses in the same genus, namely HCoV-HKU1 and HCoV-OC43, and two more in the Alphacoronavirus genus, HCoV-229E and HCoV-NL63 (Killerby et al. 2018). Usually, infections with seasonal coronaviruses are frequent but mild, and the acquired immunity is short-term. There are four main structural proteins that make up the SARS-CoV-2 virion: nucleocapsid (N), spike (S), membrane (M), and envelope (E) proteins (Chen et al. 2020). N-protein is used for viral genome packaging, has a conserved amino acid sequence among the Coronaviridae family, is highly immunogenic, and is expressed in great amounts during the infection. Infected patients produce excessive amounts of N-specific antibodies. N-protein has already been identified as an efficient diagnostic tool for detection of SARS-CoV-2 infection and was suggested to be an interesting vaccine antigen (Bai et al. 2021).
There are already multiple yet often conflicting reports concerning pre-pandemic immunity against SARS-CoV-2 (Sealy and Hurwitz 2021). Around 20% of individuals possess SARS-CoV-2 pre-pandemic cross-reactive serum antibodies, mostly against nucleocapsid. High cross-reactivity of antibodies against SARS-CoV-2 and other coronaviruses was demonstrated in pre-pandemic samples from sub-Saharan Africa (Tso et al. 2021), in which the nucleocapsid was the dominant cross-recognized antigen. The COVID-19 burden is much lower in sub-Saharan Africa than in the USA and coincides with higher levels of serological cross-reactivity of plasma samples from Tanzania and Zambia compared to those from the USA (Tso et al. 2021). Sagar et al. showed that patients with a recent and documented history of common cold HCoV infection have improved rates of survival when they acquire COVID-19 (Sagar et al. 2021). In contrast, other studies demonstrated that past infection with seasonal coronavirus does not provide protective immunity against SARS-CoV-2 (Anderson et al. 2021, Gombar et al. 2021). The explanation of these differences is important due to possible immunopathological effects of SARS-CoV-2. Another question is whether past seasonal infection is the only source of SARS-CoV-2 cross-reactive antibodies. There are studies that indicate a link between influenza vaccination and reduced incidence or severity of COVID-19 (Conlon et al. 2021, Tayar et al. 2023). Whether this is related to the production of cross-reactive antibodies has not been clearly established, although there are common elements between influenza viruses and coronaviruses, e.g. sialic acids (Matrosovich et al. 2015).
In this study, we evaluated the pre-pandemic sera in terms of their cross-reactivity with SARS-CoV-2 proteins and identified nucleocapsid as one of the most recognized proteins. Next, we analysed the sera of acute and convalescent COVID-19 patients in terms of anti-N antibodies. Finally, we performed in silico and empirical epitope mapping of N-protein. The obtained results shed new light on the possible origin of pre- and post-infection immunity and cross-reactivity, which should be taken into account when designing new N-protein-based vaccine antigens.
Methods
Human blood sera
Blood was obtained from patients with confirmed SARS-CoV-2 infection (moderate and severe cases) and admitted to the Infectious Diseases Admission Room of J. Gromkowski Hospital in Wrocław between November 2020 and April 2021. Blood was drawn on admission to the hospital (acute COVID-19) and 3 weeks after the last symptoms of infection had ceased (COVID-19 convalescents). Serum from healthy volunteers was collected prior to the pandemic and was previously used in our studies (Razim et al. 2018, 2021). Individual or pooled sera were used for the experiments.
Written informed consent was obtained from each patient, and the study was approved by the Bioethical Committee at the Medical University of Wrocław (approval no. KB-683/2020). Experiments were conducted in accordance with the Helsinki Declaration, 1975.
Western blots
For sodium dodecyl-sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), 4%–20% Mini-PROTEAN TGX Precast Protein Gels (Bio-Rad, USA) were used. 7 µg of SARS-CoV-2 lysate (cat. NAT41605-500, The Native Antigen Company, UK) was loaded on each lane. 2 µl of Precision Plus Protein Dual Colour Standards (Bio-Rad, USA) was used as a protein mass marker. Proteins were blotted onto a nitrocellulose membrane with a pore size of 0.45 µm, which was then blocked with Pierce™ Clear Milk Blocking Buffer (Thermo Fisher Scientific, USA). Individual sera were diluted 1:100 (pre-pandemic sera) or 1:350 (acute and convalescent sera) in Tris-buffered saline with 0.1% Tween 20 (TBS-T). Anti-human IgA antibodies (cat. SAB3701233, Merck, Germany) and anti-human IgG antibodies (cat. SAB3701340, Merck) were diluted 1:2000 and 1:7500, respectively, in Pierce™ Clear Milk Blocking Buffer in TBS-T. The colour reaction was developed with a 1:1 mix of nitro blue tetrazolium (Sigma Aldrich, USA) and 5-bromo-4-chloro-3-indolyl phosphate (Sigma Aldrich). Blots were documented with a Gel Doc System (Bio-Rad).
Enzyme-linked immunosorbent assay
96-well MaxiSorp plates were coated overnight with 2 µg/ml SARS-CoV-2 nucleocapsid protein (ab273530, Abcam, UK) in carbonate buffer. Plates were blocked with SuperBlock™ T20 (cat. 37 516, Thermo Fisher Scientific). After washing, plates were incubated with pooled sera diluted 1:400 (for IgG) or 1:100 (for IgA) in TBS-T for 2 h. Next, after washing, plates were incubated with secondary anti-human IgG (1:7500) or anti-human IgA (1:30 000) antibodies. The colour reaction was developed with Alkaline Phosphatase Yellow (Merck). Plates were read at 405 nm with a plate reader (PowerWave HT, BioTek Instruments, USA).
In silico predictions of epitopes
The SARS-CoV-2 sequence used for bioinformatics predictions was UniProt P0DTC9 (NCAP_SARS2). Predictions of 20-amino-acid-long B-cell epitopes were performed using: BepiPred 2.0 (Jespersen et al. 2017) with 0.6 threshold, 0.95116 specificity and 0.09559 sensitivity; BCPred (EL-Manzalawy et al. 2008) with 75% specificity and 0.99 cut-off; ABCPred (Saha and Raghava 2006) with 0.6 threshold. NetCTL 1.2 (Larsen et al. 2007) was used for T-cell epitope predictions, for all supertypes and with default settings. IFNepitope (Dhanda et al. 2013) was utilized for the IFNγ epitope search within the SARS-CoV-2 sequence using the Motif and SVM hybrid approaches. Results of all predictions were compared with each other, and overlapping sequences covering as many epitopes as possible were selected for further analysis.
Peptide synthesis and mapping
Peptides were synthesized on plastic pins (noncleavable peptide type, MIMOTOPES, Melbourne, Australia) using the PEPSCAN method (Carter 1996), modified by Jarząb et al. (2013). Briefly, peptides were synthesized in a 96-well plate by adding one F-moc amino acid derivative to each pin during one coupling reaction until a full-length peptide was obtained. After deprotection of the side chains, pins were dried and stored at −20°C until used.
The immunoreactivity of pin-bound peptides was tested by enzyme-linked immunosorbent assay (ELISA) against three groups of pooled sera: acute COVID-19, COVID-19 convalescents, and pre-pandemic sera as previously published (Pacyga et al. 2020). Briefly, plastic pins were blocked with 1% bovine serum albumin (BSA) in TBS-T and incubated with primary antibodies (acute COVID-19, COVID-19 convalescents, or pre-pandemic sera) in a 1:1000 dilution in TBS-T with 0.1% BSA. After washing with TBS-T, they were incubated with AP-conjugated secondary antibodies in a 1:7500 dilution (IgG) or 1:30 000 dilution (IgA) in TBS-T, and the colorimetric reaction was developed with Alkaline Phosphatase Yellow. Plates were read at 405 nm. After the assay, bound antibodies were stripped off by sonication in disruption buffer (1% sodium dodecyl sulphate, 0.1% 2-mercaptoethanol, and 0.1 M Na3PO4) preheated to 60°C, and the pins were washed in water and methanol, and dried. Each experiment was repeated at least six times. Epitopes were verified by testing the statistical significance of their level of immunoreactivity (absorbance value) against thresholds calculated for each group. Thresholds were calculated as previously published (Pacyga et al. 2020) by taking the mean of all results within the given group. Calculated thresholds: acute COVID-19 group 1.142 for IgG and 0.615 for IgA; COVID-19 convalescents 1.273 for IgG and 0.507 for IgA; pre-pandemic group 0.413 for IgG and 0.348 for IgA.
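The threshold rule described above (the mean of all results within a serum group) can be illustrated with a minimal Python sketch. The function names and the absorbance values are hypothetical and are not data from this study; the sketch only compares peptide means with the group threshold and does not reproduce the ANOVA-based significance testing.

```python
from statistics import mean

def group_threshold(readings):
    """Mean of every absorbance reading in one serum group (all peptides, all repeats)."""
    values = [a for repeats in readings.values() for a in repeats]
    return mean(values)

def above_threshold(readings, threshold):
    """Peptide numbers whose mean absorbance exceeds the group threshold."""
    return sorted(p for p, repeats in readings.items() if mean(repeats) > threshold)

# Hypothetical A405 readings (peptide number -> repeated measurements); illustration only.
example = {
    1: [0.90, 1.10, 1.00],
    2: [0.20, 0.30, 0.25],
    3: [0.50, 0.40, 0.45],
}

threshold = group_threshold(example)
candidates = above_threshold(example, threshold)
print(f"threshold = {threshold:.3f}, peptides above threshold: {candidates}")
```

In this toy group only peptide 1 lies above the threshold; whether such a difference is significant would still have to be tested statistically.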
Amino acid sequence analysis
Basic Local Alignment Search Tool (BLAST) on the UniProt server (https://www.uniprot.org/blast/) was used to search for similar amino acid sequences in proteins from other organisms (Altschul et al. 1997). UniProtKB settings: reference proteomes plus Swiss-Prot database, E-threshold = 10, auto matrix, no filtering, gapped sequences included.
The Immune Epitope Database (IEDB) was used to search for known epitopes with similar sequences (Vita et al. 2019). The search was limited to sequences that are identical with the sequence of interest in at least 70%. It was not restricted to any host, major histocompatibility complex (MHC), or disease. Duplicate sequences have been removed from the list.
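As an illustration of a 70% identity cut-off, the sketch below computes a simple position-by-position identity for two ungapped peptides of equal length. This is an assumption made for illustration: the IEDB search may compute similarity differently, and the function names are hypothetical. The example pair is the SARS-CoV-2 peptide KKSAAEASKKPRQKRTATKA and the SARS-CoV nucleoprotein sequence listed in Table 3.

```python
def percent_identity(query: str, hit: str) -> float:
    """Percentage of positions with identical residues (equal-length, ungapped peptides)."""
    if len(query) != len(hit):
        raise ValueError("this sketch handles equal-length, ungapped sequences only")
    matches = sum(q == h for q, h in zip(query, hit))
    return 100.0 * matches / len(query)

def passes_cutoff(query: str, hit: str, cutoff: float = 70.0) -> bool:
    """True when the hit is identical to the query in at least `cutoff` percent of positions."""
    return percent_identity(query, hit) >= cutoff

sars_cov_2 = "KKSAAEASKKPRQKRTATKA"  # SARS-CoV-2 N-protein peptide
sars_cov = "KKSAAEASKKPRQKRTATKS"    # SARS-CoV nucleoprotein sequence from Table 3

identity = percent_identity(sars_cov_2, sars_cov)
print(identity, passes_cutoff(sars_cov_2, sars_cov))
```

The two sequences differ only at the last residue, so the pair passes the 70% cut-off comfortably.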
Statistical analysis of the data
Statistical analysis was performed in GraphPad Prism Software 9.3 (GraphPad Software Inc.). Ordinary one-way ANOVA with a post-hoc Tukey’s multiple comparison test was used for nucleocapsid ELISA. Ordinary one-way ANOVA with a Dunnett’s multiple comparisons test was used to analyse the data for epitope mapping. Data are shown as mean ± standard deviation. Significant differences are marked as (α = 0.05): *P < .05; **P < .01; ***P < .001; ****P < .0001. Thresholds were calculated as previously published (Pacyga et al. 2020) by taking the mean of all results within the given group.
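The asterisk notation can be restated as a small Python helper. This is only a sketch of the marking convention given above; the function name is hypothetical, and the P values themselves would come from the statistical software.

```python
def significance_marker(p: float) -> str:
    """Map a P value to the asterisk notation used in the figures and tables (alpha = 0.05)."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # not significant

for p in (0.2, 0.03, 0.004, 0.0005, 0.00002):
    print(p, significance_marker(p))
```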
Results
Some Polish patients have pre-pandemic immunity against SARS-CoV-2 nucleocapsid
The western blot analysis of the immunoreactivity of pre-pandemic human individual sera against SARS-CoV-2 lysate, probed with secondary anti-IgA antibodies, revealed one protein band (Fig. 1A). The molecular mass of the recognized protein was 46–48 kDa, which corresponds to the molecular mass of N-protein. To confirm the identity of the protein band, we performed mass spectrometric identification of the 46–48 kDa band cut out of the SDS-PAGE gel (Fig. S1). We did not see any immunoreactivity when we examined the IgG antibody response of the same group of sera.

Figure 1. Recognition of SARS-CoV-2 N-protein by pre-immune sera. (A) The result of western blot performed on SARS-CoV-2 lysate and pre-pandemic individual sera. (B) The result of ELISA performed on recombinant SARS-CoV-2 N-protein to measure the level of specific IgG antibodies in pooled sera (means ± standard deviations). (C) The result of ELISA performed on recombinant SARS-CoV-2 N-protein to measure the level of specific IgA antibodies in pooled sera (means ± standard deviations). Ordinary one-way ANOVA with a post-hoc Tukey’s multiple comparison test was used for data analysis; significant differences are marked as (α = 0.05): *P < .05; **P < .01; ***P < .001; ****P < .0001.
To quantify the immunoreactivity of pre-pandemic sera, we performed ELISA on a recombinant nucleocapsid and compared it to the response of acute COVID-19 (n = 30) and convalescent COVID-19 (n = 27) patient pooled sera (Fig. 1B and C). The level of specific anti-nucleocapsid IgG and IgA antibodies is rather low in the pre-pandemic group when compared to the acute and convalescent groups. However, there are some higher readouts in the IgA pre-pandemic group (n = 16). It seems that the pre-pandemic immunity is not widespread but rather limited. Nevertheless, due to the unexpected presence of antibodies against nucleocapsid in some of the pre-pandemic sera, we decided to study N-protein more closely.
Humoral response against SARS-CoV-2 nucleocapsid is different for convalescents
The analysis of immunoreactivity of acute and convalescent sera showed variability in the immune response against SARS-CoV-2 lysate (Fig. 2). Generally, all tested patients showed a strong IgG and IgA anti-N-protein antibody response during the acute phase of infection as well as 3 weeks after the last symptoms of infection had ceased. However, the immunoreactive profile changed during recovery and varied between patients. The immunoreactive profiles of the sera of patients 2 and 3 are enriched in an additional protein band (∼38 kDa) in the convalescent state, for both specific IgGs and IgAs. The immunoreactive profile of the sera of patient 4 seems to be enriched in multiple bands in the convalescent state, for both IgG and IgA antibodies.

Figure 2. Immunoreactivity profiles of acute and convalescent COVID-19 patients. Sera were obtained from each patient during admission to the hospital and three weeks after the last symptoms of infection had ceased. All blots were performed under the same conditions.
Epitope mapping of SARS-CoV-2 nucleocapsid reveals different epitope patterns for acute, convalescent and pre-pandemic groups of sera
The in silico prediction of SARS-CoV-2 N-protein epitopes revealed 20 sequences (Table S1). Most of them were B-cell epitopes, and their immunoreactivity was verified using sera of all tested groups of patients.
Empirical epitope mapping using patients’ sera revealed that overall reactivity is higher in the convalescent SARS-CoV-2 patient group compared to the other groups (Fig. 3). However, the profiles of immunoreactivity are largely the same among the tested groups, i.e. the same sequences are the most immunoreactive with the tested sera. The overall level of reactivity of the pre-pandemic group is much lower than for the acute and convalescent COVID-19 groups.

Figure 3. Epitope mapping of SARS-CoV-2 N-protein: pattern of sequences recognized by IgG antibodies. The assay was performed five times; data are shown as means ± standard deviation. Indicated thresholds for each tested group (dotted lines on the graph) were calculated by taking the mean of all results within the given group.
We calculated thresholds of immunoreactivity for each of the tested groups and performed statistical analysis to indicate the N-protein epitopes recognized by IgG antibodies (Table 1). Only three IgG epitopes were identified in the pre-pandemic sera group (peptide numbers are given; asterisks indicate statistical significance): 14 KKSAAEASKKPRQKRTATKA (****), 9 SDSTGSNQNGERSGARSKQR (**), and 3 IGYYRRATRRIRGGDGKMKD (*). Only sequences 3 and 9 were indicated as B-cell epitopes in the in silico prediction. Sequence 14 was indicated as an IFNγ epitope. Particularly interesting is that most of the epitopes identified with the sera from the acute COVID-19 group are not identified with the sera of COVID-19 convalescents.
Peptide number | Peptide sequence | COVID-19 acute | COVID-19 convalescents | Pre-pandemic |
---|---|---|---|---|
1 | MSDNGPQNQRNAPRITFGGP | *** | * | ns |
2 | RPQGLPNNTASWFTALTQHG | ns | ns | ns |
3 | IGYYRRATRRIRGGDGKMKD | ** | ns | * |
4 | LNTPKDHIGTRNPANNAAIV | ns | ns | ns |
5 | GSQASSRSSSRSRNSSRNST | ** | ns | ns |
6 | NQLESKMSGKGQQQQGQTVT | ns | ns | ns |
7 | RRGPEQTQGNFGDQELIRQG | ns | ns | ns |
8 | IAQFAPSASAFFGMSRIGME | ns | ns | ns |
9 | SDSTGSNQNGERSGARSKQR | **** | ns | ** |
10 | LKFPRGQGVPINTNSSPDDQ | ns | ns | ns |
11 | GTGPEAGLPYGANKDGIIWV | ns | ns | ns |
12 | LQLPQGTTLPKGFYAEGSRG | ns | ns | ns |
13 | TPGSSRGTSPARMAGNGGDA | ns | ns | ns |
14 | KKSAAEASKKPRQKRTATKA | **** | * | **** |
15 | GDQELIRQGTDYKHWPQIAQ | ns | ns | ns |
16 | DPNFKDQVILLNKHIDAYKT | ns | ns | ns |
17 | FPPTEPKKDKKKKADETQAL | * | ns | ns |
18 | LDDFSKQLQQSMSSADSTQA | ns | ns | ns |
19 | KADETQALPQRQKKQQTVTL | ** | *** | ns |
20 | GDAALALLLLDRLNQLESKM | ns | ns | ns |
Immunoreactivity of each sequence in the group was tested against the threshold calculated for the whole group using one-way ANOVA and the Dunnett’s multiple comparison test (α = 0.05): *P < .05; **P < .01; ***P < .001; ****P < .0001.
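The group comparison in Table 1 can be restated with set operations. The Python sketch below transcribes the IgG significance markers from Table 1 (peptides not listed were "ns" in all three groups); the variable names are illustrative only.

```python
# IgG significance markers transcribed from Table 1:
# peptide number -> (acute, convalescent, pre-pandemic).
# Peptides not listed here were "ns" in all three groups.
GROUPS = ("acute", "convalescent", "pre_pandemic")
TABLE1_IGG = {
    1: ("***", "*", "ns"),
    3: ("**", "ns", "*"),
    5: ("**", "ns", "ns"),
    9: ("****", "ns", "**"),
    14: ("****", "*", "****"),
    17: ("*", "ns", "ns"),
    19: ("**", "***", "ns"),
}

# Set of significantly recognized peptides for each serum group.
recognized = {
    group: {p for p, marks in TABLE1_IGG.items() if marks[i] != "ns"}
    for i, group in enumerate(GROUPS)
}

shared_by_all = set.intersection(*recognized.values())
infection_specific = (recognized["acute"] & recognized["convalescent"]) - recognized["pre_pandemic"]
acute_only = recognized["acute"] - recognized["convalescent"] - recognized["pre_pandemic"]

print("all three groups:", sorted(shared_by_all))
print("acute and convalescent only:", sorted(infection_specific))
print("acute only:", sorted(acute_only))
```

For IgG, peptide 14 is the only sequence recognized by all three groups, whereas peptides 1 and 19 are recognized by acute and convalescent sera but not by pre-pandemic sera.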
In the case of IgA antibodies, overall profiles of immunoreactivity are less diverse than for IgG antibodies (Fig. 4). Fewer epitopes are recognized by IgA than by IgG antibodies present in the acute COVID-19 group sera. We calculated immunoreactivity thresholds for each of the tested groups and performed statistical analysis to indicate the N-protein epitopes recognized by IgA antibodies (Table 2). Again, the same three epitopes were identified in the pre-pandemic sera (peptide numbers are given; asterisks indicate statistical significance): 14 KKSAAEASKKPRQKRTATKA (****), 9 SDSTGSNQNGERSGARSKQR (**), and 3 IGYYRRATRRIRGGDGKMKD (***).

Figure 4. Epitope mapping of SARS-CoV-2 N-protein: pattern of sequences recognized by IgA antibodies. The assay was performed five times; data are shown as means ± standard deviation. Indicated thresholds for each tested group (dotted lines on the graph) were calculated by taking the mean of all results within the given group.
Peptide number | Peptide sequence | COVID-19 acute | COVID-19 convalescents | Pre-pandemic |
---|---|---|---|---|
1 | MSDNGPQNQRNAPRITFGGP | ns | ns | ns |
2 | RPQGLPNNTASWFTALTQHG | ns | ns | ns |
3 | IGYYRRATRRIRGGDGKMKD | ns | ** | *** |
4 | LNTPKDHIGTRNPANNAAIV | ns | ns | ns |
5 | GSQASSRSSSRSRNSSRNST | ns | ns | ns |
6 | NQLESKMSGKGQQQQGQTVT | ns | ns | ns |
7 | RRGPEQTQGNFGDQELIRQG | ns | ns | ns |
8 | IAQFAPSASAFFGMSRIGME | ns | ns | ns |
9 | SDSTGSNQNGERSGARSKQR | **** | ns | ** |
10 | LKFPRGQGVPINTNSSPDDQ | ns | ns | ns |
11 | GTGPEAGLPYGANKDGIIWV | ns | ns | ns |
12 | LQLPQGTTLPKGFYAEGSRG | ns | ns | ns |
13 | TPGSSRGTSPARMAGNGGDA | ns | ns | ns |
14 | KKSAAEASKKPRQKRTATKA | *** | *** | **** |
15 | GDQELIRQGTDYKHWPQIAQ | ns | ns | ns |
16 | DPNFKDQVILLNKHIDAYKT | ns | ns | ns |
17 | FPPTEPKKDKKKKADETQAL | ns | ns | ns |
18 | LDDFSKQLQQSMSSADSTQA | ns | ns | ns |
19 | KADETQALPQRQKKQQTVTL | ns | ** | ns |
20 | GDAALALLLLDRLNQLESKM | ns | ns | ns |
Immunoreactivity of each sequence in the group was tested against the threshold calculated for the whole group using one-way ANOVA and the Dunnett’s multiple comparison test (α = 0.05): *P < .05; **P < .01; ***P < .001; ****P < .0001.
To elucidate why sequences 3, 9, and 14 are recognized by the pre-pandemic sera, we performed a BLAST analysis of these sequences (Table 3). BLAST analysis showed that the KKSAAEASKKPRQKRTATKA sequence is widespread among other SARS-like coronaviruses as well as bat coronaviruses such as Rhinolophus affinis coronavirus or BtRs BetaCoV. Similar patterns are also found in TCP domain-containing proteins of various plants such as melon, soybean, clementine, sesame, Colorado blue columbine, or silver poplar. The SDSTGSNQNGERSGARSKQR sequence was shown to be similar among many coronaviruses. The sequence SNQNGXRSGARS was found in an AA_TRNA_LIGASE_II domain-containing protein in citrus. The IGYYRRATRRIRGGDGKMKD sequence was highly conserved among other coronaviruses. The shorter sequence RRATRRIRG was found in the ant multiple coagulation factor deficiency protein 2-like protein. The TRRIRG pattern was also found in a death domain-containing protein from Streptomyces sp. CB01883 and in glycogen synthase from Planctomycetes bacterium PlA133.
Protein | Host | Shared amino acids |
---|---|---|
KKSAAEASKKPRQKRTATKA | ||
Nucleoprotein | SARS-CoV, Bat coronavirus Rp3/2004, Bat coronavirus HKU3, Bat coronavirus 279/2005, SARS coronavirus PUMC02, SARS coronavirus PUMC03, Bat SARS-like coronavirus YNLF_31C, Bat SARS-like coronavirus YNLF_34C, Bat coronavirus Rp/Shaanxi2011 | KKSAAEASKKPRQKRTATKS |
Nucleoprotein | Bat SARS CoV Rf1/2004 | KKSTSEASKKPRKRTATKQ |
Nucleoprotein | Betacoronavirus Erinaceus/VMC/DEU/2012 | KKDAADAKKKMRHKRVATKA |
Nucleoprotein | Bat Hp-betacoronavirus/Zhejiang2013 | KKTAAEIAAKPRQKRVAHKG |
NHEJ DNA polymerase | Flavisolibacter ginsenosidimutans | KKTAA----KPRQKRSATKA |
TCP domain-containing protein | Quercus lobata, Ziziphus jujuba, Tripterygium wilfordii | XXXAAEASKKPPPKRTSTKD |
TCP domain-containing protein | Handroanthus impetiginosus, Cucumis melo | TIAAAEPSKKPPPKRTSTKD |
TCP domain-containing protein | Salvia splendens | IITAPEASKKPPPKRTSTKD |
TCP domain-containing protein | Malus baccata, Pyrus ussuriensis x Pyrus communis | QASSAEPSKKPPPKRTSTKD |
TCP domain-containing protein | Aquilegia coerulea, Thalictrum thalictroides | SLQIAETSKKPPPKRTSTKD |
TCP domain-containing protein | Populus alba, Populus trichocarpa, Prunus avium, Prunus dulcis, Prunus armeniaca, Prunus persica | XXXXAEPSKKPPPKRTSTKD |
TCP domain-containing protein | Citrus clementina | ESSAAAAAKKPVPKRTSTKD |
Transcription factor TCP14 isoform A | Glycine max, Glycine soja | SAAEASAEAANSKKPPPKRTSTK |
transcription factor TCP15 | Herrania umbratica | AKTASEAPKKPPPKRTSTKD |
transcription factor TCP15-like | Durio zibethinus | SKATSEASKKPPPKRTPTKD |
Transcription factor TCP14 | Cucumis melo var. makuwa, Vitis vinifera, Sesamum indicum, Rosa chinensis, Camellia sinensis var. sinensis | XXXAAEXSKKPXXKRTXTKX |
DNA polymerase I | Spirosoma taeanense | SPAAAEPAKKPRAKRTAVKA |
Bifunctional lysine-specific demethylase and histidyl-hydroxylase | Phytophthora rubi | ANDSAAEPSKKQKKVATKAS |
RRM domain-containing protein | Thamnocephalis sphaerospora | VKSAAEATQKPRQKFYIPPF |
SDSTGSNQNGERSGARSKQR | ||
Nucleoprotein | Bat coronavirus 279/2005, Bat SARS CoV Rf1/2004, Bat coronavirus Rp/Shaanxi2011 | SDSTDNNQDGGRSGARPKQR |
Nucleoprotein | SARS-CoV, Bat SARS-like coronavirus YNLF_31C, Bat SARS-like coronavirus YNLF_34C, SARS coronavirus PUMC02, SARS coronavirus PUMC03 | TDSTDNNQNGGRNGARPKQR |
Nucleoprotein | Bat coronavirus Rp3/2004 | TDSTDNNQDGGRNGARPKQR |
Nucleoprotein | Bat coronavirus HKU3 | ADSNDNNQDGGRSGARPKQR |
AA_TRNA_LIGASE_II domain-containing protein | Citrus sinensis, Citrus unshiu | ALSSASNQNGGRSGARSLSP |
AcrR family transcriptional regulator | Microlunatus parietis | LLLVALTQNGERAGARVRQR |
IGYYRRATRRIRGGDGKMKD | ||
Nucleoprotein | SARS-CoV, Bat coronavirus Rp3/2004, Bat coronavirus HKU3, SARS coronavirus PUMC02, SARS coronavirus PUMC03, Bat SARS CoV Rf1/2004, Bat SARS-like coronavirus YNLF_31C, Bat SARS-like coronavirus YNLF_34C, Bat coronavirus Rp/Shaanxi2011, Bat coronavirus 279/2005 | IGYYRRATRRVRGGDGKMKE |
Multiple coagulation factor deficiency protein 2-like protein | Temnothorax longispinosus | IDYERRATRRIRGSTLTRDA |
Death domain-containing protein | Streptomyces sp. CB01883 | QRQYRRETRRIRGRHATAA |
Glycogen synthase | Planctomycetes bacterium PlA133 | LKAYRRVTRRIRGR |
Uncharacterized protein | Handroanthus impetiginosus, Dorcoceras hygrometricum, Sesamum indicum | WDPYYYRR–RRVREGDGGMNF |
Uncharacterized protein | Phragmitibacter flavus | VGYYRRGVRRI-RSGDFYTSV |
Identical amino acids are indicated in bold. X means variable amino acids.
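Because the bold highlighting of identical residues is not preserved in plain text, the shared positions can be recomputed from the rows themselves. The sketch below is illustrative only and is not part of the authors' analysis. It assumes that each 20-residue row is an ungapped, position-by-position alignment to the query peptide; the helper names `identity_mask` and `percent_identity` are hypothetical.

```python
def identity_mask(query: str, hit: str) -> str:
    """Keep residues identical to the query at the same position, mask the rest with '.'."""
    if len(query) != len(hit):
        raise ValueError("sequences must have the same length for a positional comparison")
    return "".join(h if q == h else "." for q, h in zip(query, hit))


def percent_identity(query: str, hit: str) -> float:
    """Percentage of positions at which query and hit carry the same residue."""
    if len(query) != len(hit):
        raise ValueError("sequences must have the same length for a positional comparison")
    matches = sum(q == h for q, h in zip(query, hit))
    return 100.0 * matches / len(query)


# Query peptides and the SARS-CoV / bat coronavirus nucleoprotein rows from Table 3.
pairs = [
    ("KKSAAEASKKPRQKRTATKA", "KKSAAEASKKPRQKRTATKS"),
    ("SDSTGSNQNGERSGARSKQR", "SDSTDNNQDGGRSGARPKQR"),
    ("IGYYRRATRRIRGGDGKMKD", "IGYYRRATRRVRGGDGKMKE"),
]

for query, hit in pairs:
    print(query, identity_mask(query, hit), f"{percent_identity(query, hit):.0f}%")
```

Rows that contain gap characters or have a different length than the query would need a proper alignment step first, which this sketch deliberately does not attempt.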
We searched the IEDB database for known epitopes similar to KKSAAEASKKPRQKRTATKA, SDSTGSNQNGERSGARSKQR, and IGYYRRATRRIRGGDGKMKD (Table 4). Again, sequences related to KKSAAEASKKPRQKRTATKA were the most common. Partial epitopes were found in human proteins such as nerve injury-induced protein 1 (ninjurin-1), mediator of DNA damage checkpoint protein 1 (mdc1), surfeit locus protein 1 (surf-1), and phosphofurin acidic cluster sorting protein 1 (pacs1). The SDSTGS sequence is shared between the SARS-CoV-2 N protein and a known epitope from trans-Golgi network integral membrane protein (tgoln2). Two fragments of IGYYRRATRRIRGGDGKMKD, namely GYYRRA and YYRRAT, were found in known epitopes of human serine/threonine-protein phosphatase 5 (ppp5c) and the chaperone DnaJ homolog subfamily C member 3 (dnajc3), respectively.
Protein | Host | Shared amino acids
---|---|---|
KKSAAEASKKPRQKRTATKA | ||
Nucleoprotein | Human coronavirus HKU1 | ILMKPRQKRTPN |
Ninjurin-1 | Homo sapiens | VNHYASKKSAAESM |
Mediator of DNA damage checkpoint protein 1 | Homo sapiens | AVLALGGSLAGSAAEASHLVTDR |
Cytoadherence linked asexual protein 3.1 | Plasmodium falciparum | HGLAAEASKYLFFYFFTNLYLDAYKSFPGG |
Surfeit locus protein 1 | Homo sapiens | GSSAAEASATKAEDDSF |
Phosphofurin acidic cluster sorting protein 1 | Homo sapiens | KVPTIFLSKKPREKE |
Nucleoprotein | Betacoronavirus 1 | LNKPRQKRSPNKQCT |
SDSTGSNQNGERSGARSKQR | ||
Trans-Golgi network integral membrane protein | Homo sapiens | SDSTGSEKDDLYPN |
IGYYRRATRRIRGGDGKMKD | ||
Serine/threonine-protein phosphatase 5 | Homo sapiens | IKGYYRRAASNMALGK |
DnaJ homolog subfamily C member 3 | Homo sapiens | IAYYRRATVFLAMGKS |
Shared amino acids are indicated in bold.
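The shared stretches in Table 4 can be recovered from the sequences alone by finding the longest run of residues common to a query peptide and a known epitope. The following is a minimal sketch of that check, not the IEDB search used in the study. The function name `longest_common_substring` and the dictionary layout are assumptions made for illustration; the sequences are copied from Table 4.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous stretch shared by a and b (first one found on ties)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Query peptide -> known epitopes listed in Table 4 (protein name, epitope sequence).
table4 = {
    "KKSAAEASKKPRQKRTATKA": [
        ("Nucleoprotein, Human coronavirus HKU1", "ILMKPRQKRTPN"),
        ("Ninjurin-1", "VNHYASKKSAAESM"),
        ("Surfeit locus protein 1", "GSSAAEASATKAEDDSF"),
        ("Phosphofurin acidic cluster sorting protein 1", "KVPTIFLSKKPREKE"),
    ],
    "SDSTGSNQNGERSGARSKQR": [
        ("Trans-Golgi network integral membrane protein", "SDSTGSEKDDLYPN"),
    ],
    "IGYYRRATRRIRGGDGKMKD": [
        ("Serine/threonine-protein phosphatase 5", "IKGYYRRAASNMALGK"),
        ("DnaJ homolog subfamily C member 3", "IAYYRRATVFLAMGKS"),
    ],
}

for query, hits in table4.items():
    for protein, epitope in hits:
        print(f"{query}  {protein}: {longest_common_substring(query, epitope)}")
```

A contiguous match is a deliberately strict criterion. It reports only the core shared stretch and ignores conservative substitutions, so it should be read as a quick consistency check rather than a measure of cross-reactivity.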
Discussion
There are multiple possible explanations for the pre-pandemic immune response against SARS-CoV-2. Some authors propose that antibodies acquired during a common cold HCoV infection cross-react with SARS-CoV-2 antigens. Others suggest that the pre-existing immunity can be induced by other antigens, such as those found in vaccines (Reche 2020) or commensal bacteria (Jia et al. 2022). Understanding these mechanisms is essential for designing safe and effective therapies and vaccines. In this manuscript, we explored the cross-reactivity issue further and found that there might be additional sources of pre-pandemic immunity, such as antibody reactivity with self-proteins or other proteins present in our environment.
In the course of our studies, we detected SARS-CoV-2 nucleocapsid cross-reactive IgA antibodies in a pre-pandemic serum isolated from Polish patients (Fig. 1A). At the same time, there were no specific IgG antibodies, which could point to a pre-existing mucosal immunity based solely on IgA antibodies. This finding is in line with previously published data. Cross-reactive IgA antibodies were detected in the pre-pandemic milk of African and United States mothers (Egwang et al. 2021). Egwang et al. measured the levels of human milk antibodies against the SARS-CoV-2 and HCoV spike proteins by ELISA. US mothers had a higher prevalence of milk IgA antibodies against alphacoronaviruses, whereas African mothers had a higher prevalence of antibodies against betacoronaviruses, which aligns with the geographic distribution of these viruses. In our study, we observed cross-reactive IgA antibodies against nucleocapsid without specific anti-spike antibodies (data not shown). This could be due to the methodology we used. Namely, we used western blotting, which detects linear epitopes, rather than ELISA, which detects both linear and conformational epitopes. Conformational epitopes are typical for the spike (S) protein, since the reduced protein is significantly less immunoreactive than the non-reduced one. In the case of the N protein, both protein variants are immunoreactive at comparable levels (Maghsood et al. 2022).
To further study the N-protein immunoreactivity, we performed ELISA on a larger group of pre-pandemic sera to also assess the antibody response against the non-linear epitopes of the SARS-CoV-2 nucleocapsid. The overall immunoreactivity of pre-pandemic sera is significantly lower than that of acute and convalescent COVID-19 samples. However, in the case of IgA antibodies, some individual readouts were as high as those in the acute COVID-19 group (Fig. 1B). Again, this could point to some pre-existing immunity to SARS-CoV-2, confirmed for the first time in Polish patients.
The pre-existing SARS-CoV-2 cross-reactive antibodies can have an impact on COVID-19 vaccination. It was shown that the S2 protein was the main target of pre-pandemic immunity in healthy humans and specific pathogen-free mice. Jia et al. have shown that the generation of S2 protein cross-reactive antibodies was associated with commensal gut bacteria and influenced the level of Receptor Binding Domain (RBD)-specific binding antibody titers after SARS-CoV-2 vaccination in humans. In turn, SARS-CoV-2 S DNA vaccination altered the gut microbiota composition of mice (Jia et al. 2022).
Another interesting observation is that the antibody response against nucleocapsid is higher in the convalescent group than in the acute group, especially for IgG antibodies (Fig. 1B). A similar observation was made for the S protein by Isho et al. (2020). They showed that anti-S SARS-CoV-2 IgG levels peak between 16 and 30 days after the onset of symptoms and are sustained for over 115 days. Anti-S IgA levels were much less sustained, since they declined significantly after reaching their peak between days 16 and 30. The levels of anti-N antibodies closely resembled those measured for the S protein (Isho et al. 2020). However, when we compared the acute and convalescent antibody responses using western blots, it was clearly visible that, in the case of IgG antibodies, the response was more intense in the acute group (Fig. 2).
Epitope mapping using a combined approach, in which epitopes were first identified in silico and then verified with sera from different patient groups, focusing on IgG antibodies, revealed high variability in immunoreactivity profiles (Fig. 3). Immunoreactivity levels of IgGs in pre-pandemic sera with the synthesized peptides are much lower than in the other tested groups. However, in the case of IgA antibodies, the levels of immunoreactivity are comparable and the overall profiles are similar (Fig. 4). We were able to identify seven epitopes recognized by IgG antibodies from acute COVID-19 sera, three by convalescent sera, and three by pre-pandemic sera. One epitope, KKSAAEASKKPRQKRTATKA, was recognized by IgG antibodies from all three tested groups. The KKSAAEASKKPRQKRTATKA sequence is widespread among various coronaviruses (Table 3), and its shorter versions are very common in plant proteins such as TCP domain-containing proteins. The TCP domain is highly conserved and found in plant transcription factors regulating multiple growth-related processes (Manassero et al. 2013). Some parts of the KKSAAEASKKPRQKRTATKA sequence were also found in known epitopes, e.g. in the human protein ninjurin1. Ninjurin1 is a transmembrane protein expressed mostly in endothelial and myeloid cells and is induced under inflammatory conditions. Ninjurin1 plays a role in systemic inflammation by mediating leukocyte migration and modulating Toll-like receptor 4-dependent expression of inflammatory mediators (Jannewein et al. 2015), and it is also involved in programmed and necrotic cell death (Weerasinghe-Mudiyanselage et al. 2021). In severe COVID-19 patients, it was already shown that ninjurin1 is expressed in excessive amounts by macrophages, which may intensify the systemic inflammation (Xu et al. 2022). In this case, it is difficult to determine the cause-and-effect relationship.
The SAAEAS motif has already been recognized in other studies as shared between SARS-CoV-2 and surfeit locus protein 1, which is a component of a complex that is essential for the generation of respiratory rhythm in humans (Lucchese and Flöel 2020). Lucchese and Flöel hypothesized that immunological targeting of SURF1 and two other proteins might contribute to brainstem-related respiratory failure in COVID-19 patients. Here, we have shown that patients have antibodies targeting the KKSAAEASKKPRQKRTATKA sequence, which harbours the SAAEAS motif. Another protein with an epitope sequence similar to KKSAAEASKKPRQKRTATKA is phosphofurin acidic cluster sorting protein 1 (PACS-1), a membrane traffic regulator (Youker et al. 2009). PACS-1 was shown to be used by viruses for immune evasion, multiplication, and pathogenesis (Thomas et al. 2017).
There are other studies concerning N-protein B-cell epitope mapping. Many of them were performed in silico without subsequent empirical verification (Dai et al. 2020, Rakib et al. 2020, Tilocca et al. 2020, Kumar et al. 2021). Others used peptide synthesis and ELISA for epitope validation. Amrun et al. used peptide-ELISA to identify the sequence NNAAIVLQLPQGTTLPKG as an immunodominant IgG linear B-cell epitope that was strongly associated with disease severity (Amrun et al. 2020). This epitope is located in the RNA-binding terminal domain of the N protein. In our analysis, the same sequence appeared but was split between two peptides, (4) LNTPKDHIGTRNPANNAAIV and (12) LQLPQGTTLPKGFYAEGSRG, and neither of these was identified as an epitope, probably because of this split. Another approach to epitope mapping involves monoclonal antibody production. Tian et al. immunized mice with N protein expressed in insect cells and obtained six anti-N monoclonal antibodies. They identified DFSKQLQQ as a novel, conserved B-cell epitope (Tian et al. 2022). The DFSKQLQQ sequence is a part of peptide (18) LDDFSKQLQQSMSSADSTQA, which we tested, but it was not identified as an epitope using the serum-mapping approach. Some studies also deal with the cross-recognition of the SARS-CoV-2 N protein. Cross-reactivity PIWAS analysis performed by Haynes et al. revealed two epitopes (within sequence 12 LQLPQGTTLPKGFYAEGSRG and sequence 18 LDDFSKQLQQSMSSADSTQA) (Haynes et al. 2021); however, neither of them was identified as an epitope in our study. The differences may be due to variability typical of the studied populations, which should be taken into account when preparing and testing the effectiveness of vaccines.
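The relationship between the published epitopes and the peptides tested here can be checked directly from the sequences quoted above. The sketch below is an illustration, not part of the original analysis. It tests whether a published epitope lies entirely within one tested peptide, or is divided between the end of one peptide and the start of another. The helper names `containing_peptides` and `split_across` are hypothetical, and only the three peptides quoted in this paragraph are included.

```python
from typing import Dict, List, Optional, Tuple

# Peptide numbers and sequences as quoted in the text above.
peptides: Dict[int, str] = {
    4: "LNTPKDHIGTRNPANNAAIV",
    12: "LQLPQGTTLPKGFYAEGSRG",
    18: "LDDFSKQLQQSMSSADSTQA",
}


def containing_peptides(epitope: str, panel: Dict[int, str]) -> List[int]:
    """Numbers of the peptides that contain the whole epitope."""
    return [number for number, seq in panel.items() if epitope in seq]


def split_across(epitope: str, left: str, right: str) -> Optional[Tuple[int, int]]:
    """If the epitope equals a suffix of `left` followed by a prefix of `right`,
    return how many residues fall in each peptide; otherwise return None."""
    for k in range(1, len(epitope)):
        if left.endswith(epitope[:k]) and right.startswith(epitope[k:]):
            return k, len(epitope) - k
    return None


amrun = "NNAAIVLQLPQGTTLPKG"  # Amrun et al. 2020
tian = "DFSKQLQQ"             # Tian et al. 2022

print(containing_peptides(amrun, peptides))         # no single peptide holds it
print(split_across(amrun, peptides[4], peptides[12]))
print(containing_peptides(tian, peptides))
```

Under these inputs, the Amrun et al. epitope is not contained in any single peptide: its first 6 residues close peptide 4 and its remaining 12 open peptide 12, whereas DFSKQLQQ lies entirely within peptide 18.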
Another result of our research is the set of N-protein epitopes identified in the Polish group of patients who had COVID-19. The sequences MSDNGPQNQRNAPRITFGGP and KADETQALPQRQKKQQTVTL are recognized by serum antibodies from patients in the acute phase of infection and, more importantly, from convalescents, indicating that these antibodies are preserved in the bloodstream after infection. We did not identify these sequences as epitopes in the healthy group, which suggests that there is no pre-pandemic immunity against them and that they can be considered suitable vaccine antigens specific for SARS-CoV-2. The use of specific epitopes that have been analysed beforehand for possible cross-reactions is much safer than the use of a complete protein. This approach reduces the risk of inducing an autoimmune response via the molecular mimicry mechanism (Segal and Shoenfeld 2018).
A limitation of our study is that sera were collected from hospitalized patients, so the study is limited to moderate and severe cases of COVID-19. It was previously shown that different epitope signals are observed in mild, moderate, and severe cases of COVID-19 (Haynes et al. 2021). Also, we were unable to identify exactly when the first symptoms of illness appeared. We are also aware that we used pooled sera in the epitope mapping analysis. As shown in the western blot analysis of the immunoreactivity of the patients' sera, the immunological response might differ between individual patients. Nevertheless, it is worth underlining that the differences among the tested groups were clearly observed.
To sum up, we showed the presence of antibodies in pre-pandemic serum isolated from Polish patients, which could point to pre-existing immunity to SARS-CoV-2. Moreover, we showed that the immunoreactivity profile of the response to the N protein changes during the course of the infection. The epitope mapping analysis revealed the KKSAAEASKKPRQKRTATKA epitope, which is recognized by antibodies in sera from all studied groups. This peptide shares sequences with other proteins, namely those of various coronaviruses and of plant or human origin, indicating that pre-existing immunity against SARS-CoV-2 might be due to cross-reactivity of antibodies against non-viral proteins. This finding prompts us to reconsider the use of inactivated vaccines, which contain the entire nucleocapsid and, with it, non-specific sequences common to many organisms, including humans. Most importantly, our study indicated peptide sequences that are unique and recognized by serum antibodies from convalescents, making them interesting candidates for potentially safer vaccine antigens specific for SARS-CoV-2.
Author contributions
Agnieszka Razim (Investigation, Methodology, Writing – original draft, Writing – review & editing), Katarzyna Pacyga-Prus (Investigation), Wioletta Kazana-Płuszka (Investigation), Agnieszka Zabłocka (Investigation), Józefa Macała (Investigation), Hubert Ciepłucha (Investigation), Andrzej Gamian (Funding acquisition, Resources, Writing – review & editing), and Sabina Górska (Conceptualization, Methodology, Project administration, Supervision, Writing – review & editing)
Conflict of interest
None declared.
Funding
This work was supported by the National Centre for Research and Development under grant Szpitale jednoimienne 28/2020 (A.G.); Foundation for Polish Science (FNP) under grant START2022 (A.R.).