Article Type : Research Article
Authors : Chakraborty AK
Keywords : Nsp1 protein; Genome deletion; Deletion boundary oligos; ORF1ab protein; ORF7a/b deletions; SARS-CoV-2
The genome of the Wuhan corona
virus acquired mutations and deletions at different positions, specifically
in the deadly Alpha and Delta variants, whereas the spike protein was mostly
affected in Omicron variants. The nsp1 protein (180 AAs) is the first protein
of the ORF1ab polyprotein, which is cleaved in the host into sixteen
polypeptides (nsp1-nsp16) with diverse functions. The most common deletion was
3675SGF in the nsp6 domain, which first appeared in early 2021 in the B.1.1.7,
B.1.351 and B.1.1.28.1 variants and is now carried by most Omicron variants.
Here we investigated deletions in the nsp1 protein, which interacts with many
cellular proteins and prevents viral clearance. A three-amino-acid 141KSF
deletion in nsp1 was persistent in all Omicron BA.4 variants, while another
five-amino-acid 82GHVMV deletion was detected upstream of 141KSF in some recent
isolates. A BLAST-N search with the 82GHVMV oligo gave no 141KSF deletion
mutants, but selection with the GHVMV-KSF oligo gave mutants carrying both the
82GHVMV and 141KSF deletions together with the 3675SGF (ORF1ab), 31ERS
(N-protein), 24LPP (spike) and 26nt 3’-UTR deletions. Sequences surrounding the
82GHVMV and 141KSF deletions formed hairpin structures that were changed in the
deletion mutants, and the 3-D structure of the mutant nsp1 was also changed.
Previously, we showed frequent deletions in ORF7a and ORF7b as well as
termination codon mutations in the ORF8 gene. In summary, we postulate that
such changes might protect the host from the severe effects of those viral
moderator proteins while sustaining viral growth in the same cells. Conversely,
the absence of those small trans-acting proteins may favour the clearance of
SARS-CoV-2 by the host immune system, generating mild infections.
Corona virus infections claimed >600000 lives in
two years, and the genetic structure of the virus is known extensively thanks
to worldwide sequencing efforts [1]. SARS-CoV-2 is a large positive-stranded
RNA virus with a genome of ~30000 nucleotides and is related to MERS, SARS-CoV,
CoV 229E and other human corona viruses that have been known for a long time
[2-4]. Its structural proteins Membrane (M), Envelope (E), Nucleocapsid (N) and
Spike (S) are encoded independently in the 3’ one-third of the genome, whereas
the RNA-dependent RNA polymerase is encoded by the nsp12 domain of the ORF1ab
polyprotein, which occupies the 5’ two-thirds of the genome [5]. The ORF1ab
polyprotein is cleaved into sixteen polypeptides (nsp1-nsp16) (Figure-1). The
sixteen peptides generated from ORF1ab
are: Nsp1(1-180aa), Nsp2(181-818aa), Nsp3(819-2763aa), Nsp4(2764-3263aa),
Nsp5(3264-3569aa), Nsp6(3570-3859aa), Nsp7(3860-3942aa), Nsp8(3943-4140aa),
Nsp9(4141-4253aa), Nsp10(4254-4392aa), Nsp11(4393-4400aa), Nsp12(4401-5324aa),
Nsp13(5325-5925aa), Nsp14(5926-6452aa), Nsp15(6453-6798aa) and
Nsp16(6799-7096aa). The nsp2 protein is an RNA topoisomerase, whereas nsp3 and
nsp5 are proteases and nsp12 is the RNA-dependent RNA polymerase [6-9]. The
nsp6, nsp7, nsp8, nsp9 and nsp10 are small accessory proteins involved in the
RNA polymerase replication complex [10-12]. The nsp14 and nsp15 proteins are
nucleases that degrade RNA, nsp16 is a 2’-O Uridine methyltransferase, and
nsp13 is an RNA helicase with similarity to capping methyltransferases [13-15].
Nsp11 is a small peptide whose function is not known. The ORF3a, ORF6, ORF7a,
ORF7b, ORF8, ORF9
and ORF10 small proteins are also encoded at the 3’ end of the SARS-CoV-2
genome and have roles in regulating cellular genes [16-20]. Many drugs were
developed against the proteases and the RNA polymerase, but vaccines
(specifically the recombinant spike vaccine) were the only important remedy
that halted the spread of corona virus [21,22]. The most frequent mutation in
corona virus isolates was 3037C>T, a synonymous change that was usually
accompanied by three other mutations: 241C>T, 14408C>T (P323L in RdRp) and
23403A>G (D614G in S-protein). The Omicron corona virus (B.1.1.529) spike
mutations were: A67V (V67), T95I (I93), N211I (I206), L212V (V207), V215P
(P210), R216E (E211), G341D (D336), S373L (L368), S375P (P370), S377F (F372),
K419N (N414), N442K (K437), G448S (S443), S479N (N474), E486A (A481), Q495R
(R490), G498S (S493), Q500R (R495), Y507H (H502), T549K (K544), H657Y (Y652),
P683H (H678), N766K (K761), D798Y (Y793), N858K (K853), Q956H (H951), N971K
(K966), and L983F (F978), where the values in parentheses are the Omicron virus
positions [23-26]. Interestingly, the dominant N501Y mutation of B.1.1.7 was
also found in Omicron BA.1, BA.4 and BA.5 as well as in related variants such
as BQ.1 and BF.7. The nsp1 protein has 180 amino acids, and parts of it are
deleted in some corona virus strains [27].
Recent data suggested that the Nsp1 protein could inhibit all cellular
antiviral defence mechanisms that depend on the expression of host factors,
interferon-gamma and IL-6 [28-32]. Amino acid residues K164 and H165 of Nsp1
from both SARS-CoV and SARS-CoV-2 were found to be necessary for ribosome
interaction, as revealed by cryo-electron microscopy of various in
vitro-reconstituted Nsp1-40S and Nsp1-80S complexes.
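The nsp boundaries listed above can be expressed as a small lookup table that maps an ORF1ab polyprotein position (such as 141, 2083 or 3675) to its nsp domain and to the position within that domain. The following is a minimal Python sketch, not part of the original analysis. The names `Domain`, `NSP_DOMAINS` and `locate` are hypothetical, and the nsp10 start and nsp14 end are assumed to be contiguous with their neighbouring domains.

```python
from typing import NamedTuple, Optional, Tuple


class Domain(NamedTuple):
    name: str
    start: int  # first ORF1ab residue of the domain (1-based, inclusive)
    end: int    # last ORF1ab residue of the domain (inclusive)


# Boundaries as given in the text for the 7096 AA ORF1ab polyprotein.
NSP_DOMAINS = [
    Domain("nsp1", 1, 180),
    Domain("nsp2", 181, 818),
    Domain("nsp3", 819, 2763),
    Domain("nsp4", 2764, 3263),
    Domain("nsp5", 3264, 3569),
    Domain("nsp6", 3570, 3859),
    Domain("nsp7", 3860, 3942),
    Domain("nsp8", 3943, 4140),
    Domain("nsp9", 4141, 4253),
    Domain("nsp10", 4254, 4392),  # assumption: starts right after nsp9
    Domain("nsp11", 4393, 4400),
    Domain("nsp12", 4401, 5324),
    Domain("nsp13", 5325, 5925),
    Domain("nsp14", 5926, 6452),  # assumption: ends right before nsp15
    Domain("nsp15", 6453, 6798),
    Domain("nsp16", 6799, 7096),
]


def locate(residue: int) -> Optional[Tuple[str, int]]:
    """Return (domain name, position within the domain) or None if outside ORF1ab."""
    for d in NSP_DOMAINS:
        if d.start <= residue <= d.end:
            return d.name, residue - d.start + 1
    return None
```

For example, `locate(3675)` places the first residue of the SGF deletion at position 106 of nsp6, and `locate(141)` places the KSF deletion inside nsp1.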
The Nsp1 C-terminus binds to the mRNA tunnel, inhibiting mRNA entry and
protein synthesis and thereby blocking the retinoic acid inducible gene-I
dependent innate immune responses that would otherwise facilitate clearance of
the infection [33-36]. SARS-CoV-2 escapes direct NK cell killing through
Nsp1-mediated downregulation of ligands for NKG2D [37]. An mRNA degradation
function of the nsp1 protein has been reported [38,39]. Further, nsp1 is a
potent translational inhibitor [40,41]. The nsp1 protein also inhibits cellular
mRNA synthesis and directs viral protein synthesis [42-44]. Deletion hotspots
in the nsp1 protein are thus very interesting. We demonstrate in this article
that the 141KSF deletion in the nsp1 protein occurred mostly in Omicron BA.4
variants, whereas another deletion hotspot, which we call the 82GHVMV locus, is
located 59 amino acids (AAs) upstream of the 141KSF deletion site; deletions of
2-5 AAs were found there in some SARS-CoV-2 variants [45-47].
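The relation between the amino acid names of these deletions and their genome coordinates is simple arithmetic. The sketch below is illustrative only. It assumes that the ORF1ab start codon begins at nucleotide 266 of the reference genome NC_045512.2 and that the residue lies upstream of the ribosomal frameshift site in ORF1ab, so the formula does not apply to nsp12 and later domains. The function names are hypothetical.

```python
from typing import Tuple

# Assumption: first nucleotide of the ORF1ab start codon in NC_045512.2.
ORF1AB_START_NT = 266


def residue_to_nt(residue: int, orf_start: int = ORF1AB_START_NT) -> int:
    """Genome position of the first nucleotide of the codon for a 1-based residue."""
    if residue < 1:
        raise ValueError("residue numbering is 1-based")
    return orf_start + (residue - 1) * 3


def deletion_nt_span(first_residue: int, n_residues: int,
                     orf_start: int = ORF1AB_START_NT) -> Tuple[int, int]:
    """Inclusive nucleotide span covered by an in-frame deletion of n_residues codons."""
    start = residue_to_nt(first_residue, orf_start)
    return start, start + 3 * n_residues - 1


# Distance between the two nsp1 hotspots, in residues.
HOTSPOT_DISTANCE = 141 - 82
```

Under these assumptions `deletion_nt_span(141, 3)` gives nucleotides 686-694 for 141KSF and `deletion_nt_span(3675, 3)` gives 11288-11296 for 3675SGF. In real alignments the reported gap can shift by a few nucleotides when the flanking sequence is repetitive.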
We searched PubMed (www.ncbi.nlm.nih.gov/pubmed) for published papers on the
nsp1 protein. The SARS-CoV-2 sequences were downloaded from the SARS-CoV-2
database (NCBI, NIH, USA). We also ran NCBI BLAST searches using the BLAST-N
and BLAST-X methods to retrieve sequences [48]. Multi-alignment of proteins was
done with MultAlin software and multi-alignment of DNA with CLUSTAL-Omega
software, EMBL-EBI [49-51]. The ORF1ab mutants were obtained by a BLAST-N
search with a 60-100nt deletion boundary sequence, followed by analysis of the
sequences with 95-100% similarity [52,53]. Other ORF1ab mutants were detected
by BLAST-N and BLAST-X searches with selected deletion boundaries. Hairpin
structures of ~120-200nt sequences were predicted with OligoAnalyzer 3.1
software (Integrated DNA Technologies). The protein 3-D structures were
modelled with SWISS-MODEL software for normal vs. mutant
peptides [54-58].
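The idea of a deletion boundary oligo can be sketched in a few lines of Python: the deleted nucleotides are removed from the reference and the flanking sequence on each side is joined, so that the resulting query matches deletion mutants exactly and the wild type only with a gap. This is an illustration under stated assumptions rather than the procedure actually used. The function name `deletion_boundary_oligo`, the default flank of 30 nt and the toy reference are hypothetical.

```python
def deletion_boundary_oligo(reference: str, del_start: int,
                            del_length: int, flank: int = 30) -> str:
    """Join the sequence flanking a deletion into a single query oligo.

    del_start is the 1-based position of the first deleted nucleotide;
    flank is the number of nucleotides kept on each side of the deletion.
    """
    if del_start < 1 or del_length < 1 or flank < 1:
        raise ValueError("del_start, del_length and flank must be positive")
    i = del_start - 1          # 0-based index of the first deleted nucleotide
    j = i + del_length         # 0-based index of the first nucleotide after the deletion
    if j > len(reference):
        raise ValueError("deletion extends past the end of the reference")
    left = reference[max(0, i - flank):i]
    right = reference[j:j + flank]
    return (left + right).lower()


# Toy example: delete CCCC (positions 5-8) and keep 3 nt on each side.
toy_oligo = deletion_boundary_oligo("AAAACCCCGGGGTTTT", del_start=5,
                                    del_length=4, flank=3)
```

With a flank of 30 nt the oligo is 60 nt long, which is within the 60-100nt range stated above. The resulting string can be pasted as the query of a BLAST-N search.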
We made multi-alignments of coronavirus genomes to find
specific deletions in the ORF1ab gene and designed a few oligonucleotides at
the deletion boundaries of the 82GHVMV, 141KSF and 3675SGF
deletions of the ORF1ab protein, as shown in Table 1. The KSF deletion oligo
(5’-tgg cca tag gta cgg cgc cga tct aga ctt agg cga cga gct tgg cac tga-3’),
GHVMV deletion oligo (5’-acg ttc gga tgc tcg aac tgc acc tca tga gct ggt agc
aga act cga agg cat t-3’) and SGF deletion oligo (5’-aat tac aga aga ggt tgg
cca tag ttt gaa gct aaa aga ctg tgt tat gta tgc atc ag-3’) were very
informative for retrieving ORF1ab deletion mutants (>5000 sequences) from the
NCBI database. The GHVMV-KSF oligo gave >995 sequences with both the 82GHVMV
and 141KSF deletions in the nsp1 protein. A 63nt deletion from nt.
27695-27768 at the junction of the ORF7a gene 3’-end and the ORF7b gene 5’-end
was found in accession no. OM766944. A 26nt deletion in the 3’-UTR (nt.
29733-29759) of the SARS-CoV-2 genome (5’-gag gcc acg cgg agt acg atc gag
tg-3’) was found in different GHVMV mutants (accession numbers OP200462,
BS004962, OX271963, ON956441, OP258049, ON414598, and ON766944) (Figure 2). We
ran BLAST-N searches using the SGF-1st and SGF-2nd oligos to trap 3675SGF
deletion mutants, and 10 sequences (five 1st SGF and five 2nd SGF) were aligned
using NC_045512.2 as the standard. All sequences selected by the 1st and 2nd
SGF deletion oligos had the 3675SGF deletion (data not shown), but two
sequences (acc. nos. OK040080 and OP591969) also had the 141KSF deletion,
whereas one (acc. no. OP827777) had an 84VMV three AA deletion instead of
82GHVMV (Figure 3A). The ratio of SGF: KSF: GHVMV deletions in the ORF1ab
protein was thus estimated at approximately 10:2:1. The isolated sequences
were mostly Omicron corona virus variants with the 31ERS N-protein deletion,
except accession numbers MZ223360 and OL369199, which had 69HV and 212L
deletions but no 31ERS deletion and were designated as pre-omicron BA.1
variants. Surprisingly, the sequence OK040080 had the 31ERS deletion in
N-protein and the 141KSF deletion in ORF1ab but no 24LPP or 69HV deletions in
spike, indicating that it was neither BA.1/BA.2 nor BA.4/BA.5 but an Omicron
pre-BA.4 variant. In contrast, the sequence OP591969 had the 141KSF deletion in
ORF1ab and 24LPP plus 69HV deletions in spike and was an Omicron BA.4 variant
(Figure 3B).
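The deletion names used throughout (for example 141KSF, meaning the first deleted residue position followed by the deleted residues) can be read directly from a pairwise protein alignment against the reference. The following is a minimal Python sketch, assuming two gapped alignment strings of equal length with `-` as the gap character. The function name `call_deletions` and the toy sequences are hypothetical and are not real nsp1 sequences.

```python
from typing import List


def call_deletions(ref_aln: str, qry_aln: str) -> List[str]:
    """Name each run of query gaps as <first reference position><deleted residues>."""
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")
    calls: List[str] = []
    ref_pos = 0            # 1-based position in the ungapped reference
    run_start = None       # reference position where the current deletion began
    run_residues: List[str] = []
    for r, q in zip(ref_aln, qry_aln):
        if r != "-":
            ref_pos += 1
        if r != "-" and q == "-":
            # Reference residue missing from the query: extend the current run.
            if run_start is None:
                run_start = ref_pos
            run_residues.append(r)
        elif run_start is not None:
            # Any other column closes the current deletion run.
            calls.append(f"{run_start}{''.join(run_residues)}")
            run_start, run_residues = None, []
    if run_start is not None:
        calls.append(f"{run_start}{''.join(run_residues)}")
    return calls


# Toy alignment: reference AKSFTGHVMVL, query missing KSF and GHVMV.
toy_calls = call_deletions("AKSFTGHVMVL", "A---T-----L")
```

Columns where the reference itself has a gap (insertions in the query) do not advance the reference position, so the numbering stays in reference coordinates.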
Table 1: Sequences of the deletion boundary oligonucleotides.
Oligo | Sequence | Sequences retrieved by BLAST-N
GHVMV-KSF oligo | 5’-cgttcggatgctcgaactgcacctcatgagctggtagcagaactcgaaggcattcagtacggtcgtagtggtgagacacttggtgtccttgtccctcatgtgggcgaaataccagtggcttaccgcaaggttcttcttcgtaagaacggtaataaaggagctggtggccataggtacggcgccgatctagacttaggcgacgagcttggcactgatcctt-3’ | >1000
KSF oligo | 5’-tggccataggtacggcgccgatctagacttaggcgacgagcttggcactga-3’ | >5000
GHVMV oligo | 5’-acgttcggatgctcgaactgcacctcatgagctggtagcagaactcgaaggcatt-3’ | >5000
1st SGF oligo | 5’-GACATGGTTGATACTAGTTTGAAGCTAAAAGACTGTGTTATGTAT-3’ | >250
2nd SGF oligo | 5’-GATATGGTTGATACTAGTTTGAAGCTAAAAGACTGTGTTATGTAT-3’ | >10000
The sequences OP619597, OP827932 and OP827059 had
24LPP and 69HV deletions but no 141KSF deletion in ORF1ab and were Omicron BA.5
variants. The sequence ON999790 had the 24LPP deletion in spike but no 69HV
deletion and was an Omicron BA.2 variant. These data confirmed the
heterogeneous population of corona viruses among the different 3675SGF deletion
mutants, which appeared early. Multi-alignment of eight sequences selected by
the GHVMV oligo showed that all were recent Omicron isolates with 3675SGF
deletions but no 141KSF deletion (Figure 4A). The 24LPP and 69HV spike
deletions suggested that all were Omicron corona viruses (Figure 4B). Further,
we aligned sixteen sequences selected by the GHVMV-KSF deletion oligo to show
that such sequences had both the 82GHVMV and 141KSF deletions in the nsp1
protein together with the 3675SGF (ORF1ab), 31ERS (N-protein), 24LPP (spike)
and 26nt 3’-UTR deletions (Figure 5A). Thus, most were Omicron BA.4
sub-variants with the 141KSF deletion in the ORF1ab gene, and four were BA.2
variants (ON708747, ON800232, OW825883, and OW998170) that had the 24LPP
deletion but no 69HV deletion (Figure 5B). All sequences carried the dominant
N501Y and D614G missense point mutations in the spike as well as the P4715L
mutation in the RdRp domain of the ORF1ab polyprotein (data not shown).
Interestingly, the ORF7a gene hotspot deletion and mutation sites were affected
in a few of the Omicron corona viruses selected by the GHVMV-KSF oligo. Among
them, a large 112nt deletion was detected in accession number OP791818 at
nucleotide position 27547.
A 20nt deletion was detected in accession number OP828357 at nucleotide 27546,
a 12nt (5’-ttt act ctc caa-3’) deletion in accession number OX368044
at nucleotide 27679, and a mere 3nt (5’-tac-3’) deletion in accession
number OX369387 at nucleotide 27681 (data not shown). We multi-aligned
different corona virus variants to demonstrate the spread of the 82GHVMV,
141KSF and 3675SGF deletions in different variants with time (Figure 6A). Most
viruses since early 2021 had the 3675SGF deletion, except the most deadly Delta
variant, whereas the 141KSF nsp1 deletion appeared in early 2022 in Omicron
BA.4 variants (Figure 6B). We performed a CLUSTAL-Omega phylogenetic analysis
to show the relation among sixteen different corona virus lineages and to
confirm that the Delta variant was a unique sub-variant (Figure 6C). The
COVID-19 B.1.617.2 and B.1.526 variants were closely related, having no 3675SGF
deletion in the nsp6 protein. Similarly, BF.7, BK.1 and BE.1.1 were related to
the BA.5 variant, whereas B.1.1.7 (Alpha), B.1.351 (Beta) and B.1.1.28.1
(Gamma) were closely related, with abundant 3675SGF deletion as reported
earlier. Further, to confirm that the 3675SGF deletion was not carried in the
Delta variant, we selected B.1.617.2 and AY.103 Delta corona virus sequences
but did not detect any 82GHVMV, 141KSF or 3675SGF deletions (Figure 7). The KSF
deletion occurred only in Omicron BA.4 variants, as we demonstrated earlier.
Importantly, we detected a one-amino-acid deletion in the Omicron BA.1 and
BA.1.1 variants at 2083S of the ORF1ab polyprotein, as demonstrated in Figure
8A. We multi-aligned B.1.1.529, BA.1, BA.1.1, BA.1.1.2 and BA.1.1.18
sub-variant sequences to conclude that the 2083S deletion was indeed BA.1
variant specific (Figure 8B). In Figure 9, we showed extensively that the
141KSF deletion was associated only with Omicron BA.4 variants. To trap the
earliest 3675SGF deletions, we aligned sequences from different countries to
demonstrate that early 2021 was the time of appearance of this deletion, when
no KSF deletion was found (Figures 9-10). Further, we showed that
the 2083S deletion was found in the Omicron BA.1 variant only but not in the
Omicron BA.2, BA.4 and BA.5 sub-variants or in other deadly variants such as
the Alpha, Beta and Delta variants (Figure 11). Further, we analysed the hairpin
structures of ~250nt sequences surrounding the 82GHVMV and 141KSF
deletion sites. We found that a special knob-like hairpin structure was altered
at the 82GHVMV locus and that a stiff long hairpin at the 141KSF locus
was also slightly changed (Figure 12). Such hairpin structures may explain
how the deletions arise, possibly involving recombination enzymes such as the
RNA topoisomerase (nsp2) or cellular recombination enzymes. We used MEGA v.11
software to align 14 nsp1 sequences, and only one sequence was a BA.4.1 variant
with the 141KSF deletion (Figure-13). Seq15 was a GHVMV-KSF oligo selected
sequence, whereas Seq16 was a GHVMV oligo selected sequence. An S135R mutation
in nsp1 was found in Omicron variants (BA.2, BE.1.1, BF.7, BA.2.75.2, BA.4.1
and BA.5) but not in the Alpha, Gamma or Delta variants. Interestingly, the
GHVMV oligo selected sequence (accession no. ON972497) had no such mutation,
whereas the GHVMV-KSF oligo selected one (accession no. OP200462), a BA.4
variant, carried it. The model structures (SWISS-MODEL) of the normal nsp1
protein and the GHVMV-KSF deletion mutant suggested a profound change in the
3-D positions of residues, although the Ramachandran plot showed 98.18%
favoured residues in the normal nsp1 protein versus 98.1% in the GHVMV-KSF
deletion mutant (Figure 14). The Clash Score increased in the mutant from 0.00
to 1.77, and the MolProbity Score changed from 0.51 to 0.96, based on published
nsp1 model structures (PDB: 7K3N and 6ZMI). Figure 14 shows that in the mutant
nsp1 protein the Proline 80 residue was hidden and the Arginine 77 residue
protruded. Such changes may lower the binding efficiency of the nsp1 protein to
the human 80S ribosome complex and thus its inhibition of host protein
synthesis.
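The deletion signatures used in this section to assign Omicron sub-variants can be summarised as a small rule table. The sketch below is a simplified reading of the rules as stated in the text (24LPP without 69HV: BA.2; 24LPP with 69HV: BA.5, or BA.4 when 141KSF is also present; the single-residue nsp3 deletion at ORF1ab position 2083: BA.1). It is not a validated lineage caller, and the function name `assign_subvariant` and the label `"unassigned"` are hypothetical.

```python
from typing import Iterable


def assign_subvariant(deletions: Iterable[str]) -> str:
    """Assign an Omicron sub-variant label from a set of deletion names.

    Deletion names follow the convention of the text: position of the first
    deleted residue followed by the deleted residues, e.g. "141KSF".
    """
    found = set(deletions)
    if "24LPP" in found:
        if "69HV" not in found:
            return "BA.2"            # 24LPP in spike but no 69HV
        # Both spike deletions present: the nsp1 141KSF deletion separates BA.4 from BA.5.
        return "BA.4" if "141KSF" in found else "BA.5"
    if "2083S" in found:             # nsp3 single-residue deletion at ORF1ab position 2083
        return "BA.1"
    return "unassigned"              # e.g. Delta, which carries none of these deletions
```

A sequence carrying 3675SGF, 31ERS, 24LPP, 69HV and 141KSF is labelled BA.4 by these rules, and the same set without 141KSF is labelled BA.5. Sequences described in the text as pre-BA.1 or pre-BA.4 fall outside these rules and would need separate handling.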
We clearly demonstrated that the Delta corona virus
variant has no 82GHVMV, 141KSF, 2083S or 3675SGF
deletions. Further, we showed that among the four deletions described,
the SGF deletion appeared first, in B.1.1.7 during early 2021. Similarly,
2083S, a single amino acid deletion, was specific for the Omicron BA.1 variant,
whereas the KSF deletion was specific for the Omicron BA.4 variant; both
appeared in early 2022. Deletion in the GHVMV locus was limited and appeared
only in Omicron variants. The nsp1 protein is a hotspot of deletion and may be
a target for drug design. We have demonstrated the deletions and dominant point
mutations in the ORF1ab gene, which encodes a 7096 AA protein that on
proteolysis produces 16 polypeptides (nsp1-nsp16) with diverse functions. In
the majority of the corona virus population, frequent and common mutations such
as T265I (C1059T) in the nsp2 RNA topoisomerase, P323L (C14408T) in RdRp, D614G
(A23403G) in spike, Q57H (G25563T) in ORF3a and L84S (T28144C) in ORF8 were
detected [59]. Khalid et al. reported the insertion of TTT at 11085, adding one
extra amino acid (F) to the NSP6 protein at amino acid position 38. Mutations
and deletions were ubiquitous, but analysis of 20 or more sequences at a time
might sometimes give erroneous data, and only the desired portion of the
multi-alignment data was presented [60].
The nsp6 protein has 7 putative trans-membrane helices and binds to TANK
binding kinase 1 (TBK1), suppressing the phosphorylation of interferon
regulatory factor 3 (IRF3) and thereby lowering the Type I interferon response
to evade host defences. Point mutations were also important in different
domains of the ORF1ab polyprotein. The nsp13 RNA helicase-rRNA
methyltransferase P504L and Y541C mutations were documented in samples before
April 2020 [61]. Five different mutations, T265I in nsp2, T1246I in the nsp3
protease, G3278S in the nsp5 proteinase, L3606F in nsp6 and P4715L in RdRp,
were found to be common in corona viruses analysed from six geographical
locations: Africa, Asia, Europe, North America, Oceania and South America [24].
Other than the SGF (3675-3677) deletion of nsp6, the F3760 and MVD (3669-3671)
deletions were also reported. A YHFRELGVV (4738-4746) deletion in the
RNA-dependent RNA polymerase and N389, GLNDNL (445-450), V649, T770 and C784
deletions in the RNA topoisomerase were reported by the same group [62]. Quite
surprisingly, 6 and 10 amino acid deletions were reported in the spike protein
at positions 365 and 679, respectively (accession nos. MT621560 and MT370992).
Thus, deletions and point mutations in most RNA viruses were universal,
although we were unable to show such mutations in the RNA polymerase enzyme
except P4715L. Importantly, the recent Omicron virus 24LPP deletion in spike
and 31ERS deletion in N-protein were very important in regulating COVID-19
immune function and replication. We do not know the consequence of the 26nt
3’-UTR deletion that we detected in many Omicron variants. It is assumed that
such deletions might lower the overall pathogenicity of SARS-CoV-2. The Alpha
variant N501Y mutation increased transmission and, most importantly, the D614G
mutation found in all variants since March 2020 made the corona virus deadly.
The 20-25 mutations of the Omicron corona virus relative to the Wuhan corona
virus in the RBD domain of spike gave COVID-19 an immune-escape character, and
repeated infections even after 2-3 doses of vaccine were reported worldwide.
Whether the 82GHVMV and 141KSF deletions in the nsp1 protein in Omicron
variants have any relation to the spike 24LPP or N-protein 31ERS deletions is
not known [59]. Molecular modelling suggested that nsp1 deletions
might have a negative impact on its trans-activator or moderator function with
host genes. Similarly, we do not know why the 2083S deletion in the nsp3
protease of the Omicron corona virus was BA.1 variant specific. Fisher et al.
reported that the 3675SGF deletion in nsp6 affected the virus replication
machinery, as a reduced virus titre was found [31]. The 3675SGF deletion was
not present in Delta variants (AY.103, B.1.617.2) (figure-9 and figure-11).
However, we found the two-amino-acid 157FR deletion in the spike protein and
the 119DF deletion in the ORF8 protein, both characteristic of the corona virus
Delta variant (data not shown). The 82GHVMV deletion in the nsp1 protein was
very limited, with only a few hundred sequences in the database, whereas the
141KSF deletion in the same protein was abundant in the Omicron BA.4 variant
and subvariants (figure-7 and figure-8). Sosnowski et al. demonstrated that
conserved key residues in the amino-terminal half of the NSP1 protein were
essential for evasion of the inhibitory effect of NSP1 on translation [43,47].
Fisher et al. demonstrated the multifunctional role of nsp1 in shutting off
cellular protein synthesis, degrading mRNAs and blocking the cellular
interferon response [31]. We presume that knob-like hairpin structures located
at the nsp1 locus regulated such deletions (figure-14). Further, the model
structure demonstrated the impact of these 8 AA deletions in the nsp1 protein
on its overall 3-D structure. Taken together, we demonstrated the
distribution of major COVID-19 ORF1ab deletions from December 2019 to December
2022 in different variants and sub-variants, which had not been explored before
[63]. The most vivid example was that such deletions were not detected in the
corona virus Delta variant, which impacted society in a horrible way, with
millions of deaths between May 2021 and December 2021. We still have to examine
the most recent BA.2.75, BA.4.6, BA.5.2.1, BF.7 and BE.1.1 lineages for any new
deletions that might change the epidemic spread of corona virus infections
[64-65].
We thank the developers of CLUSTAL-Omega
for free distribution of the software and NCBI (USA) for free worldwide access
to the SARS-CoV-2 Database. AKC is a retired professor of Biochemistry.
The author declares no
conflict of interest. This paper uses only computer-based analysis of data
from the SARS-CoV-2 Database.