SuperFolder COVID-19 mRNA vaccine

An mRNA for the SARS-CoV-2 Spike glycoprotein antigen (the prefusion stabilized variant, S-2P), stabilized against hydrolysis, for testing in COVID-19 mRNA vaccines.

This open source mRNA vaccine is being made available in the public domain to aid any efforts seeking to deliver COVID-19 mRNA vaccines in prefilled syringes without freezing.

The codons of this Superfolder S-2P mRNA have been redesigned to produce a highly stable mRNA structure that is predicted to have a ~3x longer lifetime in (non-frozen) aqueous storage. The design draws on computational design rules, algorithms, and a large corpus of experimental data developed through 2020 in the OpenVaccine project, based at Stanford University.

For avoidance of doubt, an Open Covid License is enclosed with this repository that grants license to any party for use of this sequence and enclosed data in research or product development that could accelerate the end of the COVID-19 pandemic.

The RNA sequence

> OpenVaccine Superfolder-v2 S-2P mRNA (Eterna 10646543, BugacMan's Deg-2_First_14nt_open)
AUGUUUGUUUUUUUAGUCCUCUUGCCGCUGGUUUCGAGUCAGUGCGUCAACCUUACGACGCGGACGCAGCUGCCGCCGGCCUACACGAACUCCUUCACGCGGGGGGUGUACUACCCCGACAAGGUGUUCCGCUCGUCUGUUCUGCACAGCACGCAGGACCUCUUCCUCCCGUUCUUCUCGAACGUGACGUGGUUCCACGCCAUUCACGUGUCGGGGACGAACGGGACGAAGAGGUUCGACAACCCUGUUCUGCCGUUCAACGACGGGGUGUACUUCGCUUCGACGGAGAAGUCCAACAUUAUUCGCGGGUGGAUAUUCGGGACCACUCUCGAUUCGAAGACUCAGUCCUUGCUGAUAGUGAACAACGCGACGAACGUGGUCAUUAAGGUCUGCGAGUUCCAGUUCUGUAAUGACCCGUUCCUCGGUGUUUACUAUCACAAGAACAACAAGUCUUGGAUGGAGAGUGAGUUCCGGGUGUACUCGUCCGCGAAUAAUUGUACGUUCGAGUAUGUGAGCCAGCCGUUCCUGAUGGAUCUUGAGGGCAAGCAGGGCAAUUUCAAGAAUUUGCGCGAGUUUGUCUUCAAGAACAUCGACGGCUACUUCAAGAUAUACUCGAAGCACACGCCGAUCAACCUCGUCCGUGAUCUCCCGCAGGGGUUCAGCGCGCUGGAGCCGCUGGUGGACCUGCCGAUCGGGAUCAACAUCACGCGGUUCCAGACGCUGCUGGCCCUGCACCGGAGUUACCUGACGCCGGGUGACUCCAGUUCCGGGUGGACUGCGGGUGCCGCCGCGUACUACGUGGGGUACCUGCAGCCGCGGACGUUCUUGUUGAAGUACAACGAGAACGGGACGAUCACGGACGCGGUUGAUUGCGCGUUGGACCCUCUGUCGGAGACGAAGUGCACCCUGAAGUCGUUCACGGUGGAGAAGGGGAUCUAUCAGACCUCCAACUUCCGGGUCCAGCCGACUGAGAGUAUCGUUCGGUUCCCGAACAUCACGAACCUGUGUCCGUUCGGGGAGGUCUUCAACGCGACGCGGUUCGCCUCCGUGUACGCUUGGAACCGGAAGAGGAUCUCGAAUUGUGUGGCGGACUACAGUGUGCUGUACAAUUCGGCGUCCUUUUCCACUUUCAAGUGUUACGGGGUGUCGCCCACGAAGUUGAACGACCUCUGCUUCACCAACGUGUACGCGGAUUCCUUCGUCAUCCGUGGGGACGAGGUCAGGCAGAUCGCGCCCGGGCAGACUGGGAAGAUAGCGGACUACAAUUAUAAGUUGCCCGACGACUUUACUGGCUGCGUUAUUGCUUGGAACAGCAAUAACCUGGACAGUAAGGUCGGGGGCAACUAUAAUUACCUGUACCGGCUGUUCCGGAAGAGCAAUCUGAAGCCCUUCGAGCGCGACAUUAGCACGGAGAUCUACCAGGCGGGUAGCACUCCGUGCAAUGGCGUGGAGGGCUUCAAUUGCUACUUUCCGCUGCAGUCGUACGGGUUUCAGCCCACCAACGGGGUGGGGUACCAGCCCUACCGCGUGGUGGUGCUGAGCUUCGAGCUCCUUCACGCGCCCGCGACGGUCUGUGGGCCCAAGAAGUCGACGAACUUAGUGAAGAACAAGUGCGUCAACUUCAACUUCAAUGGGCUCACGGGCACGGGGGUGCUGACGGAGUCGAACAAGAAGUUCCUGCCUUUCCAGCAGUUCGGGCGCGAUAUUGCCGACACCACGGAUGCCGUGAGGGAUCCGCAGACGCUUGAGAUUCUGGACAUCACGCCCUGCAGCUUCGGGGGCGUCAGUGUGAUCACGCCUGGUACGAACACGAGCAACCAGGUUGCCGUGUUGUACCAGGACGUGAAUUGCACUGAGGUCCCCGUAGCGAUCCACGCGGAUCAGCUGACCCCGACGUGGAGGGUGUACUCGACGGGGAGUAAUGUAUUUCAGACGCGGGCUGGUUGUCUGAUUGGUGCGGAGCACGUAAACAACUCCUAUGAGUGUGACAUACCCAUAGG
AGCUGGCAUAUGUGCUUCGUACCAGACUCAGACAAACAGCCCGCGUCGAGCCCGGAGCGUUGCGUCGCAGAGCAUAAUCGCGUACACGAUGUCCCUCGGGGCGGAGAAUUCGGUGGCAUAUAGUAACAACAGUAUUGCCAUACCGACGAACUUCACGAUCUCCGUGACCACCGAGAUACUGCCGGUGAGCAUGACUAAGACGAGUGUAGACUGUACGAUGUAUAUCUGCGGCGACAGUACUGAGUGCAGUAACCUGUUGCUGCAGUAUGGGUCGUUCUGCACUCAGCUUAAUCGUGCUCUUACCGGGAUCGCCGUAGAGCAGGAUAAGAACACGCAGGAGGUCUUUGCGCAGGUGAAGCAGAUCUACAAGACUCCGCCUAUCAAGGACUUCGGCGGGUUCAACUUCAGCCAGAUUCUGCCAGACCCAUCUAAGCCGAGCAAGAGGUCCUUUAUUGAGGACCUCUUGUUCAACAAGGUGACUCUGGCAGAUGCUGGCUUUAUCAAGCAGUACGGCGAUUGUCUCGGGGACAUCGCUGCGCGCGAUUUGAUCUGUGCGCAAAAGUUCAACGGGCUCACUGUGCUACCACCUCUCCUGACGGACGAGAUGAUAGCACAGUAUACCAGCGCGCUGUUGGCUGGUACGAUCACUUCUGGUUGGACGUUCGGGGCGGGUGCGGCACUCCAGAUCCCGUUCGCCAUGCAGAUGGCGUACCGGUUCAACGGGAUCGGAGUGACGCAGAACGUCCUGUACGAAAACCAGAAGUUGAUCGCCAACCAGUUCAACAGCGCGAUUGGUAAGAUACAGGACUCCCUGUCGAGUACGGCCUCCGCGUUGGGGAAGCUGCAGGACGUGGUGAACCAGAAUGCUCAGGCGUUGAACACGUUGGUGAAGCAGCUGUCGUCCAACUUCGGGGCGAUAAGUUCGGUGCUGAACGAUAUUCUCAGUCGGCUGGACCCGCCGGAGGCGGAGGUCCAGAUAGAUCGGCUCAUCACUGGUCGCCUCCAGAGUUUGCAGACGUACGUGACUCAGCAGCUCAUCCGGGCUGCUGAGAUACGUGCGUCUGCGAACCUGGCGGCGACCAAGAUGAGCGAGUGCGUGCUGGGGCAGAGCAAGCGGGUGGACUUCUGCGGGAAGGGGUAUCACCUGAUGUCCUUCCCGCAGAGCGCGCCUCACGGGGUGGUCUUCCUGCACGUGACGUAUGUGCCGGCGCAGGAGAAGAACUUCACCACGGCGCCGGCGAUAUGUCACGACGGGAAGGCGCACUUCCCGCGUGAGGGGGUUUUUGUUUCGAACGGGACGCACUGGUUCGUCACGCAGCGCAACUUCUAUGAGCCGCAGAUAAUUACCACUGACAACACGUUUGUCAGUGGUAAUUGUGAUGUGGUCAUAGGGAUCGUGAACAACACGGUCUACGAUCCCCUGCAGCCCGAGCUGGAUAGUUUCAAGGAGGAGCUUGAUAAGUACUUCAAGAAUCAUACUUCCCCGGACGUGGAUCUUGGCGACAUUAGCGGGAUCAACGCUAGUGUCGUCAACAUCCAGAAGGAGAUCGACAGGCUCAAUGAGGUUGCGAAGAACCUCAACGAGAGCCUGAUCGAUCUCCAGGAGUUGGGGAAGUAUGAGCAGUACAUCAAGUGGCCUUGGUACAUCUGGCUCGGGUUCAUCGCGGGGCUGAUCGCGAUCGUGAUGGUCACGAUCAUGUUGUGCUGCAUGACGAGCUGCUGCUCCUGUUUGAAGGGCUGCUGCAGCUGCGGUUCGUGUUGUAAGUUUGACGAGGAUGACUCGGAGCCGGUGCUCAAGGGGGUGAAGCUGCACUACACGUGA

The RNA above also includes a stop codon. Feel free to copy/paste this sequence and use directly in your research or product development.

The above RNA has also been modified to code for the spike protein of strain B.1.351.

> OpenVaccine Superfolder-v2 S-2P B.1.351 mRNA 
AUGUUCGUGUUCCUCGUGCUCCUUCCGCUGGUCUCGAGCCAGUGCGUCAAUUUCACGACGAGGACGCAGUUGCCCCCCGCGUACACGAACUCGUUUACGCGGGGGGUGUACUACCCGGACAAGGUCUUCCGCAGCUCUGUCCUGCACAGCACUCAGGACCUCUUCCUCCCGUUCUUCUCGAACGUGACGUGGUUCCACGCCAUUCACGUGUCGGGGACGAACGGGACGAAGAGGUUCGCGAACCCUGUUCUGCCGUUCAACGACGGGGUGUACUUCGCUUCGACGGAGAAGUCCAACAUUAUUCGCGGGUGGAUAUUCGGGACCACUCUCGAUUCGAAGACUCAGUCCUUGCUGAUAGUGAACAACGCCACGAACGUGGUCAUUAAGGUCUGCGAGUUCCAGUUCUGUAAUGACCCGUUCCUGGGUGUUUACUAUCACAAGAACAACAAGUCUUGGAUGGAGAGUGAGUUCCGGGUGUAUUCGUCCGCGAAUAAUUGUACCUUCGAGUAUGUCUCGCAGCCAUUCUUGAUGGAUCUUGAGGGCAAGCAGGGAAAUUUCAAGAAUCUCCGCGAGUUUGUCUUCAAGAACAUCGACGGGUACUUCAAGAUAUACUCGAAGCACACGCCGAUCAACCUCGUCCGUGGGCUCCCGCAGGGCUUCAGCGCUCUGGAGCCGCUGGUGGAUCUCCCGAUCGGGAUCAACAUCACGCGGUUCCAGACGCUGCACAUCAGUUACCUGACGCCGGGUGACUCCAGUAGUGGGUGGACUGCGGGUGCCGCGGCGUACUACGUCGGGUACCUGCAGCCGCGCACGUUCUUGUUGAAGUACAACGAGAACGGGACGAUCACGGACGCGGUUGAUUGCGCGUUGGACCCUCUGUCGGAGACGAAGUGCACCCUGAAGUCGUUCACGGUGGAGAAGGGUAUCUAUCAGACCUCGAACUUCCGGGUCCAGCCGACUGAGAGUAUCGUUCGGUUCCCGAACAUUACGAACCUGUGUCCGUUCGGGGAGGUCUUCAACGCGACGCGGUUCGCGAGUGUGUACGCUUGGAACCGGAAGAGGAUCUCGAAUUGUGUGGCGGACUACAGUGUGCUGUACAAUUCGGCGUCCUUUUCCACGUUCAAGUGCUACGGGGUGUCGCCCACGAAGUUGAACGACCUCUGCUUCACCAACGUGUAUGCGGAUUCCUUCGUCAUCCGUGGUGACGAGGUGCGUCAGAUUGCGCCGGGGCAGACGGGGAACAUAGCGGACUAUAAUUAUAAGUUGCCCGACGACUUUACUGGCUGCGUUAUUGCUUGGAACAGCAAUAACCUGGACAGUAAGGUCGGGGGCAACUAUAAUUAUCUUUACCGUCUGUUCCGGAAGAGCAAUCUGAAGCCCUUCGAGCGCGAUAUCUCGACCGAGAUCUACCAGGCCGGCUCGACGCCGUGCAACGGCGUCAAGGGGUUUAAUUGUUACUUUCCGUUACAGAGCUACGGGUUUCAGCCCACGUACGGGGUGGGGUACCAGCCCUACCGCGUCGUGGUGCUGAGCUUCGAGCUGCUGCACGCCCCGGCCACGGUGUGCGGUCCGAAGAAAAGUACAAACCUUGUGAAGAACAAGUGUGUGAACUUUAACUUCAACGGGCUCACCGGGACGGGGGUGUUGACGGAGAGUAACAAGAAGUUCCUGCCGUUCCAGCAGUUCGGUCGGGAUAUCGCGGACACCACGGAUGCCGUGAGGGAUCCGCAGACGCUUGAGAUUCUGGACAUCACGCCCUGCAGCUUCGGGGGCGUCAGUGUGAUCACGCCUGGUACGAACACCAGCAACCAGGUUGCGGUGUUGUACCAGGGCGUGAAUUGCACUGAGGUCCCCGUAGCGAUCCACGCGGAUCAGCUGACCCCGACGUGGAGGGUGUACUCGACGGGGAGUAAUGUCUUCCAGACUCGCGCGGGUUGCCUGAUUGGCGCUGAGCACGUGAACAACUCGUACGAGUGCGACAUUCCCAUUGGGGCGGGGAU
CUGCGCGUCGUACCAGACCCAGACGAACAGCCCGAGGCGGGCGAGGAGCGUCGCGUCGCAGUCGAUCAUCGCGUACACGAUGAGCCUGGGGGUGGAGAACAGUGUGGCCUAUUCGAACAACAGCAUAGCUAUCCCCACGAAUUUUACGAUCAGUGUGACGACCGAGAUCUUGCCCGUGUCGAUGACCAAGACCUCGGUCGAUUGCACGAUGUACAUUUGUGGGGAUAGCACUGAGUGUUCUAACCUCCUGCUCCAGUACGGCAGUUUCUGUACGCAGCUCAACCGGGCGCUUACGGGGAUUGCCGUGGAGCAGGACAAGAACACUCAGGAGGUGUUUGCGCAGGUCAAGCAGAUCUACAAGACGCCUCCGAUCAAGGAUUUCGGGGGGUUCAAUUUCUCCCAGAUACUCCCCGACCCUUCGAAGCCCAGCAAGCGUAGCUUCAUUGAGGACCUGCUCUUCAAUAAGGUUACGCUUGCGGACGCGGGCUUCAUCAAGCAGUACGGGGACUGUCUGGGGGACAUUGCCGCCCGGGACCUGAUCUGUGCUCAGAAGUUCAAUGGGCUCACUGUUCUGCCGCCCCUGCUCACGGACGAGAUGAUCGCGCAGUACACGUCGGCGCUCCUCGCCGGCACGAUCACGUCGGGCUGGACGUUUGGGGCUGGUGCGGCGCUGCAGAUCCCGUUCGCCAUGCAGAUGGCGUACCGCUUCAAUGGGAUCGGGGUGACCCAGAAUGUCCUGUACGAGAAUCAGAAGCUCAUCGCCAAUCAGUUCAACUCGGCGAUCGGGAAGAUACAGGACUCCCUGUCGAGUACGGCCUCCGCGUUGGGGAAGCUGCAGGACGUGGUGAACCAGAAUGCUCAGGCGUUGAACACGUUGGUGAAGCAGCUGUCGUCCAACUUCGGGGCGAUAUCCUCGGUGCUGAACGAUAUUCUCAGUCGGCUGGACCCGCCGGAGGCGGAGGUUCAGAUCGAUAGACUCAUCACUGGUCGCCUCCAGAGUUUGCAGACGUACGUGACUCAGCAGCUCAUCCGGGCUGCUGAGAUACGUGCGUCUGCGAACCUGGCGGCGACCAAGAUGAGUGAGUGCGUGCUGGGGCAGAGCAAGCGGGUGGACUUUUGCGGGAAGGGCUAUCACCUGAUGUCCUUCCCGCAGUCCGCCCCUCACGGGGUGGUCUUCCUGCACGUGACGUAUGUGCCGGCGCAGGAGAAGAACUUCACCACGGCGCCGGCCAUAUGUCACGACGGGAAGGCCCACUUCCCCCGUGAGGGGGUCUUCGUGUCGAAUGGGACGCACUGGUUCGUGACGCAGCGGAAUUUCUAUGAGCCGCAGAUAAUUACGACUGACAACACGUUUGUCAGUGGUAAUUGUGAUGUGGUCAUAGGGAUUGUUAACAACACCGUGUAUGAUCCCCUCCAGCCGGAGCUGGACAGCUUCAAGGAGGAGCUGGAUAAGUACUUCAAGAAUCACACGUCGCCGGACGUGGAUCUUGGGGACAUAUCGGGGAUCAACGCGAGUGUUGUUAACAUACAGAAGGAGAUCGACCGGCUCAAUGAGGUUGCGAAGAACCUCAAUGAGUCGUUGAUCGACCUUCAGGAGCUCGGCAAGUAUGAGCAGUACAUCAAGUGGCCUUGGUACAUCUGGCUCGGGUUUAUAGCGGGGCUGAUCGCCAUCGUGAUGGUGACGAUCAUGCUCUGCUGUAUGACGUCGUGCUGCAGCUGCCUCAAGGGCUGCUGCUCUUGCGGCAGCUGUUGCAAGUUCGACGAGGACGACUCCGAGCCUGUGCUGAAGGGGGUGAAGCUGCAUUAUACUUGA
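After copy/paste, a quick sanity check can catch truncation or stray characters in either sequence. The following is a minimal sketch, not part of the repository; the function name `check_coding_sequence` and the fields it reports are illustrative assumptions.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def check_coding_sequence(rna: str) -> dict:
    """Run basic sanity checks on a pasted mRNA coding sequence.

    Hypothetical helper for illustration; it only inspects the text
    and says nothing about structure or stability.
    """
    # Drop line breaks and spaces introduced by copy/paste.
    seq = "".join(rna.split()).upper()
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")

    # Split into whole codons, ignoring any trailing partial codon.
    whole = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, whole, 3)]

    return {
        "length_nt": len(seq),
        "whole_codons": len(seq) % 3 == 0,
        "starts_with_AUG": seq.startswith("AUG"),
        "ends_with_stop": bool(codons) and codons[-1] in STOP_CODONS,
        "internal_stops": sum(c in STOP_CODONS for c in codons[:-1]),
        "gc_fraction": (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0,
    }


if __name__ == "__main__":
    # Toy input only; paste a full sequence from above in its place.
    print(check_coding_sequence("AUG GCC\nUGA"))
```

For an intact copy of either sequence, one would expect `whole_codons`, `starts_with_AUG`, and `ends_with_stop` to be true and `internal_stops` to be zero.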

Here is a visualization of the secondary structure of the mRNA compared to the secondary structure of an mRNA designed through codon optimization:

[Figure: Secondary structure of standard mRNA and superfolder]

ViennaRNA MFE predicted secondary structure, visualized in RiboGraphViz, colored by DegScore (OpenVaccine consortium, 2021, in prep).
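Predicted secondary structures are commonly exchanged as dot-bracket strings, where `.` marks an unpaired nucleotide and `(` / `)` mark paired ones. Assuming such a string is already available from a folding package, the sketch below shows how one might check the "first 14 nt open" property mentioned in the sequence name and summarize how much of the mRNA is paired. The helper names are hypothetical, and the dot-bracket input is an assumption rather than a file shipped with this repository.

```python
def leader_is_open(dot_bracket: str, n_open: int = 14) -> bool:
    """Return True if the first n_open positions are all unpaired ('.')."""
    leader = dot_bracket[:n_open]
    return len(leader) == n_open and all(ch == "." for ch in leader)


def paired_fraction(dot_bracket: str) -> float:
    """Return the fraction of positions that are base-paired."""
    if not dot_bracket:
        return 0.0
    return sum(ch in "()" for ch in dot_bracket) / len(dot_bracket)


if __name__ == "__main__":
    # Toy structure: 14 unpaired nucleotides followed by a short hairpin.
    toy = "." * 14 + "((((....))))"
    print(leader_is_open(toy), round(paired_fraction(toy), 3))
```

A high paired fraction is only a rough proxy for the stability discussed here; the design itself relied on DegScore and EternaFold, as described in the steps further down.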

What it codes for

The exact amino acid sequence below is the basis for most mRNA vaccines that are undergoing clinical trials at the time of writing.

> S-2P SARS-CoV-2 Spike glycoprotein antigen, prefusion stabilized double proline variant. 1273 amino acids.
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDPPEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT

For ongoing efforts to create new vaccines for new variants, we are also providing a sequence that has been adapted to code for the spike protein of strain B.1.351.

> S-2P SARS-CoV-2 Spike glycoprotein antigen, prefusion stabilized double proline variant, mutated to strain B.1.351.  1270 amino acids.
MFVFLVLLPLVSSQCVNFTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFANPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRGLPQGFSALEPLVDLPIGINITRFQTLHISYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVKGFNCYFPLQSYGFQPTYGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGVENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDPPEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
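To confirm that a pasted mRNA still codes for one of the proteins above, it can be translated with the standard genetic code and compared to the listed amino acid sequence. This is a minimal, dependency-free sketch; the names `CODON_TABLE` and `translate` are illustrative, and libraries such as Biopython offer equivalent functionality.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str, to_stop: bool = True) -> str:
    """Translate an mRNA coding sequence into a one-letter protein string."""
    # Remove whitespace and accept DNA-style input by mapping T to U.
    seq = "".join(rna.split()).upper().replace("T", "U")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First six codons of the Superfolder-v2 S-2P mRNA shown above.
    print(translate("AUGUUUGUUUUUUUAGUC"))  # prints MFVFLV
```

If the copy is intact, translating the full Superfolder-v2 sequence should reproduce the 1273-residue S-2P protein, and translating the B.1.351 sequence should reproduce the 1270-residue variant.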

Where it comes from

The sequence was designed through the following steps, which have been validated through extensive experiments on model mRNAs by the Das and Barna labs at Stanford University (OpenVaccine consortium, 2021, in prep).

  1. A starting sequence predicted to have highly negative (stable) deltaG was produced with the LinearDesign algorithm available at this server (Zhang, 2020).
  2. The sequence was further optimized with the Ribotree-mRNA Monte Carlo Tree Search algorithm which is available at the Eterna software site.
  3. The Ribotree-mRNA runs optimized DegScore, a predictor of overall hydrolysis rate trained on a vast data set of empirical measurements acquired during the Eterna roll-your-own sequence challenges (OpenVaccine consortium, 2021, in prep). Training and test data are available at Kaggle.
  4. Ribotree-mRNA runs were guided by EternaFold, currently the most accurate folding engine for predicting RNA structure properties (Wayment-Steele, 2020a).
  5. Ribotree-mRNA runs also favored opening of the first 14 nts to ensure a good binding site for the eukaryotic ribosome (Kozak, 1990; OpenVaccine consortium, 2021, in prep).
  6. For the Superfolder-v2 sequences above, participants of the Eterna project rationally altered the sequence to reduce its predicted degradation in solution. A sequence that did not undergo Eterna optimization (Superfolder-v1) is available in sequences/superfolder_v1.fasta.
  7. The sequence was screened for robustness of structure and low predicted hydrolysis in the context of numerous combinations of 5' UTRs, 3' UTRs, and poly(A) motifs that are currently in use for COVID-19 mRNA vaccines (see, e.g., Orlandini von Niessen et al., 2018).
  8. The sequence was designed based on simulations with standard nucleosides A, C, G, and U. The Stanford OpenVaccine experimental team has evidence that mRNAs stabilized with these standard nucleotides remain stabilized with substitutions such as pseudouridine and its analogs substituting for U (OpenVaccine consortium, 2021, in prep).
  9. To prepare the B.1.351 variant, we enumerated codons with the highest GC content for mutant amino acids and selected the candidate with the lowest DegScore.
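Step 9 can be illustrated with a small sketch: for each substituted amino acid, list the synonymous codons with the highest GC content, then keep the combination that a scoring function rates lowest. The function names and the `score` callable are assumptions made for illustration. In particular, `score` stands in for DegScore, whose actual interface is not shown here, and the toy scorer in the example is not a degradation model. The sketch covers substitutions only, not deletions.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def max_gc_codons(amino_acid: str) -> list:
    """Return the synonymous codons with the highest G+C count."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    gc = {c: c.count("G") + c.count("C") for c in codons}
    best = max(gc.values())
    return sorted(c for c in codons if gc[c] == best)


def best_variant(codons, substitutions, score):
    """Pick the max-GC codon combination with the lowest score.

    codons:        list of codons for the parent sequence
    substitutions: {codon_index: new_amino_acid}
    score:         callable(str) -> float, lower is better
                   (placeholder for a predictor such as DegScore)
    """
    positions = sorted(substitutions)
    options = [max_gc_codons(substitutions[p]) for p in positions]
    best_seq, best_score = None, float("inf")
    for choice in product(*options):
        trial = list(codons)
        for pos, codon in zip(positions, choice):
            trial[pos] = codon
        seq = "".join(trial)
        value = score(seq)
        if value < best_score:
            best_seq, best_score = seq, value
    return best_seq


if __name__ == "__main__":
    # Toy scorer for demonstration only: prefers sequences with more G.
    toy_score = lambda seq: -seq.count("G")
    print(max_gc_codons("N"))  # prints ['AAC']
    print(best_variant(["AUG", "GAC", "AAG"], {1: "A"}, toy_score))
```

Because most amino acids have only one or two maximum-GC codons, the number of candidates stays small even when several positions are mutated, so exhaustive enumeration is practical.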

Find more details and a comparison to other design methods here.

References

Kozak, M. (1990). Downstream secondary structure facilitates recognition of initiator codons by eukaryotic ribosomes. Proc Natl Acad Sci U S A. 87(21):8301-5. doi: 10.1073/pnas.87.21.8301

OpenVaccine Consortium (2021). Comparative optimization of messenger RNA structure, stability and expression for RNA therapeutics (in preparation).

Orlandini von Niessen AG, Poleganov MA, Rechner C, Plaschke A, Kranz LM, Fesser S, Diken M, Löwer M, Vallazza B, Beissert T, Bukur V, Kuhn AN, Türeci Ö, Sahin U. (2018). Improving mRNA-Based Therapeutic Gene Delivery by Expression-Augmenting 3' UTRs Identified by Cellular Library Screening. Mol Ther. 27(4):824-836. doi: 10.1016/j.ymthe.2018.12.011

Wayment-Steele, H.K., Kladwang, W., Eterna Participants, Das, R. (2020a). RNA secondary structure packages ranked and improved by high-throughput experiments. bioRxiv, 124511. doi:10.1101/2020.05.29.124511

Wayment-Steele, H.K., Kim, D.S., Choe, C.A., Nicol, J.J., Wellington-Oguri, R., Sperberg, R.A.P., Huang, P., Eterna Participants, Das, R. (2020b). Theoretical basis for stabilizing messenger RNA through secondary structure design. bioRxiv, 262931. doi:10.1101/2020.08.22.262931

Zhang, H., Zhang, L., Li, Z., Liu, K., Liu, B., Mathews, D. H., & Huang, L. (2020). LinearDesign: Efficient Algorithms for Optimized mRNA Sequence Design. arXiv preprint arXiv:2004.10177

Questions

For answers to any additional questions that might help accelerate the end of the COVID-19 pandemic, please contact Rhiju Das, Stanford University, [email protected].