We are curating a list of bioinformatics tools designed explicitly for SARS-CoV-2 and coronaviruses. We cover workflows and tools for

  • the routine detection of SARS-CoV-2 infection,
  • the reliable analysis of sequencing data,
  • the tracking of the COVID-19 pandemic and evaluation of containment measures,
  • the study of coronavirus evolution,
  • the discovery of potential drug targets and development of therapeutic strategies.

All tools are freely available online, either through web applications or public code repositories.

For a detailed description of the tools, check our review preprint:

  • [DOI] F. Hufsky, K. Lamkiewicz, A. Almeida, A. Aouacheria, C. Arighi, A. Bateman, J. Baumbach, N. Beerenwinkel, C. Brandt, M. Cacciabue, S. Chuguransky, O. Drechsel, R. D. Finn, A. Fritz, S. Fuchs, G. Hattab, A. Hauschild, D. Heider, M. Hoffmann, M. Hölzer, S. Hoops, L. Kaderali, I. Kalvari, M. von Kleist, R. Kmiecinski, D. Kühnert, G. Lasso, P. Libin, M. List, H. F. Löchel, M. J. Martin, R. Martin, J. Matschinske, A. C. McHardy, P. Mendes, J. Mistry, V. Navratil, E. P. Nawrocki, Á. N. O’Toole, N. Ontiveros-Palacios, A. I. Petrov, G. Rangel-Pineros, N. Redaschi, S. Reimering, K. Reinert, A. Reyes, L. Richardson, D. L. Robertson, S. Sadegh, J. B. Singer, K. Theys, C. Upton, M. Welzel, L. Williams, and M. Marz, “Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research,” Briefings Bioinf, 2020.
    DOI: 10.1093/bib/bbaa232

We are curious to know which tools you have developed to advance the field. Please also have a look at our general list of virus bioinformatics tools. Tools by EVBC members are marked ★.

Data resources
★ European COVID-19 Data Portal

The European COVID-19 Data Portal was launched by EMBL-EBI in conjunction with the European Commission, the European Open Science Cloud and ELIXIR, as part of the wider European COVID-19 Data Platform. The Data Portal brings together and continuously updates the relevant COVID-19 datasets and tools. In this fast-paced project, the COVID-19 Data Portal features relevant datasets from EMBL-EBI data resources, including the European Nucleotide Archive (ENA), UniProt, Protein Data Bank in Europe (PDBe), the Electron Microscopy Data Bank (EMDB), Expression Atlas, ChEMBL, and Europe PMC.


The GISAID Initiative promotes the rapid sharing of data from the coronavirus causing COVID-19. This includes genetic sequence and related clinical and epidemiological data to help researchers understand how viruses evolve and spread during epidemics and pandemics.

Since the start of the COVID-19 outbreak and the identification of the pandemic virus, laboratories around the world have been generating viral genome sequence data with unprecedented speed, enabling real-time progress in the understanding of the new disease and in the research and development of candidate medical countermeasures. Sequence data are essential to design and evaluate diagnostic tests, to track and trace the ongoing outbreak, and to identify potential intervention options.

GISAID data Submitters and Curators ensure real-time data sharing of hCoV-19 remains reliable, to enable rapid progress in the understanding of the new COVID-19 disease and in the research and development of candidate medical countermeasures.

★ ViPR SARS-CoV-2 data portal | Virus Pathogen Resource

The ViPR database integrates various types of data for multiple virus families. You can search the comprehensive database for sequences and strains, immune epitopes, 3D protein structures, host factor data, antiviral drugs, and plasmid data. You can also analyze the data online using sequence alignment, phylogenetic tree reconstruction, sequence variation (SNP) analysis, metadata-driven comparative analysis, and BLAST.

Visit the SARS-CoV-2 data portal in ViPR.

  • [DOI] B. E. Pickett, E. L. Sadat, Y. Zhang, J. M. Noronha, B. R. Squires, V. Hunt, M. Liu, S. Kumar, S. Zaremba, Z. Gu, L. Zhou, C. N. Larson, J. Dietrich, E. B. Klem, and R. H. Scheuermann, “ViPR: an open bioinformatics database and analysis resource for virology research,” Nucleic Acids Res, vol. 40, iss. D1, p. D593–D598, 2011.
    DOI: 10.1093/nar/gkr859
★ ViralZone coronavirus resource

ViralZone is a web resource from the Swiss Institute of Bioinformatics covering all viral genera and families. It provides general molecular and epidemiological information, along with virion and genome figures. Each virus or family page gives easy access to UniProtKB/Swiss-Prot viral protein entries. The ViralZone project is run by the virus program of the Swiss-Prot group.

  • [DOI] C. Hulo, E. de Castro, P. Masson, L. Bougueleret, A. Bairoch, I. Xenarios, and P. Le Mercier, “ViralZone: a knowledge resource to understand virus diversity,” Nucleic Acids Res, vol. 39, iss. Database issue, p. D576–D582, 2011.
    DOI: 10.1093/nar/gkq901
COVIDep | Real-time reporting of vaccine target recommendations for SARS-CoV-2
COVIDep provides an up-to-date set of B-cell and T-cell epitopes that can serve as potential vaccine targets for SARS-CoV-2. The identified epitopes are experimentally derived from the 2003 SARS virus and have a close genetic match with the available SARS-CoV-2 sequences. COVIDep is flexible and user-friendly, comprising an intuitive graphical interface and interactive visualizations.

COVID-19 Disease Map
The COVID-19 Disease Map is a knowledge repository of the molecular mechanisms of COVID-19, established by a broad community-driven effort. It is an assembly of molecular interaction diagrams built on literature evidence.

  • [DOI] M. Ostaszewski, A. Mazein, M. E. Gillespie, I. Kuperstein, A. Niarakis, H. Hermjakob, A. R. Pico, E. L. Willighagen, C. T. Evelo, J. Hasenauer, F. Schreiber, A. Dräger, E. Demir, O. Wolkenhauer, L. I. Furlong, E. Barillot, J. Dopazo, A. Orta-Resendiz, F. Messina, A. Valencia, A. Funahashi, H. Kitano, C. Auffray, R. Balling, and R. Schneider, “COVID-19 disease map, building a computational repository of SARS-CoV-2 virus-host interaction mechanisms,” Sci Data, vol. 7, iss. 1, 2020.
    DOI: 10.1038/s41597-020-0477-8
DisGeNET COVID-19 data collection | Database of gene-disease associations

DisGeNET is a discovery platform containing one of the largest publicly available collections of genes and variants associated with human diseases.

The DisGeNET COVID-19 data collection shows the results of applying text mining tools to the LitCovid dataset to identify mentions of diseases, signs and symptoms. The LitCovid dataset contains a selection of papers referring to coronavirus disease 2019 (COVID-19).

  • [DOI] J. Piñero, J. M. Ramírez-Anguita, J. Saüch-Pitarch, F. Ronzano, E. Centeno, F. Sanz, and L. I. Furlong, “The DisGeNET knowledge platform for disease genomics: 2019 update,” Nucleic Acids Res, 2019.
    DOI: 10.1093/nar/gkz1021
Detection, sequencing and annotation
★ PriSeT | Efficient De Novo Primer Discovery

Appropriate PCR primer pairs for DNA metabarcoding should match a broad evolutionary range of taxa, so that only a few pairs are needed to achieve high taxonomic coverage. At the same time, the DNA barcodes flanked by the primer pairs should differ between species, so that species can be distinguished and resolution improves. PriSeT finds a primer set P that balances both goals: high taxonomic coverage and high resolution. It can process large libraries and is robust against mislabeled or low-quality references. It tackles the computationally expensive steps with linear-runtime filters and efficient encodings.

PriSeT has been applied to 19 SARS-CoV-2 genomes and computed 114 new primer pairs with the additional constraint that the sequences have no co-occurrences in other taxa. These primer sets would be suitable for empirical testing.

  • [DOI] M. Hoffmann, M. T. Monaghan, and K. Reinert, “PriSeT: efficient de novo primer discovery,” bioRxiv, 2020.
    DOI: 10.1101/2020.04.06.027961
★ CoVPipe | Amplicon-based genome reconstruction of SARS-CoV-2 genomes

CoVPipe is a highly optimized and fully automated workflow for the reference-based reconstruction of SARS-CoV-2 genomes based on next generation amplicon sequencing data using CleanPlex SARS-CoV-2 Panel (Paragon Genomics, Hayward, CA, USA) from swab samples.

The pipeline is designed for reproducibility and scalability in order to ensure reliable and fast analysis of SARS-CoV-2 data.

The nanopore workflow poreCov carries out all necessary steps from basecalling to assembly, depending on the user input, followed by lineage prediction for each genome using Pangolin. Furthermore, read coverage plots are provided for each genome to assess the amplification quality of the multiplex PCR. In addition, poreCov includes a quick time tree-based analysis of the inputs against reference sequences. poreCov is implemented in Nextflow for full parallelization of the workload and stable sample processing.
★ Ensembl COVID-19 resource | SARS-CoV-2 genome browser and related resources

SARS-CoV-2 genome browser and related resources, including the following data:

  • Gene annotation of the reference genome,
  • information on gene functions from Gene Ontology,
  • variation data from Nextstrain,
  • protein and genomic features,
  • alignments of Rfam covariance models.

As with all Ensembl data, no restrictions are placed on the use of the COVID-19 resources.

VADR | validation and annotation of virus sequence submissions to GenBank

VADR (Viral Annotation DefineR) validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST.

  • [DOI] A. A. Schäffer, E. L. Hatcher, L. Yankie, L. Shonkwiler, R. J. Brister, I. Karsch-Mizrachi, and E. P. Nawrocki, “VADR: validation and annotation of virus sequence submissions to GenBank,” bioRxiv, 2019.
    DOI: 10.1101/852657
★ V-Pipe | Mining viral genomes and improve clinical diagnostics

V-Pipe has released a new version specifically adapted to analyze high-throughput sequencing data of SARS-CoV-2. It allows for the detection of within-host genetic variation of SARS-CoV-2 from viral NGS data.
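
To make the idea of within-host genetic variation concrete, here is a minimal Python sketch. It is not V-Pipe's method or interface: the function name `minor_variants`, the frequency threshold, and the toy reads are all hypothetical, and the reads are assumed to be already aligned to the same coordinates with `-` marking gaps.

```python
from collections import Counter

def minor_variants(aligned_reads, min_frequency=0.05):
    """Report positions where a non-majority base reaches min_frequency."""
    variants = []
    # zip(*reads) walks the alignment column by column.
    for position, column in enumerate(zip(*aligned_reads)):
        counts = Counter(base for base in column if base in "ACGT")
        depth = sum(counts.values())
        if depth == 0:
            continue  # only gaps or ambiguous bases at this position
        major_base, _ = counts.most_common(1)[0]
        for base, count in sorted(counts.items()):
            if base != major_base and count / depth >= min_frequency:
                variants.append((position, major_base, base, count / depth))
    return variants

# Toy, hypothetical aligned reads from one sample.
reads = ["ACGTAC", "ACGTAC", "ACTTAC", "ACGTAC", "AC-TAC"]
variants = minor_variants(reads)
print(variants)
```

Each reported tuple holds the 0-based position, the majority base, the minor base, and the minor base frequency among the reads covering that position.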

  • [DOI] L. A. Carlisle, T. Turk, K. Kusejko, K. J. Metzner, C. Leemann, C. Schenkel, N. Bachmann, S. Posada, N. Beerenwinkel, J. Böni, S. Yerly, T. Klimkait, M. Perreau, D. L. Braun, A. Rauch, A. Calmy, M. Cavassini, M. Battegay, P. Vernazza, E. Bernasconi, H. F. Günthard, R. D. Kouyos, and Swiss HIV Cohort Study, “Viral diversity from next-generation sequencing of HIV-1 samples provides precise estimates of infection recency and time since infection,” J Infect Dis, 2019.
    DOI: 10.1093/infdis/jiz094
★ VIRify
VIRify can be used for the identification of coronaviruses in clinical and environmental samples. VIRify is a recently developed, generic pipeline for the detection, annotation, and taxonomic classification of viral and phage contigs in metagenomic and metatranscriptomic assemblies. VIRify’s taxonomic classification relies on the detection of taxon-specific profile hidden Markov models (HMMs), built upon a set of 22,014 orthologous protein domains and referred to as ViPhOGs. Included in this profile HMM database are 139 models that serve as specific markers for taxa within the Coronaviridae family.
★ Haploflow | detection of multi-strain infections
Haploflow is a novel de Bruijn graph-based assembler for the de novo, strain-resolved assembly of viruses. It can rapidly resolve differences between two viral strains down to the level of a single base pair. Haploflow will help advance SARS-CoV-2 research by enabling the detection and full-length reconstruction of SARS-CoV-2 multi-strain infections.
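
As a rough illustration of why a de Bruijn graph can expose strain differences, the following Python sketch builds a toy graph from two sequences that differ by a single base. It is not Haploflow's algorithm: the function names and sequences are hypothetical, and real assemblers additionally use coverage and error handling.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to its successor (k-1)-mers with edge counts."""
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph

def branching_nodes(graph):
    """Nodes with more than one successor, i.e. where paths diverge."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)

# Two hypothetical strains differing at a single position (toy data).
strain_a = "ACGTTGCATCAGGCT"
strain_b = "ACGTTGCGTCAGGCT"
graph = build_de_bruijn([strain_a, strain_b], k=5)
branches = branching_nodes(graph)
print(branches)
```

The single-base difference appears as one node with two outgoing edges; the two paths rejoin downstream, forming a "bubble" in the graph.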
★ VBRC Tools for Coronaviruses
The VBRC was developed for dsDNA viruses but has been adapted for coronaviruses. Only SARS-CoV-2 and closely related viruses will be added to this database. The VBRC provides unique tools that may be useful for the analysis of SARS-CoV-2.
★ VIRULIGN | Fast codon-correct alignment and annotation of viral genomes
VIRULIGN is built for fast codon-correct alignments of large datasets, with standardized and formalized genome annotation and various alignment export formats. VIRULIGN has been adapted to SARS-CoV-2.

  • [DOI] P. J. K. Libin, K. Deforche, A. B. Abecasis, and K. Theys, “VIRULIGN: fast codon-correct alignment and annotation of viral genomes,” Bioinformatics, 2018.
    DOI: 10.1093/bioinformatics/bty851
★ VIGOR4 | Viral Genome ORF Reader
VIGOR4 (Viral Genome ORF Reader) is a Java application to predict protein sequences encoded in viral genomes. VIGOR4 determines the protein coding sequences by sequence similarity searching against curated viral protein databases.

VIGOR4 uses the VIGOR_DB project, which currently has databases for the following viruses: Influenza (A & B for human, avian, and swine, and C for human), West Nile Virus, Zika Virus, Chikungunya Virus, Eastern Equine Encephalitis Virus, Respiratory Syncytial Virus, Rotavirus, Enterovirus, and Lassa Mammarenavirus. A SARS-CoV-2 release is coming (May 1st).

  • [DOI] S. Wang, J. P. Sundaram, and D. Spiro, “VIGOR, an annotation program for small viral genomes,” BMC Bioinf, vol. 11, iss. 1, 2010.
    DOI: 10.1186/1471-2105-11-451
★ Rfam COVID-19 Resources

In response to the SARS-CoV-2 outbreak, Rfam produced a special release, 14.2, that includes 10 new and 4 revised families that can be used to annotate SARS-CoV-2 and other coronavirus genomes with RNA families.

RNAcentral Betacoronavirus sequence similarity search

RNAcentral has launched a new tool to help scientists studying Coronaviruses to search thousands of viral genomes and explore results by virus, host, country, and more. The new search uses nhmmer and enables the users to explore the results using facets powered by the EBI Search. For example, it is possible to search for a fragment of a SARS-CoV-2 genome, select similar sequences identified in COVID-19 patients, and browse sequence variants from different countries.

The viral sequences are regularly extracted from the NCBI BLAST database and the metadata is retrieved from the associated INSDC entries. While the search uses the same sequences as NCBI BLAST, there are several important differences:

  • Support for facets and keyword searches for exploring results
  • Use of nhmmer as the search engine
  • Exploration of up to 1,000 top hits (compared to 100 in BLAST)
  • Integration with Rfam to annotate structured RNA elements

The search is produced and maintained by the RNAcentral team, repurposing the infrastructure of the RNAcentral sequence similarity search to query betacoronavirus sequences instead of non-coding RNAs.

Protein annotation
Pfam protein families database

The Pfam protein families database is widely used in the field of molecular biology for large-scale functional annotation of proteins. The latest release of Pfam, version 33.1, contains an updated set of models that comprehensively cover the proteins encoded by SARS-CoV-2. The Pfam profile hidden Markov model (HMM) library in combination with the HMMER software facilitates rapid search and annotation of coronaviruses and can be used to generate multiple sequence alignments that allow the identification of mutations and clusters of related sequences, particularly useful for outbreak tracking and studying the evolution of Coronaviruses.

UniProt COVID-19 protein portal | rapid access to protein information

The COVID-19 UniProt portal provides early pre-release access to (i) SARS-CoV-2 annotated protein sequences, (ii) the closest SARS proteins from SARS 2003, (iii) human proteins relevant to the biology of viral infection, such as receptors and enzymes, (iv) ProtVista visualisation of sequence features for each protein, (v) links to sequence analysis tools, (vi) access to collated community-contributed publications relevant to COVID-19, and (vii) links to relevant resources.

The COVID-19 portal enables community crowdsourcing of publications via the “Add a publication” feature within any entry. Thus, the community can assist in associating new or missing publications to relevant UniProt entries. ORCID is used as a mechanism to validate user credentials as well as recognition for contribution. Ten publication submissions have been received so far, contributing to our understanding of the virus biology. The COVID-19 UniProt portal advances SARS-CoV-2 research by providing the latest knowledge on proteins relevant to the disease for both the virus and human host.

UniProt also hosted webinars to describe the portal and publication submission system.

ELM and IUPred2A | Detection of functional regions in disordered proteins

The ELM resource offers a collection of short linear motifs that mediate interactions crucial in both native cell regulation and pathogenic interference. The latest 2020 release of ELM includes updated motif definitions and new instances covering nearly 300 motif classes based on over 3,400 publications. The ELM server includes an online tool for the identification of motifs in user input sequences, using context filters (based on structural information, localization, sequence complexity and conservation) to refine results. The opening page has been updated to give references to an overview of motif hijacking in viruses in general, and an application to detecting SARS-CoV-2 motif usage in host cell attachment and modulation.

The IUPred2A server provides biophysics-based predictions of intrinsically disordered protein regions (IDRs) and of binding sites inside IDRs. It can capture longer and more specialized disordered interaction regions, complementing the linear motifs captured by ELM. Both ELM and IUPred2A identify functional regions that are disordered and generally less conserved due to rapid evolvability, and hence their outputs complement the well-conserved functional regions identified by Pfam. ELM and IUPred2A include each other’s predictions, and both methods use the newest data from UniProt and Pfam. This way users can get a fully annotated report on candidate disordered interaction sites, highlighting their relationships with Pfam domains, for any protein of interest.

Tracking, epidemiology and evolution
★ Coronavirus typing tool

The Coronavirus typing tool uses BLAST and phylogenetic methods to identify the coronavirus types and genotypes of a nucleotide sequence.

  • [DOI] M. Vilsker, Y. Moosa, S. Nooij, V. Fonseca, Y. Ghysens, K. Dumon, R. Pauwels, L. C. Alcantara, E. V. Eynden, A. Vandamme, K. Deforche, and T. de Oliveira, “Genome Detective: an automated system for virus identification from high-throughput sequencing data,” Bioinformatics, vol. 35, iss. 5, p. 871–873, 2018.
    DOI: 10.1093/bioinformatics/bty695
Covidex | Alignment-free machine learning subtyping for viral species

Covidex is an alignment-free machine learning subtyping tool for viral species, based on a random forest model trained over a k-mer database. Currently, it supports FMDV and SARS-CoV-2 viral sequences. The tool allows fast classification into pre-defined clusters (from the Nextstrain database).
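
The alignment-free idea can be sketched as follows: each sequence is turned into a fixed-length vector of k-mer frequencies, which a classifier such as a random forest can then consume. This Python sketch covers only the feature extraction step, and it is an assumption-laden illustration rather than Covidex's actual code: the function name `kmer_profile`, the value of k, and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

def kmer_profile(sequence, k=3, alphabet="ACGT"):
    """Return relative k-mer frequencies in lexicographic k-mer order."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Only k-mers over the alphabet are counted; those with N are ignored.
    total = sum(counts[kmer] for kmer in kmers)
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]

# Hypothetical toy sequence; real inputs would be full viral genomes.
profile = kmer_profile("ATGATGCCN", k=2)
print(len(profile))
```

Because every sequence maps to a vector of the same length (4^k entries), sequences of different lengths can be compared without computing an alignment.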

This tool is a work in progress and any suggestions would be greatly appreciated. Please contact Marco Cacciabue.

pangolin | Phylogenetic Assignment of Named Global Outbreak LINeages

Pangolin assigns a global lineage to query SARS-CoV-2 genomes by estimating the most likely placement within a phylogenetic tree of representative sequences from all currently defined global SARS-CoV-2 lineages. It is based on the lineage nomenclature proposed by Rambaut et al.:

  • [DOI] A. Rambaut, E. C. Holmes, V. Hill, Á. O’Toole, J. McCrone, C. Ruis, L. du Plessis, and O. G. Pybus, “A dynamic nomenclature proposal for SARS-CoV-2 to assist genomic epidemiology,” bioRxiv, 2020.
    DOI: 10.1101/2020.04.17.046086
Nextstrain | Real-time epidemiology tracking of SARS-CoV-2 evolution

Nextstrain is an open-source project to harness the scientific and public health potential of pathogen genome data. It provides a continually updated view of publicly available data alongside powerful analytic and visualization tools for use by the community. The goal is to aid epidemiological understanding and improve outbreak response.

Nextstrain is incorporating SARS-CoV-2 genomes as soon as they are shared and provides analyses and situation reports.

  • [DOI] J. Hadfield, C. Megill, S. M. Bell, J. Huddleston, B. Potter, C. Callender, P. Sagulenko, T. Bedford, and R. A. Neher, “Nextstrain: real-time tracking of pathogen evolution,” Bioinformatics, vol. 34, iss. 23, p. 4121–4123, 2018.
    DOI: 10.1093/bioinformatics/bty407
★ BEAST 2 | Bayesian evolutionary analysis by sampling trees

BEAST 2 is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences. It estimates rooted, time-measured phylogenies using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST 2 uses Markov chain Monte Carlo (MCMC) to average over tree space, so that each tree is weighted proportional to its posterior probability. BEAST 2 includes a graphical user interface for setting up standard analyses and a suite of programs for analysing the results.
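
The statement that MCMC weights each tree proportional to its posterior probability can be illustrated with a toy Metropolis sampler in Python. This is only a sketch under strong assumptions and not BEAST 2's implementation: the three topologies and their unnormalized posterior scores are made up, and real tree space is vastly larger and is explored with specialised proposal moves.

```python
import random
from collections import Counter

# Hypothetical unnormalized posterior scores for three tree topologies.
unnormalized_posterior = {"((A,B),C)": 6.0, "((A,C),B)": 3.0, "((B,C),A)": 1.0}

def metropolis_sample(scores, n_steps, seed=1):
    """Visit states with long-run frequency proportional to their scores."""
    rng = random.Random(seed)
    states = list(scores)
    current = states[0]
    visits = Counter()
    for _ in range(n_steps):
        proposal = rng.choice(states)  # symmetric (uniform) proposal
        # Accept with probability min(1, score(proposal) / score(current)).
        if rng.random() < scores[proposal] / scores[current]:
            current = proposal
        visits[current] += 1
    return {state: visits[state] / n_steps for state in states}

frequencies = metropolis_sample(unnormalized_posterior, n_steps=50_000)
for topology, frequency in frequencies.items():
    print(topology, round(frequency, 2))
```

The sampler never needs the normalizing constant: only ratios of scores enter the acceptance step, yet the visit frequencies approach the normalized posterior (here 0.6, 0.3 and 0.1).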

  • [DOI] R. Bouckaert, T. G. Vaughan, J. Barido-Sottani, S. Duchêne, M. Fourment, A. Gavryushkina, J. Heled, G. Jones, D. Kühnert, N. D. Maio, M. Matschiner, F. K. Mendes, N. F. Müller, H. A. Ogilvie, L. du Plessis, A. Popinga, A. Rambaut, D. Rasmussen, I. Siveroni, M. A. Suchard, C. Wu, D. Xie, C. Zhang, T. Stadler, and A. J. Drummond, “BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis,” PLOS Comput Biol, vol. 15, iss. 4, p. e1006650, 2019.
    DOI: 10.1371/journal.pcbi.1006650
★ Phylogeographic reconstruction using air transportation data
Phylogeographic reconstruction using air transportation data can be used to study the global spread of the SARS-CoV-2 pandemic, especially in the early phases when air travel still substantially contributed to the spread of the virus. The method is currently adapted to consider both air travel and local movement data within countries during inference to reflect the changing worldwide movements in different phases of the pandemic.

  • S. Reimering, S. Muñoz, and A. C. McHardy, “Phylogeographic reconstruction using air transportation data and its application to the 2009 H1N1 influenza A pandemic,” PLOS Comput Biol, vol. 16, iss. 2, p. e1007101, 2020. doi:10.1371/journal.pcbi.1007101

COPASI | Biochemical System Simulator

COPASI is a software application for simulation and analysis of biochemical networks and their dynamics. COPASI supports models in the SBML standard and can simulate their behavior using ODEs or Gillespie’s stochastic simulation algorithm; arbitrary discrete events can be included in such simulations.
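To make the stochastic option more concrete, here is a minimal sketch of Gillespie's direct method in plain Python. It is not COPASI code and does not read SBML; the reversible reaction A ⇌ B, the function name `gillespie` and the rate constants are assumptions chosen purely for illustration.

```python
import math
import random


def gillespie(state, reactions, t_end, seed=0):
    """Direct-method stochastic simulation.

    state:     dict mapping species name -> molecule count
    reactions: list of (propensity_fn, change), where propensity_fn(state)
               returns the reaction propensity and change maps
               species name -> integer change in count
    Returns a list of (time, state snapshot) tuples.
    """
    rng = random.Random(seed)
    state = dict(state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        propensities = [fn(state) for fn, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break  # nothing can fire any more
        # Waiting time to the next event is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        chosen = None
        for propensity, (_, change) in zip(propensities, reactions):
            if propensity <= 0.0:
                continue
            chosen = change
            cumulative += propensity
            if threshold < cumulative:
                break
        for species, delta in chosen.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical reversible isomerisation A <-> B with made-up rate constants.
k_forward, k_reverse = 1.0, 0.5
toy_reactions = [
    (lambda s: k_forward * s["A"], {"A": -1, "B": +1}),
    (lambda s: k_reverse * s["B"], {"A": +1, "B": -1}),
]
final_time, final_state = gillespie({"A": 100, "B": 0}, toy_reactions, t_end=10.0)[-1]
print(final_time, final_state)
```

Unlike an ODE solution, each run with a different seed gives a different trajectory, which matters most when molecule counts are small.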

★ Covid-19 Simulator
The Covid-19 Simulator is an epidemiological model that is updated daily with data from RKI and Johns Hopkins. It includes projections of the hospital resources needed in the coming weeks and months as well as the effects of contact reduction measures. It is currently available only in German.
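How a contact reduction measure can enter such a projection is sketched below with a basic SIR model. This is an assumption-laden illustration, not the model behind the Covid-19 Simulator: the function `simulate_sir`, the forward Euler integration and all parameter values are hypothetical.

```python
def simulate_sir(population, initial_infected, beta, gamma, days,
                 contact_reduction=0.0, reduction_start_day=0, dt=0.1):
    """Integrate a basic SIR model with forward Euler steps.

    beta:  transmission rate per day, gamma: removal rate per day.
    From `reduction_start_day` on, beta is scaled by (1 - contact_reduction).
    Returns one (S, I, R) tuple per day, starting with day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    daily = [(s, i, r)]
    for day in range(days):
        # Fewer contacts translate directly into a lower transmission rate.
        if day >= reduction_start_day:
            effective_beta = beta * (1.0 - contact_reduction)
        else:
            effective_beta = beta
        for _ in range(steps_per_day):
            new_infections = effective_beta * s * i / population * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        daily.append((s, i, r))
    return daily


# Made-up parameters: compare no measures with a 50% contact reduction from day 30.
baseline = simulate_sir(1_000_000, 100, beta=0.3, gamma=0.1, days=300)
reduced = simulate_sir(1_000_000, 100, beta=0.3, gamma=0.1, days=300,
                       contact_reduction=0.5, reduction_start_day=30)
print(max(i for _, i, _ in baseline), max(i for _, i, _ in reduced))
```

Comparing the peak number of infected people in the two runs gives a rough sense of how much a measure could relieve hospital capacity under these simplified assumptions.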
Covid-19 trajectories | Monitoring pandemic in the worldwide context
Covid-19 trajectories is a monitoring tool for inspecting the dynamic state of the epidemic in 187 countries. Its trajectories visualize the transmission and removal rates of the epidemic and in this way bridge epi-curve tracking and modelling approaches.

  • H. Loeffler-Wirth, M. Schmidt, and H. Binder, “Covid-19 trajectories: monitoring pandemic in the worldwide context,” medRxiv, 2020. doi:10.1101/2020.06.04.20120725

★ CoV-GLUE | Amino acid analysis for the SARS-CoV-2 pandemic

SARS-CoV-2 will naturally accumulate nucleotide mutations in its RNA genome as the pandemic progresses. On average, the observed changes are expected to have little or no consequence for virus biology. However, tracking these changes will help us better understand the pandemic.

CoV-GLUE is an online web application for the interpretation and analysis of SARS-CoV-2 virus genome sequences, with a focus on amino acid sequence variation. It is based on the GLUE data-centric bioinformatics environment and provides a browsable database of amino acid replacements and coding region indels that have been observed in sequences from the pandemic. Users may also analyse their own SARS-CoV-2 sequences by submitting them to the web application to receive an interactive report containing visualisations of phylogenetic classification and highlighting genomic variation of potentially high impact, for example, linked to primer mismatches.
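As a rough illustration of what an amino acid replacement is at the sequence level, the following sketch translates two codon-aligned coding sequences with the standard genetic code and reports positions where the encoded residue differs. It is not CoV-GLUE's method: the function names and the toy sequences are assumptions, and real analyses also need alignment, indel handling and reference-based numbering.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence; unknown or gapped codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def amino_acid_replacements(reference_cds, query_cds):
    """List replacements such as 'D2G' (reference residue, 1-based codon, new residue)."""
    if len(reference_cds) != len(query_cds):
        raise ValueError("sequences must be codon-aligned and of equal length")
    pairs = zip(translate(reference_cds), translate(query_cds))
    return [
        f"{ref}{position}{alt}"
        for position, (ref, alt) in enumerate(pairs, start=1)
        if ref != alt and alt != "X"
    ]


# Toy sequences: GAT -> GGT changes Asp to Gly; TTT -> TTC is synonymous (Phe).
print(amino_acid_replacements("ATGGATTTT", "ATGGGTTTC"))  # ['D2G']
```

Only the second codon is reported, because the third-codon change does not alter the encoded amino acid.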

  • J. Singer, R. J. Gifford, M. Cotten, and D. L. Robertson, “CoV-GLUE: a web application for tracking SARS-CoV-2 genomic variation,” Preprints, 2020. doi:10.20944/preprints202006.0225.v1

coronapp is a web application to annotate and monitor SARS-CoV-2 mutations, both worldwide and in user-selected countries. The tool allows users to highlight and prioritize the most frequent mutations in specific protein regions, and to monitor their frequency in the population over time.

  • D. Mercatelli, L. Triboli, E. Fornasari, F. Ray, and F. M. Giorgi, “Coronapp: a web application to annotate and monitor SARS-CoV-2 mutations,” bioRxiv, 2020. doi:10.1101/2020.05.31.124966

★ PoSeiDon | Positive Selection Detection and recombination analysis of protein-coding genes

PoSeiDon is a pipeline to detect significant positively selected sites and possible recombination events in an alignment of multiple coding sequences. Sites that undergo positive selection can give insights into the evolutionary history of the sequences, for example by revealing important mutation hot spots that accumulated as a result of virus-host arms races during evolution.

Drug Design
★ VirHostNet 2.0 | Coronaviridae / host protein-protein interactions

VirHostNet is a knowledge base for the management and analysis of proteome-wide virus-host interaction networks. The latest release (March 2020) is dedicated to helping researchers fight the COVID-19 pandemic and is registered as a COVID-19 fair share resource. The release contains manually biocurated, experimentally validated physical Coronaviridae / host protein-protein interactions and is complementary to the functional and in silico interactions shared by viruses.string.

  • V. Navratil, B. de Chassey, L. Meyniel, S. Delmotte, C. Gautier, P. André, V. Lotteau, and C. Rabourdin-Combe, “VirHostNet: a knowledge base for the management and the analysis of proteome-wide virus-host interaction networks,” Nucleic Acids Res, vol. 37, iss. suppl_1, p. D661–D668, 2008. doi:10.1093/nar/gkn794

★ CORDITE | CORona Drug InTEractions database

CORDITE (CORona Drug InTEractions database) collects and aggregates data on SARS-CoV-2 from PubMed, medRxiv, bioRxiv, and ChemRxiv. It focuses on drug interactions that target either viral or human proteins and could be used to treat COVID-19. It provides up-to-date information from computational predictions as well as from in vitro and in vivo studies.

The information provided is for research only and we cannot guarantee the correctness of the data.

CoVex | Coronavirus Explorer

CoVex is a unique online network and systems medicine platform for data analysis that integrates virus-human interactions for SARS-CoV-2 and SARS-CoV-1. It implements different network-based approaches for the identification of new drug targets and new repurposable drugs. Explore the virus-host interactome. Find putative drug targets. Repurpose drugs.

P-HIPSTer | Pathogen host interactome prediction using structure similarity

P-HIPSTer employs structural information to predict pan-viral-human protein-protein interactions (PPIs). The P-HIPSTer database, comprising ∼282,000 PPIs, is a comprehensive catalog of virus-human PPIs that spans the Baltimore classification system and is a major expansion on previously available or reported pathogen-host interactions. P-HIPSTer contains predicted viral-host PPIs for 15 human-infecting coronaviruses.

  • G. Lasso, S. V. Mayer, E. R. Winkelmann, T. Chu, O. Elliot, J. A. Patino-Galindo, K. Park, R. Rabadan, B. Honig, and S. D. Shapira, “A structure-informed atlas of human-virus interactions,” Cell, vol. 178, iss. 6, p. 1526–1541.e16, 2019. doi:10.1016/j.cell.2019.08.005

Chemical Checker COVID-19 drug candidates
Through a review of the most relevant scientific literature, and considering different levels of experimental evidence, over 150 compounds have been identified that are potentially active against COVID-19. This literature curation effort is now being used with the Chemical Checker to identify other compounds with the potential to be effective against COVID-19.
Coronavirus Vaccine Tracker
The Coronavirus Vaccine Tracker by The New York Times summarizes the status of all vaccines that have reached trials in humans, along with a selection of promising vaccines still being tested in cells or animals.
In-host dynamics
COVID19 tissue simulator | Prototype 2-D multicellular simulation of COVID19
COVID19 tissue simulator simulates replication dynamics of SARS-CoV-2 in a layer of epithelium. It is being rapidly prototyped and refined with community support.

In this model, SARS-CoV-2 infects a single cell, or a solution of virions is administered to the extracellular space. The virus is uncoated to expose viral RNA, from which viral proteins are synthesized and then assembled into virions. Assembled virions are exported to the environment, where they can diffuse and infect other cells. In the extracellular space, virions adhere to ACE2 receptors and are internalized through endocytosis. Internalized ACE2 receptors release their virus cargo and are recycled back to the surface.

The model includes a basic pharmacodynamic response (to assembled virions) that causes cell apoptosis. Apoptotic cells release some or all of their internal contents, notably including virions.
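The intracellular steps described above (uncoating, RNA release, protein synthesis, assembly, export) can be pictured as a chain of linear rate equations. The sketch below integrates such a chain for a single cell with forward Euler steps. It is a toy under stated assumptions, not the simulator's code: the function `simulate_replication`, the choice of linear kinetics and every rate constant are hypothetical, and diffusion, receptor dynamics and apoptosis are left out.

```python
def simulate_replication(hours, dt=0.01, k_uncoat=1.0, k_release=1.0,
                         k_synthesis=2.0, k_assembly=1.0, k_export=1.0):
    """Forward-Euler integration of a toy intracellular replication chain.

    All rate constants (per hour) are made-up illustration values.
    Returns one dict per time step, including a 'time' entry.
    """
    state = {"virion": 1.0, "uncoated": 0.0, "rna": 0.0,
             "protein": 0.0, "assembled": 0.0, "exported": 0.0}
    history = [dict(state, time=0.0)]
    n_steps = round(hours / dt)
    for step in range(1, n_steps + 1):
        # All rates are evaluated on the current state before any update.
        rates = {
            "virion": -k_uncoat * state["virion"],
            "uncoated": k_uncoat * state["virion"] - k_release * state["uncoated"],
            "rna": k_release * state["uncoated"],  # released RNA persists
            "protein": k_synthesis * state["rna"] - k_assembly * state["protein"],
            "assembled": k_assembly * state["protein"] - k_export * state["assembled"],
            "exported": k_export * state["assembled"],
        }
        for name, rate in rates.items():
            state[name] += rate * dt
        history.append(dict(state, time=step * dt))
    return history


# One internalised virion followed for 12 hours with the toy rates.
final = simulate_replication(hours=12.0)[-1]
print(round(final["exported"], 2), "virions exported after", final["time"], "hours")
```

In this toy chain a single internalised virion yields a steadily growing number of exported virions, which is the quantity that would feed extracellular diffusion in a tissue-scale model.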

  • Y. Wang, G. An, A. Becker, C. Cockrell, N. Collier, M. Craig, C. L. Davis, J. Faeder, A. F. N. Versypt, J. F. Gianlupi, J. A. Glazier, R. Heiland, T. Hillen, M. A. Islam, A. Jenner, B. Liu, P. A. Morel, A. Narayanan, J. Ozik, P. Rangamani, J. E. Shoemaker, A. M. Smith, and P. Macklin, “Rapid community-driven development of a SARS-CoV-2 tissue simulator,” bioRxiv, 2020. doi:10.1101/2020.04.02.019075