This is a collection of useful tools in Virus Bioinformatics. Please note that EVBC does not maintain these tools.
Tools by EVBC members are marked ★.
Don’t hesitate to contact us if you want a tool to be added. We are also happy to receive feedback on the tools!
De novo assembly
Sequencing and annotation
The comparative gene prediction algorithm of AUGUSTUS performs a multi-genome annotation to increase the accuracy and consistency of the predicted exon-intron structures of the protein-coding genes by simultaneously predicting the genes in all input genomes.
Base-By-Base is a comprehensive tool for the creation and editing of multiple sequence alignments. It can be used with gene and protein sequences as well as with large viral genomes, which themselves can contain gene annotations. New features: (1) “consensus-degenerate hybrid oligonucleotide primers” (CODEHOP), a popular tool for the design of degenerate primers from a multiple sequence alignment of proteins; and (2) the ability to perform fuzzy searches within the columns of sequence data in multiple sequence alignments to determine the distribution of sequence variants among the sequences.
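The idea behind the second feature, inspecting how sequence variants are distributed within alignment columns, can be sketched in a few lines of Python. This is only an illustration on a toy alignment, not Base-By-Base's own implementation; the function names and the strain names are hypothetical.

```python
from collections import Counter

def column_variants(alignment, column):
    """Count the residues observed in one alignment column (0-based)."""
    return Counter(seq[column] for seq in alignment.values())

def variable_columns(alignment):
    """Return the 0-based indices of columns holding more than one residue."""
    length = len(next(iter(alignment.values())))
    return [i for i in range(length) if len(column_variants(alignment, i)) > 1]

# Toy alignment: hypothetical names, rows already aligned to equal length.
alignment = {
    "strain_A": "ATGCCA",
    "strain_B": "ATGTCA",
    "strain_C": "ATGTCA",
    "strain_D": "AT-CCA",
}

for col in variable_columns(alignment):
    print(col, dict(column_variants(alignment, col)))
```

Gaps are counted as their own variant here; a real analysis might prefer to report them separately.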
Systematic screening of genomic ‘dark matter’ to recover useful biological information using sequence similarity search tools and a relational database. DIGS can be used to systematically search for sequences of interest, and to support investigations of their distribution, diversity and evolution. One example is the screening for endogenous viral elements (EVEs) in mammalian genomes.
Secondary structure prediction
Virus genotyping and diagnosis
Find a list of virus genotyping tools here.
geno2pheno[ngs-freq] is a web service for rapidly identifying drug resistance in HIV-1 and HCV samples by relying on frequency files that provide the read counts of nucleotides or codons along a viral genome. geno2pheno[ngs-freq] can assist clinical decision making by enabling users to explore resistance in viral populations with different abundances.
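To make the notion of a frequency file concrete, the sketch below tallies nucleotide read counts per reference position from a handful of ungapped toy reads and derives a variant frequency from them. It assumes a simplified input of (start position, sequence) pairs and does not reproduce the file format that geno2pheno[ngs-freq] actually expects; all names are hypothetical.

```python
from collections import Counter

def nucleotide_counts(reads):
    """Tally nucleotides per reference position.

    `reads` is a list of (start, sequence) pairs, where `start` is the
    0-based reference position of the first base of an ungapped read.
    """
    counts = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            counts.setdefault(start + offset, Counter())[base] += 1
    return counts

def variant_frequency(counts, position, base):
    """Fraction of reads covering `position` that carry `base`."""
    column = counts.get(position, Counter())
    depth = sum(column.values())
    return column[base] / depth if depth else 0.0

# Toy reads (assumed already mapped to the reference).
reads = [(0, "ACGT"), (1, "CGT"), (2, "TT"), (0, "ACTT")]
counts = nucleotide_counts(reads)

for position in sorted(counts):
    print(position, dict(counts[position]))
print(variant_frequency(counts, 2, "T"))
```

Per-position frequencies of this kind are what allow resistance-associated variants to be examined at different abundance cutoffs.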
To detect viral pathogens in time-critical scenarios, accurate and fast diagnostic assays are required. Such assays can now be established using mass spectrometry-based targeted proteomics, by which viral proteins can be rapidly detected from complex samples down to the strain-level with high sensitivity and reproducibility. Purple is a software tool for selecting target-specific peptide candidates directly from given proteome sequence data. It comes with an intuitive graphical user interface, various parameter options and a threshold-based filtering strategy for homologous sequences. Purple enables peptide candidate selection across various taxonomic levels and filtering against backgrounds of varying complexity.
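The core idea of selecting target-specific peptides can be illustrated with a much simplified sketch: digest target and background proteins in silico and keep only those target peptides that never occur in the background. The code assumes a basic trypsin rule (cleave after K or R unless followed by P) and exact-match filtering, which stands in for Purple's threshold-based homology filtering; the sequences and function names are made up for the example.

```python
def tryptic_peptides(protein, min_length=6):
    """In silico digest: cleave after K or R unless the next residue is P."""
    peptides, start = set(), 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.add(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.add(protein[start:])  # C-terminal peptide without K/R
    return {p for p in peptides if len(p) >= min_length}

def target_specific(target_proteins, background_proteins, min_length=6):
    """Keep target peptides that are absent from every background protein."""
    target = set().union(*(tryptic_peptides(p, min_length) for p in target_proteins))
    background = set().union(*(tryptic_peptides(p, min_length) for p in background_proteins))
    return target - background

# Hypothetical toy sequences, not real viral proteins.
target = ["MKTAYIAKQRLLEVGSPDR"]
background = ["MKTAYIAKGGGGGGK"]

print(sorted(target_specific(target, background)))
```

Exact matching is the strictest possible filter; a tool that tolerates near-identical homologous peptides needs a similarity threshold instead.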
Phylogenetic and phylodynamic inference
Bayesian Evolutionary Analysis by Sampling Trees (BEAST) is a primary tool for Bayesian phylogenetic and phylodynamic inference from genetic sequence data. BEAST unifies molecular phylogenetic reconstruction with complex discrete and continuous trait evolution, divergence-time dating, and coalescent demographic models in an efficient statistical inference engine using Markov chain Monte Carlo integration. BEAST 1.10 focusses on delivering accurate and informative insights for infectious disease research through the integration of diverse data sources, including phenotypic and epidemiological information, with molecular evolutionary models.
Some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data in a single joint model is very challenging both conceptually and computationally. BEAST 2 is a computational software platform that allows robust development of compatible (sub-)models which can be composed into a full model hierarchy.
A method for comparing phylogeographies across different trees inferred from the same taxa. Phylogeographic methods reconstruct the origin and spread of taxa by inferring locations for internal nodes of the phylogenetic tree from the sampling locations of genetic sequences. This is commonly applied to study pathogen outbreaks and spread.
SANTA-SIM is a software package to simulate the evolution of a population of gene sequences forwards through time. It models the underlying biological processes as discrete components: replication, recombination, point mutations, insertion-deletions, and selection under various fitness models and population size dynamics. The software is designed to be intuitive to work with for a wide range of users and executable in a cross-platform manner.
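What simulating a population forwards through time means can be shown with a minimal Wright-Fisher-style sketch: in each generation, parents are sampled in proportion to their fitness and their offspring receive point mutations. This is not SANTA-SIM's implementation or configuration format; it leaves out recombination, insertion-deletions and changing population sizes, and every name and parameter value is an assumption chosen for illustration.

```python
import random

def mutate(seq, rate, rng):
    """Apply independent point mutations to each site with probability `rate`."""
    bases = list(seq)
    for i, base in enumerate(bases):
        if rng.random() < rate:
            bases[i] = rng.choice([b for b in "ACGT" if b != base])
    return "".join(bases)

def simulate(pop_size, seq_length, generations, mutation_rate, fitness, seed=0):
    """Evolve a constant-size population forwards in discrete generations."""
    rng = random.Random(seed)
    population = ["A" * seq_length] * pop_size
    for _ in range(generations):
        # Selection: parents are drawn with probability proportional to fitness.
        weights = [fitness(seq) for seq in population]
        parents = rng.choices(population, weights=weights, k=pop_size)
        # Replication with point mutation.
        population = [mutate(parent, mutation_rate, rng) for parent in parents]
    return population

def toy_fitness(seq):
    """Hypothetical fitness model: a G at the first site is advantageous."""
    return 1.5 if seq[0] == "G" else 1.0

final = simulate(pop_size=50, seq_length=10, generations=20,
                 mutation_rate=0.01, fitness=toy_fitness, seed=1)
print(sum(seq[0] == "G" for seq in final), "of", len(final), "carry the favoured allele")
```

Fixing the seed makes a run reproducible, which is convenient when simulated data are used to benchmark inference methods.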
Sweep Dynamics (SD) plots is a computational method combining phylogenetic algorithms with statistical techniques to characterize the molecular adaptation of rapidly evolving viruses from longitudinal sequence data. SD plots facilitate the identification of selective sweeps, the time periods in which these occurred and associated changes providing a selective advantage to the virus.
vConTACT is a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions.
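A gene-sharing profile can be thought of as the set of protein clusters found in each genome, with genomes linked whenever their profiles overlap strongly. The sketch below builds such links using the Jaccard index, a deliberately simple stand-in for the statistical scoring and clustering that vConTACT itself performs; the genome names, cluster identifiers and threshold are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Shared fraction of two protein-cluster sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def gene_sharing_edges(profiles, min_similarity=0.5):
    """Return (genome_a, genome_b, similarity) for sufficiently similar pairs."""
    edges = []
    for name_a, name_b in combinations(sorted(profiles), 2):
        similarity = jaccard(profiles[name_a], profiles[name_b])
        if similarity >= min_similarity:
            edges.append((name_a, name_b, similarity))
    return edges

# Toy profiles: each genome maps to the protein clusters (PCs) it encodes.
profiles = {
    "phage_1": {"PC1", "PC2", "PC3"},
    "phage_2": {"PC1", "PC2", "PC4"},
    "phage_3": {"PC9"},
}

print(gene_sharing_edges(profiles))
```

The resulting edge list defines a network in which groups of densely connected genomes are candidates for shared taxonomic units.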
VirSorter is a web-based tool designed to detect viral signal in different types of microbial sequence data in both a reference-dependent and reference-independent manner, leveraging probabilistic models and extensive virome data to maximize detection of novel viruses. VirSorter outperforms existing tools in predicting viral sequences outside of a host genome (i.e., from extrachromosomal prophages, lytic infections, or partially assembled prophages) and for fragmented genomic and metagenomic datasets. Because VirSorter scales to large datasets, it can also be used in “reverse” to more confidently identify viral sequence in viral metagenomes by sorting away cellular DNA, whether derived from gene transfer agents, generalized transduction or contamination.
Please also have a look at the following publication for assessing metagenomic assemblers: