To keep you up to date with the latest developments in virus bioinformatics, especially new tools that might help you in your research, the European Virus Bioinformatics Center is organising a monthly lecture series entitled "viruses in silico".
The lecture will take place online as a Zoom meeting. Participation is free. Use the registration form to receive the login details. Registration closes one hour before the start of the event.
CIAlign, a tool to clean, interpret and visualise multiple sequence alignments, and its application to virus discovery.
26 April 2021 | 2-3 pm CEST
Dr. Katy Brown, University of Cambridge, UK
Many applications of multiple sequence alignments (MSAs) involve working with sequences that are not ideal. Sequences can be incomplete, contain errors or be highly divergent with many mismatches. This is very common when working with high-throughput sequencing data, for example in alignments based on de novo assembled transcripts, long-read sequencing data or mixed metagenomic datasets. MSAs based on these sequences often contain many gaps and areas of low-quality alignment. It’s still common to manually edit MSAs before performing further analysis, but this method is time-consuming and not easily reproducible.
In the first part of this seminar I will discuss a new command-line tool, CIAlign, which we have developed to automatically solve some of the most common problems with MSAs. CIAlign targets four common features of complex MSAs: low-quality or incomplete ends of sequences leading to gaps and mismatches, insertions in a minority of sequences dominating the alignment, unexpectedly divergent sequences, and very short sequences. It also provides a new type of alignment visualisation, which shows the whole alignment in a single, publication-ready image. I will then discuss how I use CIAlign on a day-to-day basis, as part of a computational pipeline developed to identify, classify and characterise novel RNA viruses and cross-species transmissions in honey bees and the other arthropods they interact with, such as native pollinators, ants and mites.
Genomic surveillance of SARS-CoV-2 at the Robert Koch Institute: an overview and bioinformatics edge cases
17 May 2021 | 4-5 pm CEST
Dr. Martin Hölzer, Robert Koch Institute, Germany
It has been more than a year since the WHO declared a pandemic on March 11, 2020, as a novel coronavirus, SARS-CoV-2, spread worldwide. In response to the pandemic, a significantly increased sequencing effort is currently underway to track ongoing viral evolution and spread and to monitor mutations in the virus genome. Genomic sequences of the virus and their structured and continuous analysis form the basis for these critical molecular investigations.
In this talk, I will present the ongoing genomic surveillance of SARS-CoV-2 at the Robert Koch Institute (RKI), Germany’s national public health institute. In mid-January, the Federal Ministry of Health issued a decree requiring laboratories to submit reconstructed SARS-CoV-2 genomes to the RKI to improve genomic surveillance. Since then, the number of submitted sequences has increased tremendously. Reconstructed genomes of virus-positive samples sequenced directly at the RKI further enrich this set. I will discuss the various bioinformatics tasks and challenges involved in the daily reconstruction, quality control, annotation, profiling, and clustering of these sequences. In addition, I will discuss important bioinformatics edge cases we have discovered that, if not addressed appropriately, may even lead to misclassification of variants or virus lineages of concern.