To keep you up to date with the latest developments in virus bioinformatics, especially new tools that might help you in your research, the European Virus Bioinformatics Center is organising a monthly lecture series entitled "Viruses in silico".
The lecture will take place online as a Zoom Meeting. Participation is free. Use the registration form to receive the login details. Registration is possible until 1 hour before the start of the event.
Viral reference database as a critical factor for clinical metagenomics: a review using Virosaurus as an example.
14 June 2021 | 4-5 pm CEST
Dr. Philippe Le Mercier, SIB Swiss Institute of Bioinformatics, Switzerland
High-Throughput Sequencing (HTS) technology can detect all genetic entities in a clinical sample. A key element in identifying pathogens in HTS output is the reference database. However, building a comprehensive viral database is challenging: there are 1,561 known vertebrate viral species. Moreover, sequence conservation within a viral species can be as low as 50%. Whereas a single human reference genome is sufficient to identify all human reads, hundreds of rotavirus references are needed to identify all circulating viruses of that species. Therefore, a viral reference database must cover all viral species and their inherent variability to be effective for the analysis of clinical samples.
In this talk, I will present the challenges and pitfalls of viral genomics and explain why representing viral diversity in a reference database is complex. More than 10 datasets are available, each developed from a different viewpoint, ranging from simple queries on GenBank to datasets processed using all public sequences. I will discuss the process used to generate Virosaurus, a database representing all publicly available vertebrate, plant, and fungal viral genomes, as well as the particular case of segmented viruses.