From 31 July to 2 August 2023, the ICTV, EVBC, and NFDI4Microbiota co-organised a Workshop on Automating Virus Taxonomy in Jena. There were almost 400 registrations from 46 countries on every continent except Antarctica, highlighting the huge interest in this topic from a global community of virus taxonomy enthusiasts. The current virus taxonomy is characterised by a patchwork of methods that capture the features of different viral lineages. However, these methods are not reproducibly implemented, making it difficult to classify unknown viral sequences, such as those derived from metagenomics data. On the first two days of the Jena Workshop, virology experts explained how they classified divergent types of viruses, and bioinformaticians presented methods to cluster and classify viral sequences. On the third day, a group of enthusiasts rolled up their sleeves during the Hands-On Workshop, which consisted of a Snakemake tutorial and the creation of pipelines to automate classification.
Results of the hackathon
We are pleased to share the results of the hackathon: two Snakemake workflows for Caudoviricetes classification that follow best coding practices! Take a peek at the mashup-phage and download-and-reannotation-of-phage-genomes workflows on the GitHub repository of the ICTV Virus Bioinformatics Enthusiasts Group. The Snakemake material and slides are available here and here. If you are interested in viewing the slides and/or recordings of other presentations, please contact the organizers.
|When:||31 July — 2 August 2023|
|Live:||Rosensäle, Fürstengraben, Jena, Germany|
|Online:||Hybrid attendance via Zoom|
|Cost:||Participation is free of charge|
There is a consensus that viruses are so diverse that no single taxonomic method can be used to classify them all (Simmonds et al. PLoS Biology 2023). Since its inception, the ICTV has been seeking the expertise of the global virology community to classify viruses in accordance with their domain-specific knowledge. This has generated a patchwork of methods that, ideally, capture the features of different viral lineages and generate meaningful taxa that are in agreement with biology. These methods are formalised in taxonomy proposals (taxoprops) written by experts and ratified by the ICTV. These documents describe how viruses within each taxon shall be classified, and include specific demarcation criteria. They are available as Word documents on the ICTV website.
As metagenomics is rapidly expanding our view of the virosphere, we are looking to make sense of the sequences we discover. The number and diversity of sequences found in viromics datasets is staggering and makes their taxonomic classification a daunting task that one would ideally automate. There is currently no ICTV-approved method for doing so. While the solution will likely not be trivial, we have to face this challenge to keep up with the growing number of viruses that we aim to classify.
The demarcation criteria are quite diverse and not encoded in a machine-readable way. We envision a future where the demarcation criteria for all taxoprops are implemented in reproducible computational pipelines, allowing viral sequences to be readily classified into taxa at all ranks. Virology experts, including ICTV Study Group members, use specialised methods for the taxonomic classification of their viruses. However, these methods might not be readily reproducible by others. At the same time, bioinformaticians develop automated tools that can be readily installed and run, but that might not classify all viruses consistently with the virology experts. Through this ICTV/EVBC Workshop on Automating Virus Taxonomy, we propose to reinforce the links between these two areas of expertise, and to explore ideas for encoding the demarcation criteria in a reproducible way so that they can be applied at a large scale.
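To make the idea of machine-readable demarcation criteria more concrete, here is a minimal Python sketch. It is not an ICTV method or the content of any taxoprop: the `DemarcationCriterion` class, the `shared_ranks` function, the metric name, and the threshold values are all hypothetical and chosen only for illustration. It assumes a single pairwise identity value (in percent) has already been computed by some external tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemarcationCriterion:
    """One hypothetical machine-readable demarcation rule for a taxon rank."""

    rank: str            # e.g. "species" or "genus"
    metric: str          # name of the similarity measure the rule refers to
    min_identity: float  # threshold in percent; at or above means "same taxon"


# Illustrative values only (assumption), not taken from any ratified taxoprop.
CRITERIA = [
    DemarcationCriterion("species", "genome_nucleotide_identity", 95.0),
    DemarcationCriterion("genus", "genome_nucleotide_identity", 70.0),
]


def shared_ranks(identity: float, criteria=CRITERIA) -> list[str]:
    """Return the ranks at which two genomes with this identity would be grouped."""
    return [c.rank for c in criteria if identity >= c.min_identity]


if __name__ == "__main__":
    # Three made-up pairwise identity values, in percent.
    for value in (97.2, 81.0, 42.5):
        print(value, shared_ranks(value))
```

Keeping the rules as plain data rather than hard-coding them in a script means the same classification step could, in principle, be reused across taxa by swapping the criteria list, for example from within a Snakemake rule. Real demarcation criteria are often more varied than a single identity threshold, so this only covers the simplest case.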
The workshop will be hybrid and we hope to see many of you either live or online!
On the first day, a keynote lecture by Prof. Alexander E. Gorbalenya (Leiden University, the Netherlands) will get us all on the same page. He will explain why we need taxonomy, sketch the history of ideas that have driven virus taxonomy and the ICTV, and highlight our current challenges. After that, we will invite several virology experts to explain the ins and outs of classifying specific viruses. On the second day, we will invite bioinformaticians to explain how they have implemented automated classification methods. We aim to bring both groups together for discussion sessions with the goal of starting or deepening collaborations between participants. This will help us sketch guidelines towards a synthesis of methods to address this challenge.
In addition to this program, we also invite you to join satellite hands-on sessions (2 August 2023) that will focus on automating bioinformatic workflows (Snakemake) and virus databases (organized by NFDI4Microbiota).
Monday, 31 July — Taxonomy experts
|13:00||Keynote: Toward defining goals and meeting challenges of virus taxonomy by using bioinformatics
|14:00||From virus to taxonomy to publication: ICTV data
|14:30||From plant viruses to viruses of microbes: A long and winding taxonomic road
|15:00||Geminiviridae: Advances and challenges in the classification of one of the most diverse families of viruses
|16:00||Comparative Genomics shows Filamentous dsDNA Viruses associated with Parasitoid Wasps form a Novel Family among Naldaviricetes
|16:30||Classification of bacterial viruses in the class Caudoviricetes
|17:00||The Parvoviridae: a ssDNA virus family united in diversity
|17:30||Closing discussion: When/where is automation desired/feasible?|
Tuesday, 01 August — Bioinformatics experts
|09:00||GRAViTy: virus identification and classification framework based on the analysis of whole virus genomes
|09:30||The principles of taxonomic classification and the VICTOR approach for virus classification
|10:00||Protein-based hierarchical clustering of (bacterial and archaeal) viruses and the need for parameter standardization
|11:00||Automatic delineation of viral species and higher ranks
|11:30||You can move, but you can’t hide: identification of mobile genetic elements with geNomad
|12:00||vConTACT3: A centralized and automated platform to systematically and hierarchically classify DNA viruses
|14:00||Viral classification at different taxonomic levels using profile HMMs
|14:30||Update on the ICTV Taxonomy Challenge
Cédric Lood/Bas E. Dutilh
|15:00||Closing discussion: What are the possibilities and bottlenecks of automating virus taxonomy?|
Wednesday, 02 August — Workshops
David Lähnemann, Noriko Cassman, Hamdiye Uzuner, Shahram Saghaei, Judith Stecklina, Cédric Lood
|09:00 – 09:30||Introduction to the workshop setup and Snakemake|
|09:30 – 11:00||Beginner: Basic tutorial
Intermediate: Advanced tutorial
Advanced: Workflow hackathon
|11:00 – 12:00||Virus database overview|
|12:00 – 13:00||Lunch break|
|13:00 – 13:30||Overview of hackathon work packages + divide into groups|
|13:30 – 17:00||Beginner: Work through advanced tutorial
Intermediate & Advanced: Workflow hackathon
|17:00 – 18:00||Wrap-up
Brief update from each workflow group
Organizing committee and contacts
This workshop is co-organized by members of the ICTV, EVBC and NFDI4Microbiota. Please feel free to contact us with any questions: