Biography

At the DOE Joint Genome Institute, I lead the Viral Genomics group where we explore viruses of microbes and their impacts on ecosystems using (mostly) fancy ‘omics tools. Our current projects include the study of viral diversity and virus:host interactions in soil and freshwater environments, along with the development of new bioinformatics tools and experimental protocols to probe and characterize uncultivated viruses. We also assist users of the JGI Metagenome Program with their analysis, including identification of viral sequences, functional annotation, taxonomic classification, etc.

The long-term goal of my research is to understand the ecological and evolutionary drivers of virus:host dynamics in natural microbial communities. This research involves a mix of experimental and computational approaches spanning from the molecular to the ecosystem scale, and addresses fundamental questions such as “how do viruses spread and adapt across environments?”, “how do viruses take over and reprogram microbial cells?”, and “how do viral infections alter ecosystem processes?”.

Interests

  • Microbial/Viral Ecology
  • (Viral) Metagenomics
  • Virus Evolution

Education

  • PhD in Microbial Ecology, 2013

    Université Blaise Pascal, Clermont-Ferrand II (now Université Clermont Auvergne)

  • MSc in Data Analysis and Modeling for Life Sciences, 2010

    Université Blaise Pascal, Clermont-Ferrand II (now Université Clermont Auvergne)

  • BSc in Microbiology, 2008

    Université Blaise Pascal, Clermont-Ferrand II (now Université Clermont Auvergne)

Current Projects

Picture of East River hillslope and Green Butte biocrust - Virus-driven alterations of microbial metabolism in soil
We plan to study viral diversity and virus:host interactions in two model systems (East River hillslopes and Green Butte biocrusts) from the single-cell to the ecosystem level. Project goals include (i) characterizing new mechanisms by which viruses transform microbial cells, (ii) investigating how virus:host interactions are transformed by changes in local environmental conditions, and (iii) developing innovative methods to measure viral presence and activity in natural soil systems. This 5-year project is funded by the DOE Early Career Research Program, in collaboration with the Brodie Lab and the Northen Lab. More information in our latest pre-print - https://doi.org/10.1101/2023.03.06.531389 (Pictures courtesy of Tamy Swenson & Brodie Lab)

Picture of Mushroom Spring in YNP - Virus:host dynamics in Yellowstone National Park biofilms
We study virus:host dynamics across diel cycles in Octopus and Mushroom springs based on coupled metagenomics, metatranscriptomics, and viral metagenomics, to better understand phage infection triggers and synchronization in natural communities. In collaboration with the Bhaya Lab. (Picture: USGS / Thomas Brock)

Logo of the iPHoP tool - Defining the healthy human virome
The JGI Viral Genomics group is involved in the new NIH-funded Human Virome Program https://commonfund.nih.gov/humanvirome. We will help develop new tools and resources for viral taxonomy databases, host prediction, and viral genome annotation.

MVP pipeline logo - Establishing the foundations of a high-throughput phage foundry
We analyze and model phage diversity and phage:host interactions to better understand how microbiomes can be altered and manipulated through the addition of (engineered) phages. Project led by Vivek Mutalik.

Illustration of viral capsids - IMG/VR - Large-scale exploration of uncultivated viral diversity
We routinely mine public genomes, metagenomes, and metatranscriptomes for new viral sequences to progressively build a large and comprehensive genomic catalog of the virosphere. IMG/VR v4 is now released! More info in the Tools section. (Viral capsids drawing from Leah Pantea / http://leahpantea.com)

Tools

Teaching and other Online Resources

  • Viromics workshop: The (somewhat) annual workshop dedicated to viromics analysis including viral genome assembly, identification, annotation, curation, and taxonomic classification. Hosted at Ohio State University - https://u.osu.edu/viruslab/viromics-workshop/
  • MGM Workshop: Bi-annual highly hands-on workshop designed to familiarize users with the Integrated Microbial Genomes & Microbiomes (IMG/M) data and workflows for computational analysis and interpretation of sequence data, including IMG/VR - https://mgm.jgi.doe.gov
  • VERVE Net: Collection of news, protocols, and online discussion for viral ecologists - https://www.protocols.io/groups/verve-net

Selected media coverage

JGI Podcast - Genome Insider
Nature News - Amy Maxmen - 19 March 2018
Wired - Shara Tonn - 03 Sept 2015
Nature Biotechnology - Charles Schmidt - 01 Oct 2018
Communications of the ACM - Chris Edwards - Dec 2018

Recent Publications

A functional microbiome catalogue crowdsourced from North American rivers

Predicting elemental cycles and maintaining water quality under increasing anthropogenic influence requires knowledge of the spatial drivers of river microbiomes. However, understanding of the core microbial processes governing river biogeochemistry is hindered by a lack of genome-resolved functional insights and sampling across multiple rivers. Here we used a community science effort to accelerate the sampling, sequencing and genome-resolved analyses of river microbiomes to create the Genome Resolved Open Watersheds database (GROWdb). Building on the previously conceived River Continuum Concept, we layer on microbial functional trait expression, which suggests that the structure and function of river microbiomes is predictable. We make GROWdb available through various collaborative cyberinfrastructures, so that it can be widely accessed across disciplines for watershed predictive modelling and microbiome-based management practices.

Tapping the treasure trove of atypical phages

With advancements in genomics technologies, a vast diversity of ‘atypical’ phages, that is, with single-stranded DNA or RNA genomes, are being uncovered from different ecosystems. In this review, we call for the development of generalizable experimental methods to better capture this understudied viral diversity via isolation and study them through gene-level characterization and engineering. Establishing a diverse set of new ‘atypical’ phage model systems has the potential to provide many new biotechnologies, including potential uses of these atypical phages in halting the spread of antibiotic resistance and engineering of microbial communities for beneficial outcomes.

MVP: a modular viromics pipeline to identify, filter, cluster, annotate, and bin viruses from metagenomes.

While numerous computational frameworks and workflows are available for recovering prokaryote and eukaryote genomes from metagenome data, only a limited number of pipelines are designed specifically for viromics analysis. Here, we describe Modular Viromics Pipeline (MVP) v.1.0, a user-friendly pipeline written in Python and providing a simple framework to perform standard viromics analyses. Overall, MVP provides a standardized and reproducible pipeline for both extensive and robust characterization of viruses from large-scale sequencing data including metagenomes, metatranscriptomes, viromes, and isolate genomes. MVP is available at https://gitlab.com/ccoclet/mvp and as versioned packages in PyPi and Conda.

Unraveling the functional dark matter through global metagenomics

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.

Viruses interact with hosts that span distantly related microbial domains in dense hydrothermal mats

Many microbes in nature reside in dense, metabolically interdependent communities. We investigated the nature and extent of microbe-virus interactions in relation to microbial density and syntrophy by examining microbe-virus interactions in a biomass dense, deep-sea hydrothermal mat. Using metagenomic sequencing, we find numerous instances where phylogenetically distant (up to domain level) microbes encode CRISPR-based immunity against the same viruses in the mat. Evidence of viral interactions with hosts cross-cutting microbial domains is particularly striking between known syntrophic partners, for example those engaged in anaerobic methanotrophy. We propose that the entry of viral particles and/or DNA to non-primary host cells may be a common phenomenon in densely populated ecosystems, with eco-evolutionary implications for syntrophic microbes and CRISPR-mediated inter-population augmentation of resilience against viruses.

You can move, but you can’t hide: identification of mobile genetic elements with geNomad

Identifying and characterizing mobile genetic elements (MGEs) in sequencing data is essential for understanding their diversity, ecology, biotechnological applications, and impact on public health. Here, we introduce geNomad, a classification and annotation framework that combines information from gene content and a deep neural network to identify sequences of plasmids and viruses. In benchmarks that included diverse MGE and chromosome sequences, geNomad significantly outperformed other tools in all evaluated clades of plasmids and viruses. Leveraging geNomad’s speed and scalability, we were able to process public metagenomes and metatranscriptomes, leading to the discovery of millions of new viruses and plasmids that are available through the IMG/VR and IMG/PR databases. We anticipate that geNomad will enable further advancements in MGE research, and it is available at https://portal.nersc.gov/genomad.

Virus diversity and activity is driven by snowmelt and host dynamics in a high-altitude watershed soil ecosystem

Viruses, including phages, impact nearly all organisms on Earth, including microbial communities and their associated biogeochemical processes. Here, we investigated the diversity and activity of environmental DNA and RNA viruses, including phages, across dynamic seasonal changes in a snow-dominated mountainous watershed by examining paired metagenomes and metatranscriptomes. Taken together, these results suggest that the high diversity of viruses in soils is likely associated with a broad range of host interaction types each adapted to specific host ecological strategies and environmental conditions. Moving forward, integrating these viral impacts in complex natural microbiome models will be key to accurately predict ecosystem biogeochemistry.

IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad of ways by which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.

Expansion of the global RNA virome reveals diverse clades of bacteriophages

High-throughput RNA sequencing offers broad opportunities to explore the Earth RNA virome. Mining 5,150 diverse metatranscriptomes uncovered >2.5 million RNA virus contigs. Analysis of >330,000 RNA-dependent RNA polymerases (RdRPs) shows that this expansion corresponds to a 5-fold increase of the known RNA virus diversity. Gene content analysis revealed multiple protein domains previously not found in RNA viruses and implicated in virus-host interactions. Extended RdRP phylogeny supports the monophyly of the five established phyla and reveals two putative additional bacteriophage phyla and numerous putative additional classes and orders. The dramatically expanded phylum Lenarviricota, consisting of bacterial and related eukaryotic viruses, now accounts for a third of the RNA virome. Identification of CRISPR spacer matches and bacteriolytic proteins suggests that subsets of picobirnaviruses and partitiviruses, previously associated with eukaryotes, infect prokaryotic hosts.

iPHoP: an integrated machine-learning framework to maximize host prediction for metagenome-assembled virus genomes

The extraordinary diversity of viruses infecting bacteria and archaea is now primarily studied through metagenomics. While metagenomes enable high-throughput exploration of the viral sequence space, metagenome-derived genomes lack key information compared to isolated viruses, in particular host association. Different computational approaches are available to predict the host(s) of uncultivated viruses based on their genome sequences, but thus far individual approaches are limited either in precision or in recall, i.e. for a number of viruses they yield erroneous predictions or no prediction at all. Here we describe iPHoP, a two-step framework that integrates multiple methods to provide host predictions for a broad range of viruses while retaining a low (<10%) false-discovery rate. Based on a large database of metagenome-derived virus genomes, we illustrate how iPHoP can provide extensive host prediction and guide further characterization of uncultivated viruses. iPHoP is available at https://bitbucket.org/srouxjgi/iphop, through a Bioconda recipe, and a Docker container.

Ecology and molecular targets of hypermutation in the global microbiome

Changes in the sequence of an organism’s genome, i.e., mutations, are the raw material of evolution. The frequency and location of mutations can be constrained by specific molecular mechanisms, such as diversity-generating retroelements (DGRs). DGRs have been characterized from cultivated bacteria and bacteriophages, and perform error-prone reverse transcription leading to mutations being introduced in specific target genes. DGR loci were also identified in several metagenomes, but the ecological roles and evolutionary drivers of these DGRs remain poorly understood. Here, we analyze a dataset of >30,000 DGRs from public metagenomes, establish six major lineages of DGRs including three primarily encoded by phages and seemingly used to diversify host attachment proteins, and demonstrate that DGRs are broadly active and responsible for >10% of all amino acid changes in some organisms. Overall, these results highlight the constraints under which DGRs evolve, and elucidate several distinct roles these elements play in natural communities.

Host population diversity as a driver of viral infection cycle in wild populations of green sulfur bacteria with long standing virus-host interactions

Viral infections of bacterial hosts range from highly lytic to lysogenic, where highly lytic viruses undergo viral replication and immediately lyse their hosts, and lysogenic viruses have a latency period before replication and host lysis. While both types of infections are routinely observed in the environment, the ecological and evolutionary processes that regulate these different viral dynamics are still not well understood. In this study, we identify and characterize the long-term dynamics of uncultivated viruses infecting green sulfur bacteria (GSB) in a model freshwater lake sampled from 2005-2018. Overall, our data suggest that single GSB populations are typically infected by multiple viruses at the same time, that lytic and lysogenic viruses can readily co-infect the same host population in the same ecosystem, and that host strain-level diversity might be an important factor controlling the lytic/lysogeny switch.

Cryptic inoviruses revealed as pervasive in bacteria and archaea across Earth’s biomes

Bacteriophages from the Inoviridae family (inoviruses) are characterized by their unique morphology, genome content and infection cycle. To date, a relatively small number of inovirus isolates have been extensively studied. Here, we show that the current 56 members of the Inoviridae family represent a minute fraction of a highly diverse group of inoviruses. Capturing this previously obscured component of the global virosphere may spark new avenues for microbial manipulation approaches and innovative biotechnological applications.

Minimum Information about an Uncultivated Virus Genome (MIUViG)

We present an extension of the Minimum Information about any (x) Sequence (MIxS) standard for reporting sequences of uncultivated virus genomes. Minimum Information about an Uncultivated Virus Genome (MIUViG) standards were developed within the Genomic Standards Consortium framework and include virus origin, genome quality, genome annotation, taxonomic classification, biogeographic distribution and in silico host prediction. Community-wide adoption of MIUViG standards should enable more robust comparative studies and a systematic exploration of the global virosphere.

Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses

Ocean microbes drive biogeochemical cycling on a global scale. However, this cycling is constrained by viruses that affect community composition, metabolic activity, and evolutionary trajectories. Here we assemble complete genomes and large genomic fragments from both surface- and deep-ocean viruses sampled during the Tara Oceans and Malaspina research expeditions, and analyse the resulting ‘global ocean virome’ dataset to present a global map of abundant, double-stranded DNA viruses complete with genomic and ecological contexts.

Contact