At the DOE Joint Genome Institute, I lead the Viral Genomics group where we explore viruses of microbes and their impacts on ecosystems using (mostly) fancy ‘omics tools. Our current projects include the study of viral diversity and virus:host interactions in soil and freshwater environments, along with the development of new bioinformatics tools and experimental protocols to probe and characterize uncultivated viruses. We also assist users of the JGI Metagenome Program with their analysis, including identification of viral sequences, functional annotation, taxonomic classification, etc.

The long-term goal of my research is to understand the ecological and evolutionary drivers of virus:host dynamics in natural microbial communities. This research involves a mix of experimental and computational approaches spanning from the molecular to the ecosystem scale, and addresses fundamental questions like “how do viruses spread and adapt across environments?”, “how do viruses take over and reprogram microbial cells?”, and “how do viral infections alter ecosystem processes?”.


  • Microbial/Viral Ecology
  • (Viral) Metagenomics
  • Virus Evolution


  • PhD in Microbial Ecology, 2013

    Université Blaise Pascal, Clermont-Ferrand II (now Université Clermont Auvergne)

  • MSc in Data Analysis and Modeling for Life Sciences, 2010

    Université Blaise Pascal, Clermont-Ferrand II (now Université Clermont Auvergne)

  • BSc in Microbiology, 2008

    Université Blaise Pascal, Clermont-Ferrand II (now Université Clermont Auvergne)

Current Projects

Picture of East River hillslope and Green Butte biocrust - Virus-driven alterations of microbial metabolism in soil
We plan to study viral diversity and virus:host interactions in two model systems (East River hillslopes and Green Butte biocrusts) from the single-cell to the ecosystem level. Project goals include (i) characterizing new mechanisms by which viruses transform microbial cells, (ii) investigating how virus:host interactions are transformed by changes in local environmental conditions, and (iii) developing innovative methods to measure viral presence and activity in natural soil systems. 5-year project funded by the DOE Early Career Research Program, in collaboration with the Brodie Lab and the Northen Lab. (Pictures courtesy of Tamy Swenson & Brodie Lab)

Picture of Trout Bog Lake - Virus:host interactions in model freshwater lakes
We study virus:host interactions in Trout Bog Lake using targeted metagenomics coupled with time-series samples spanning a decade, to better understand long-term dynamics between key viruses and bacteria in this ecosystem. In collaboration with the McMahon Lab. Now published! (10.1038/s41396-020-00870-1) (Image courtesy of Trina McMahon)

Illustration of viral capsids - IMG/VR - Large-scale exploration of uncultivated viral diversity
We routinely mine public genomes, metagenomes, and metatranscriptomes for new viral sequences to progressively build a large and comprehensive genomic catalog of the virosphere. See also the tools section (Viral capsids drawing from Leah Pantea /

iVirus logo - iVirus - Towards a user-friendly viral ecogenomics toolkit
Developing tools to identify, clean, compare, and annotate uncultivated viral genomes (mostly) assembled from metagenomes. In collaboration with the Sullivan Lab and Wrighton Lab. See also the tools section


Teaching and other Online Resources

  • Viromics workshop The (somewhat) annual workshop dedicated to viromics analysis including viral genome assembly, identification, annotation, curation, and taxonomic classification. Hosted at Ohio State University -
  • MGM Workshop Bi-annual highly hands-on workshop designed to familiarize users with the Integrated Microbial Genomes & Microbiomes (IMG/M) data and workflows for computational analysis and interpretation of sequence data, including IMG/VR.
  • VERVE Net Collection of news, protocols, and online discussion for viral ecologists -

Selected media coverage

JGI Podcast - Genome Insider
Nature News - Amy Maxmen - 19 March 2018
Wired - Shara Tonn - 03 Sept 2015
Nature Biotechnology - Charles Schmidt - 01 Oct 2018
Communications of the ACM - Chris Edwards - Dec 2018

Recent Publications

Global overview and major challenges of host prediction methods for uncultivated phages

Bacterial communities play critical roles across all of Earth’s biomes, affecting human health and global ecosystem functioning. They do so under strong constraints exerted by viruses, that is, bacteriophages or ‘phages’. Phages can reshape bacterial communities’ structure, influence long-term evolution of bacterial populations, and alter host cell metabolism during infection. Metagenomics approaches, that is, shotgun sequencing of environmental DNA or RNA, recently enabled large-scale exploration of phage genomic diversity, yielding several millions of phage genomes now to be further analyzed and characterized. One major challenge, however, is the lack of direct host information for these phages. Several methods and tools have been proposed to bioinformatically predict the potential host(s) of uncultivated phages based only on genome sequence information. Here we review these different approaches and highlight their distinct strengths and limitations. We also outline complementary experimental assays which are being proposed to validate and refine these bioinformatic predictions.

Ecology and molecular targets of hypermutation in the global microbiome

Changes in the sequence of an organism’s genome, i.e., mutations, are the raw material of evolution. The frequency and location of mutations can be constrained by specific molecular mechanisms, such as diversity-generating retroelements (DGRs). DGRs have been characterized from cultivated bacteria and bacteriophages, and perform error-prone reverse transcription leading to mutations being introduced in specific target genes. DGR loci were also identified in several metagenomes, but the ecological roles and evolutionary drivers of these DGRs remain poorly understood. Here, we analyze a dataset of >30,000 DGRs from public metagenomes, establish six major lineages of DGRs including three primarily encoded by phages and seemingly used to diversify host attachment proteins, and demonstrate that DGRs are broadly active and responsible for >10% of all amino acid changes in some organisms. Overall, these results highlight the constraints under which DGRs evolve, and elucidate several distinct roles these elements play in natural communities.

Extreme dimensions — how big (or small) can tailed phages be?

This May 2021 Genome Watch highlights the search for unusually large (or small) tailed phages driven by metagenomics.

Host population diversity as a driver of viral infection cycle in wild populations of green sulfur bacteria with long standing virus-host interactions

Viral infections of bacterial hosts range from highly lytic to lysogenic, where highly lytic viruses undergo viral replication and immediately lyse their hosts, and lysogenic viruses have a latency period before replication and host lysis. While both types of infections are routinely observed in the environment, the ecological and evolutionary processes that regulate these different viral dynamics are still not well understood. In this study, we identify and characterize the long-term dynamics of uncultivated viruses infecting green sulfur bacteria (GSB) in a model freshwater lake sampled from 2005 to 2018. Overall, our data suggest that single GSB populations are typically infected by multiple viruses at the same time, that lytic and lysogenic viruses can readily co-infect the same host population in the same ecosystem, and that host strain-level diversity might be an important factor controlling the lytic/lysogeny switch.

Giant virus diversity and host interactions through global metagenomics

Our current knowledge about nucleocytoplasmic large DNA viruses (NCLDVs) is largely derived from viral isolates that are co-cultivated with protists and algae. Here we reconstructed 2,074 NCLDV genomes from sampling sites across the globe by building on the rapidly increasing amount of publicly available metagenome data, leading to an 11-fold increase in phylogenetic diversity and a parallel 10-fold expansion in functional diversity. We anticipate that the global diversity of NCLDVs that we describe here will establish giant viruses—which are associated with most major eukaryotic lineages—as important players in ecosystems across Earth’s biomes.

Cryptic inoviruses revealed as pervasive in bacteria and archaea across Earth’s biomes

Bacteriophages from the Inoviridae family (inoviruses) are characterized by their unique morphology, genome content and infection cycle. To date, a relatively small number of inovirus isolates have been extensively studied. Here, we show that the current 56 members of the Inoviridae family represent a minute fraction of a highly diverse group of inoviruses. Capturing this previously obscured component of the global virosphere may spark new avenues for microbial manipulation approaches and innovative biotechnological applications.

Minimum Information about an Uncultivated Virus Genome (MIUViG)

We present an extension of the Minimum Information about any (x) Sequence (MIxS) standard for reporting sequences of uncultivated virus genomes. Minimum Information about an Uncultivated Virus Genome (MIUViG) standards were developed within the Genomic Standards Consortium framework and include virus origin, genome quality, genome annotation, taxonomic classification, biogeographic distribution and in silico host prediction. Community-wide adoption of MIUViG standards should enable more robust comparative studies and a systematic exploration of the global virosphere.

Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses

Ocean microbes drive biogeochemical cycling on a global scale. However, this cycling is constrained by viruses that affect community composition, metabolic activity, and evolutionary trajectories. Here we assemble complete genomes and large genomic fragments from both surface- and deep-ocean viruses sampled during the Tara Oceans and Malaspina research expeditions, and analyse the resulting ‘global ocean virome’ dataset to present a global map of abundant, double-stranded DNA viruses complete with genomic and ecological contexts.