MIDAS2 Protocol
miriam.goldman, chunyu.zhao
Abstract
The Metagenomic Intra-Species Diversity Analysis System 2 (MIDAS2) is a scalable pipeline for “metagenotyping”: identifying single nucleotide variants (SNVs) and gene copy number variants (CNVs) in metagenomes using comprehensive reference databases built from public microbial genome collections. MIDAS2 is the first metagenotyping tool that lets users control metagenomic read mapping filters and customize the reference database to the microbial community, two features that improve the precision and recall of detected variants. Here we present four basic protocols for the most common use cases of MIDAS2, along with two supporting protocols for installation and advanced use. All the steps of metagenotyping, from raw sequencing reads to population genetic analysis, are demonstrated with example data consisting of two downloadable sequencing libraries of single-end metagenomic reads that represent a mixture of multiple bacterial species. This set of protocols enables users to accurately genotype hundreds of species in thousands of samples, providing rich genetic data for studying the evolution and strain-level ecology of microbial communities.