Metagenomic Next Generation Sequencing: How Does It Work and Is It Coming to Your Clinical Microbiology Lab?

Nov. 4, 2019

Next generation sequencing (NGS) methods started to appear in the literature in the mid-2000s and had a transformative effect on our understanding of microbial genomics and infectious diseases. There is nonetheless considerable controversy over how, when, and where next generation sequencing will play a role in the clinical diagnostic laboratory. An in-depth point-counterpoint in the Journal of Clinical Microbiology examines the challenges and opportunities that may come with the introduction of metagenomic next generation sequencing (mNGS) into routine laboratories. What exactly is mNGS, and how does it differ from the many other nucleic acid technologies out there?

What is Metagenomic Next Generation Sequencing?

Next generation sequencing refers to any of several high-throughput sequencing methods in which billions of nucleic acid fragments can be sequenced simultaneously and independently. Contrast this with classical methods such as Sanger sequencing (also known as dideoxynucleotide chain termination sequencing), which processes one nucleotide sequence per reaction.

To characterize a bacterial genome using NGS, for example, the genome is split into many fragments that yield sequences, or reads, ranging from hundreds to tens of thousands of bases in length. The reads are then assembled into a single genome using computational approaches: several overlapping reads are pieced together to produce a single longer sequence called a contig. Gaps often remain between contigs. High-fidelity long reads would be the ideal way to close them, but platforms that produce shorter reads are generally less costly, and the redundancy provided by overlapping reads improves the accuracy of the assembled sequence. The constructed genome (likely containing gaps) is aligned to a reference database to identify the organism. This technology represents a substantial advance over the early days of sequencing, when a single bacterial genome project could take several years.
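The idea of piecing overlapping reads into contigs can be illustrated with a toy example. The sketch below is a deliberately simplified greedy merge, assuming error-free reads from a single strand; the function names and the minimum overlap of 3 bases are hypothetical choices for illustration, and production assemblers use far more sophisticated graph-based methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that matches a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap.

    Toy assumption: reads are error-free and all come from the same strand.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge; gaps remain between contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads: the first three overlap, the fourth does not.
reads = ["ATGCGT", "CGTACC", "ACCTTG", "GGGAAA"]
print(greedy_assemble(reads))
```

With these made-up reads, the first three merge into one 12-base contig while the fourth stays separate, which mirrors the gaps that typically remain between contigs in a real assembly.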

Metagenomic NGS (mNGS) is simply the sequencing of all nucleic acids in a sample, which may contain mixed populations of microorganisms, and the assignment of the resulting reads to reference genomes to determine which microbes are present and in what proportions. The ability to sequence and identify nucleic acids from multiple different taxa for metagenomic analysis makes this a powerful new platform that can simultaneously identify genetic material from entirely different kingdoms of organisms.

The possible clinical applications are tremendous, including diagnosis of infectious diseases, outbreak tracking, infection control surveillance, and mutation and pathogen discovery, among many others. mNGS, sometimes called shotgun sequencing, has been applied to various clinical sample types, including cerebrospinal fluid, blood, respiratory samples, gastrointestinal fluid, and ocular fluid.

Workflow for metagenomic next-generation sequencing. (1) Genomic DNA is extracted and fragmented. (2) Adaptors are attached for barcoding and sequencing library preparation. (3) The fragments of DNA are simultaneously and independently sequenced. (4) Human-related DNA sequence reads are removed. (5) Contigs of long DNA stretches are assembled from shorter, overlapping sequences. These contigs are aligned to a reference database for taxonomic classification.
Source: Courtesy Rose Lee, generated on
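
The computational half of this workflow (removing human reads, then assigning the remainder to reference genomes and estimating proportions) can be sketched in a few lines. This is a toy illustration only: the reference sequences, labels, and 5-base k-mer size below are invented for the example, and the k-mer voting scheme is one simple stand-in for the alignment or classification tools a real pipeline would use.

```python
from collections import Counter

K = 5  # toy k-mer size (assumption); real classifiers use much longer k-mers


def kmers(seq: str, k: int = K) -> set:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references: dict) -> dict:
    """Map each k-mer to the set of reference labels that contain it."""
    index = {}
    for label, seq in references.items():
        for km in kmers(seq):
            index.setdefault(km, set()).add(label)
    return index


def classify(read: str, index: dict):
    """Return the label sharing the most k-mers with the read, or None."""
    votes = Counter()
    for km in kmers(read):
        for label in index.get(km, ()):
            votes[label] += 1
    ranked = votes.most_common(2)
    if not ranked:
        return None  # no match in the database
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous between two references
    return ranked[0][0]


def profile(reads, references: dict, host_label: str = "human") -> dict:
    """Discard host and unclassified reads, then report read-count proportions."""
    index = build_index(references)
    calls = [classify(r, index) for r in reads]
    counts = Counter(c for c in calls if c not in (None, host_label))
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Hypothetical reference database and reads, invented for illustration.
references = {
    "human": "ACGTACGTTAGCCGATAGGCTA",
    "organism_A": "TTGACCGGTATTCCGGAAGTCA",
    "organism_B": "GGCATCATCGGATCCTTAGAGC",
}
reads = ["ACGTTAGCCG", "CGGTATTCCG", "GACCGGTATT", "ATCGGATCCT", "CCCCCCCCCC"]
print(profile(reads, references))
```

In this made-up sample, one read is discarded as human and one matches nothing in the database, leaving two reads assigned to organism_A and one to organism_B. Note that raw read fractions are not corrected for genome size, so they only approximate the relative abundance of organisms.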

What are the Benefits of Metagenomic Next Generation Sequencing?

The greatest strength of mNGS is that it is an unbiased, hypothesis-free diagnostic method, unlike targeted polymerase chain reaction (PCR) methods, which rely on primers to amplify and detect specific targets. Even universal or broad-range PCR methods are not sufficiently broad to be considered metagenomic, as they use primers that target conserved regions of the 16S ribosomal RNA (rRNA) gene or the internal transcribed spacer (ITS) region to amplify distinctive nucleic acid sequences that can be bioinformatically classified as bacteria/archaea or fungi, respectively.

Universal primers also pose a problem when diagnosing polymicrobial infections with molecular tests. When a polymicrobial sample undergoes conventional (Sanger) 16S sequencing, multiple bases are called at each nucleotide position, producing a mixed chromatogram that cannot be interpreted. While computational deconvolution methods exist to predict which organisms are present, they are not in standard use in many laboratories, which often reflex to next-generation sequencing of the 16S gene for polymicrobial samples.
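
To see why a mixed chromatogram is uninterpretable, consider what happens when two different 16S templates are read at the same time. The sketch below is a simplified model, assuming two equally abundant templates of identical length with no insertions or deletions; the sequences are invented, and the function names are hypothetical. Each position where the templates differ is reported as an IUPAC ambiguity code rather than a single base.

```python
# IUPAC nucleotide codes for single bases and two-base mixtures.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def mixed_trace(seq_a: str, seq_b: str) -> str:
    """Superimpose two aligned sequences, as a mixed base call would appear.

    Toy assumption: equal abundance, equal length, no indels.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return "".join(IUPAC[frozenset((a, b))] for a, b in zip(seq_a, seq_b))


def ambiguous_fraction(trace: str) -> float:
    """Fraction of positions that are not a single unambiguous base."""
    return sum(base not in "ACGT" for base in trace) / len(trace)


# Hypothetical 16S fragments from two organisms in the same sample.
trace = mixed_trace("ACGTAC", "ACTTGC")
print(trace, ambiguous_fraction(trace))
```

Even in this six-base example, a third of the positions become ambiguous; with more organisms, unequal abundances, and length differences between templates, the superimposed signal quickly becomes unreadable.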

What are the Challenges of Metagenomic Next Generation Sequencing?

Despite the potential of mNGS, there are many barriers to clear before the technology can become part of the mainstream laboratory, as well as gaps in our understanding of its diagnostic utility. Major reservations include the interpretation of findings (distinguishing contamination and colonization from true pathogens), selection and validation of the databases used for analyses, and prediction (or lack thereof) of antimicrobial susceptibilities. A common perception is that mNGS is so incredibly sensitive that it will reveal a diagnosis when all other testing is negative. While mNGS may be analytically more sensitive than standard culturing methods in some cases, the necessary removal of vast amounts of human nucleic acid, both during sequencing preparation and computationally during the post-analytic process, can decrease its sensitivity relative to targeted PCR approaches for many organisms.

The specificity of mNGS remains the proverbial elephant in the room. Contamination of samples during specimen collection is a large concern given the increased analytical sensitivity of mNGS in comparison to standard culture methods, and a validated quality-control process needs to be in place for every step, from assessing reagent purity to measuring adequate genome coverage controls. Furthermore, with some Illumina platforms, reads can be assigned the wrong barcode index, leading to false positives in sequencing data. Bioinformatic quality controls are needed to ensure that high-quality, validated genomes are available with minimal database errors. Ideally, bioinformatic personnel would also be available to interpret sequencing results for each test, a resource that most clinical microbiology laboratories lack. The U.S. Food and Drug Administration (FDA) has collaborated with other federal agencies to curate a database entitled FDA-ARGOS (FDA-database for regulatory-grade microbial sequences), which has been useful in ensuring that current mNGS results are reliable and accurate, but these resources need to be updated and maintained.
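
As one illustration of what a contamination control might look like computationally, the sketch below compares each taxon's normalized read count in a patient sample with its count in a no-template (negative) control run alongside it. This is a hypothetical heuristic for illustration, not a validated method: the function names, the 10-fold threshold, the 1 read-per-million floor, and all counts are assumptions.

```python
def reads_per_million(count: int, total_reads: int) -> float:
    """Normalize a taxon's read count to the sequencing depth of the run."""
    return count * 1_000_000 / total_reads


def flag_above_background(sample_counts: dict, sample_total: int,
                          control_counts: dict, control_total: int,
                          min_ratio: float = 10.0, floor_rpm: float = 1.0) -> dict:
    """Return taxa whose sample signal exceeds the negative control by min_ratio.

    min_ratio and floor_rpm are illustrative assumptions, not validated cutoffs.
    floor_rpm avoids dividing by zero when a taxon is absent from the control.
    """
    flagged = {}
    for taxon, count in sample_counts.items():
        sample_rpm = reads_per_million(count, sample_total)
        control_rpm = reads_per_million(control_counts.get(taxon, 0), control_total)
        ratio = sample_rpm / max(control_rpm, floor_rpm)
        if ratio >= min_ratio:
            flagged[taxon] = round(ratio, 1)
    return flagged


# Hypothetical counts: taxon_Y also appears in the negative control.
sample = {"taxon_X": 400, "taxon_Y": 60}
control = {"taxon_Y": 25}
print(flag_above_background(sample, 2_000_000, control, 1_000_000))
```

Here taxon_X stands well above background, while taxon_Y is present at nearly the same normalized level in the negative control and is therefore not flagged, the pattern one would expect from a reagent or environmental contaminant. A check like this addresses analytical specificity only; it cannot say whether a flagged organism is causing disease.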

The greater question concerns the clinical specificity of mNGS: Are the detected sequences from pathogens that are contributing to the patient’s current disease? The analytical specificity of mNGS testing can be addressed with rigorous controls throughout specimen collection, sequencing library preparation, assay run, and bioinformatic classification, but clinical specificity is not directly addressed by these approaches. Questions that can help determine clinical utility and applicability include: How can we distinguish organisms related to transient bacteremia from oral/gastrointestinal flora or skin colonizers in blood/plasma mNGS testing? How should sequencing depth be reported, and how reliable is the relationship of sequence depth to true infection? Does this relationship differ by pathogen/host? How long is the expected detectable half-life of a pathogen by mNGS once the patient is receiving appropriate curative therapy? Studies on clinical utility and cost-effectiveness are greatly needed, despite the indisputable power of this technology from a research and discovery perspective.

It’s also worth pointing out that there are currently no FDA-cleared or approved mNGS tests that can be sent for microbial testing, although there are laboratories certified under the Clinical Laboratory Improvement Amendments of 1988 (CLIA ’88) that offer testing on clinical samples. To date, only a few diagnostic NGS systems have been cleared by the FDA, for oncological testing or detection of cystic fibrosis, for example. A recent review describes in detail many of the regulatory hurdles and considerations that will need to be addressed before mNGS could enter mainstream clinical diagnostic laboratories as an FDA-validated test.

In summary, while mNGS testing will likely play a major role in the microbiological diagnostic workflow in the future, particularly as sequencing and bioinformatic processing power evolves, it remains a high-complexity technology whose clinical utility in our current medical practice environment is uncertain. Although mNGS testing may offer novel and exciting diagnostic opportunities in the near future, it is unlikely to replace an astute clinician anytime soon.

The above represents the views of the author and does not necessarily reflect the opinion of the American Society for Microbiology.

Author: Rose Lee

Rose Lee, MD, is an infectious diseases and microbiology fellow at Beth Israel Deaconess Medical Center and Boston Children's Hospital.  She received her MD from Northwestern Feinberg School of Medicine and is a postdoctoral fellow in the laboratory of James Collins working on novel diagnostic technologies for infectious diseases.