Essentially E. coli: New Study Identifies E. coli Essential Genes
How many genes does it take to build an organism? The answer depends on the organism being studied, of course, and even in the world of microorganisms, gene numbers can vary widely. Discounting viruses, which rely on the molecular machinery of their host cell during replication, the smallest bacterial genome yet reported belongs to Craig Venter’s Syn 3.0, a synthetic bacterium based on Mycoplasma mycoides whose genome has been engineered to contain only 473 essential genes, down from the 901 genes encoded on the original chromosome. This type of study is nearly impossible without first determining which encoded genes are essential, a question that a recent mBio report attempts to answer for the Escherichia coli genome.
Why E. coli? In addition to being a human commensal (and sometimes pathogen), E. coli has been the workhorse of molecular biology. Experiments using E. coli contributed to some of the foundational discoveries of genetics, from transcription and translation to the discovery that DNA is the molecule of inheritance. The beauty of these discoveries was the ubiquity of their applications: “What is true for E. coli is true for the elephant,” as the famous quote from Jacques Monod goes. Understanding which genes are essential for E. coli may lead to further discoveries about the basic requirements for life. Discrepancies among previous reports motivated lead author Emily Goodall and senior author Ian Henderson to try a new method to identify the E. coli essential genes.
A graph of transposon insertion frequency showing most genes, the nonessential ones, with a high frequency of insertion (blue line), and the few essential genes with little or no insertion (red line). Source.
The authors used transposon-directed insertion site sequencing (TraDIS) to insert the Tn5 transposon at random across the complete genome of E. coli K-12. Sequencing the insertion sites helped determine gene essentiality: genes with very few or no insertions were inferred to carry a very high fitness cost if disrupted. Plotting the frequency of insertions along the x-axis, the scientists observed two groups (see figure): one group of genes with frequent insertions (blue line), and a smaller group of genes with few or no insertions (red line). They counted 3,793 genes in the blue, nonessential group and 358 genes in the red, essential group, with 162 genes undetermined for essentiality. Comparing this essential gene list with those from previous reports, the authors found 248 genes identified as essential in all studies.
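The sorting logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' statistical procedure: the gene labels, lengths, insertion counts, and both cutoff values are hypothetical, and the score used here (insertions per base pair) is just one simple way to normalize for gene length.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Gene:
    name: str        # hypothetical label, not a real E. coli gene
    length_bp: int   # gene length in base pairs
    insertions: int  # unique transposon insertion sites mapped to the gene


def insertion_index(gene: Gene) -> float:
    """Insertions per base pair, so long and short genes are comparable."""
    return gene.insertions / gene.length_bp


def classify(gene: Gene,
             essential_max: float = 0.01,
             nonessential_min: float = 0.03) -> str:
    """Sort a gene into one of three groups.

    Both cutoffs are made-up illustrative values (assumptions),
    not thresholds taken from the report.
    """
    index = insertion_index(gene)
    if index <= essential_max:
        return "essential"      # few or no insertions tolerated
    if index >= nonessential_min:
        return "nonessential"   # frequent insertions tolerated
    return "unclear"            # falls between the two groups


# Hypothetical example data
genes = [
    Gene("geneA", 1200, 0),
    Gene("geneB", 900, 95),
    Gene("geneC", 1000, 20),
]

calls = {gene.name: classify(gene) for gene in genes}
tally = Counter(calls.values())

print(calls)
print(tally)
```

A three-way split like this mirrors the reported outcome, in which the nonessential, essential, and undetermined groups (3,793 + 358 + 162) together account for 4,313 genes.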
Is this new genome map the key to finding a minimal genome for E. coli? By combining their data with previous research on E. coli essential genes, the authors have made great strides toward finding the minimal number of required genes, but many questions remain. The mBio report considers only protein-coding regions, leaving a great deal to learn about essential RNAs other than mRNAs. The authors describe finding a number of insertion-free regions within nonessential genes, suggesting that these regions of the chromosome may play an important role outside of their protein-coding function. Further, some insertions were found on only one strand of an open reading frame (ORF), leaving open the question of the importance of the opposite strand.
The authors admit there is work to do to eliminate false-positive and false-negative findings. In addition to comparing their work to that of others, the authors suggest that comparing essential genes under different conditions, such as during growth in liquid medium, will help explain the reported conditional essentiality of some genes found to be nonessential in this report. Nevertheless, the identification of 248 essential genes shared between this and previous studies continues to build an understanding of the metabolic, structural, and reproductive processes that underpin basic life.