CBCB Researchers Develop Tool that Makes Reconstructing Microbial Genomes Easier

Apr 30, 2021

In the seafaring world, a binnacle is a wooden stand placed near the ship’s helm that holds important tools and instruments needed to navigate from one point to the next.

At the University of Maryland, researchers in the Center for Bioinformatics and Computational Biology (CBCB) have developed their own tool—appropriately called Binnacle—that can help scientists navigate the complex world of microbial genomes.

This open-source software was described in a paper, “Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins,” that was recently published in the online journal Frontiers in Microbiology.

The paper was written by Harihara Subrahmaniam Muralidharan (lead author), a third-year computer science doctoral student; Nidhi Shah (co-lead author), who recently defended her doctoral dissertation; Jacquelyn Meisel, an assistant research scientist in the University of Maryland Institute for Advanced Computer Studies (UMIACS); and Mihai Pop, a professor of computer science and the director of UMIACS.

The CBCB team begins with the premise that recent advances in high-throughput sequencing strategies have spurred microbiome research and revealed important insights into the microbial communities that inhabit human, animal and environmental habitats.

In particular, whole metagenomic shotgun sequencing, which allows for a comprehensive analysis of microbial DNA from a small sample, has been instrumental in expanding scientists' understanding of the functional potential and genetic composition of microorganisms that have not previously been cultured.

The challenge, though, is that reconstructing complete genomes of organisms from whole metagenomic shotgun sequencing data is often difficult and time-consuming.

Recovered genomes are often highly fragmented, the researchers say, due to uneven abundances of organisms, repeats within and across genomes, sequencing errors, and strain-level variation. To address the fragmented nature of metagenomic assemblies, scientists rely on a process called binning, which clusters together contigs (contiguous sequences assembled from overlapping DNA reads) that are inferred to originate from the same organism.

Existing binning algorithms use oligonucleotide frequencies and contig abundance (coverage) within and across samples to group together contigs from the same organism. However, these algorithms often miss short contigs and contigs from regions with unusual coverage or DNA composition characteristics.
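
To make the two signals described above concrete, the following Python sketch builds one feature vector per contig from its tetranucleotide (4-mer) frequencies and its coverage. It is an illustration only, not the method of any particular binning tool: the function names and the choice of k=4 are assumptions made for this example, and real binners typically also merge reverse-complement k-mers and use coverage from several samples.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return normalized k-mer frequencies in a fixed A/C/G/T order.

    Hypothetical helper for illustration; k-mers containing other
    characters (such as N) are ignored.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(
        sequence[i:i + k] for i in range(len(sequence) - k + 1)
    )
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


def contig_features(sequence, coverage, k=4):
    """Concatenate composition (k-mer frequencies) and abundance (coverage)."""
    return kmer_frequencies(sequence, k) + [coverage]


# Toy contig with an assumed mean coverage of 12.5x.
features = contig_features("ACGTACGT", coverage=12.5)
print(len(features))  # 4**4 composition values plus 1 coverage value
```

Contigs whose feature vectors lie close together would then be grouped into the same bin. A very short contig yields only a handful of k-mers, so its frequency estimate is noisy, which is one reason such contigs are easy to miss.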

The CBCB researchers propose that information from assembly graphs (used to represent the final assembly of a genome or metagenome) can assist current strategies for metagenomic binning. They use a metagenomic scaffolding tool, called MetaCarvel, to construct assembly graphs in which contigs are nodes and edges are inferred from paired-end reads.
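
The graph described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not MetaCarvel's algorithm, which also handles contig orientation, repeats and variants. Here an edge is added whenever enough read pairs have their two mates on different contigs, and connected groups of contigs stand in for scaffolds. The function names, the input format and the `min_support` threshold are all assumptions made for this example.

```python
from collections import defaultdict


def build_link_graph(read_pairs, min_support=2):
    """Build an undirected contig graph from paired-end read placements.

    read_pairs: iterable of (contig_of_mate1, contig_of_mate2) tuples
    (hypothetical input format). An edge is kept only if at least
    min_support pairs connect the two contigs.
    """
    support = defaultdict(int)
    for a, b in read_pairs:
        if a != b:  # pairs within one contig carry no linking information
            support[tuple(sorted((a, b)))] += 1

    graph = defaultdict(set)
    for (a, b), n in support.items():
        if n >= min_support:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph, nodes):
    """Group contigs that are reachable from one another through edges."""
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph.get(node, ()))
        components.append(component)
    return components


pairs = [("c1", "c2"), ("c2", "c1"), ("c2", "c3"), ("c4", "c4")]
graph = build_link_graph(pairs)
groups = connected_components(graph, ["c1", "c2", "c3", "c4"])
print(groups)
```

In this toy input, c1 and c2 are joined because two read pairs link them, while the single pair between c2 and c3 falls below the threshold and is treated as unreliable.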

The Binnacle software is then able to extract information from the assembly graphs and subsequently cluster scaffolds into comprehensive bins.

The CBCB researchers show that binning graph-based scaffolds, rather than contigs, improves the contiguity and quality of the resulting bins, and captures a broader set of the genes of the organisms being reconstructed.
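
One common way to quantify contiguity is the N50 statistic: the sequence length at which sequences of that length or longer account for at least half of the total assembled bases. The article does not say which metrics the paper reports, so the Python sketch below is only an assumed illustration of how joining contigs into scaffolds can raise such a measure. The lengths are made up.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Hypothetical bin: four contigs, versus the same bases after the
# first three contigs are joined into one scaffold.
contig_lengths = [40, 30, 20, 10]
scaffold_lengths = [90, 10]

print(n50(contig_lengths), n50(scaffold_lengths))
```

The total number of bases is the same in both cases. Only the way they are grouped changes, which is why the scaffold-level value is higher.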

The authors believe that their Binnacle software represents a first step toward the development of effective metagenomic analysis tools that can leverage all the information contained in one or more samples. Ultimately, this could lead to the automated reconstruction of metagenome-assembled genomes, opening new pathways for accurate and efficient discoveries in public-health microbiology and other fields.

The research described in the published paper is supported by grants from the National Institutes of Health and the National Science Foundation.

—Story by Melissa Brachfeld