Global genomic analysis of Bacillus and related genera

  • Lijie Song*
  • *Corresponding author for this work

Research output: Book/Report - Ph.D. thesis

Abstract

The order Bacillales represents a phylogenetically diverse group of soil-dwelling microorganisms that have attracted substantial scientific attention due to their remarkable ecological adaptability and extensive potential for application across agriculture, biotechnology, and medicine. This taxonomic group encompasses several clinically and industrially significant genera, most notably Bacillus and Paenibacillus, which are recognized for their capacity to produce a variety of bioactive compounds, including antibiotics, enzymes, and various secondary metabolites with wide-ranging applications. The rapid development of genomic technologies and bioinformatic tools has revolutionized microbial research, enabling unprecedented insights into the phylogenetic classification and metabolic potential of Bacillales species. Extensive global collaboration and technological advances have provided the basis for the discovery of new species and the generation of large genomic and metabolomic datasets, and have driven scientists to accelerate both broad and targeted research on species of the order Bacillales with strong potential for application.

In this PhD project, the primary objective was to expand the genomic data available for Bacillus and related genera, particularly on biosynthetic gene clusters of secondary metabolites (smBGCs). By generating 121 high-quality genomes of Bacillales species collected from various locations globally, we contribute to the global effort to characterize microbial diversity and the biosynthetic potential of Bacillus and its relatives. With this in-house dataset, we identified 1,176 biosynthetic gene clusters (BGCs), demonstrating the rich secondary metabolite synthesis capabilities of these species. More importantly, many of these BGCs were classified as unknown, suggesting that the potential to synthesize novel compounds has yet to be tapped and highlighting the importance of continued genomic exploration in this field.

In addition, we noted limitations in the classification of species, particularly in Bacillus, where traditional 16S rRNA-based methods are not effective in distinguishing species at fine resolution. To address this shortcoming, we developed novel primers targeting the tuf housekeeping gene, which were shown to allow for more accurate species-level identification and quantification in environmental samples, particularly soil samples.

Moreover, the scope of this project was expanded to include Paenibacillus, a genus that has received less attention in terms of BGC mining, despite its potential as a source of secondary metabolites. Through comprehensive analysis of publicly available genome data, we uncovered a substantial number of BGCs in Paenibacillus species, with an overwhelming majority (83%) representing novel gene clusters that are not currently represented in reference databases. This revealed the richness and untapped potential of this genus for the discovery of novel bioactive compounds. We also identified clade-specific distribution patterns of gene cluster families (GCFs) and observed a significant degree of genetic divergence within the genus, which could guide future studies focused on exploring unique biosynthetic pathways in these species.

In conclusion, the work presented in this thesis contributes to the growing genomic resources available for Bacillales species, demonstrates the application of whole genome sequencing and large-scale genomic mining in the study of the biosynthetic potential of secondary metabolites, and introduces newly designed primers and a system for the classification of Bacillus species. The results highlight the rich and novel secondary metabolite synthesis potential of species in the Bacillus and Paenibacillus genera, in particular the distribution patterns across different species, thus providing a foundation for exploring future applications in medicine, agriculture, and industry.
Original language: English
Place of Publication: Kgs. Lyngby, Denmark
Publisher: DTU Bioengineering
Number of pages: 200
Publication status: Published - 2025
