Finding a bounty of new bioactive compounds in microbial genome data. Genome Mining with antiSMASH

Activity: Talks and presentations / Conference presentations

Description

Secondary metabolites produced by microorganisms are the main source of bioactive compounds used as antimicrobial and anticancer drugs, fungicides, herbicides, and pesticides. Over the last decade, the increasing availability of microbial genomes has established genome mining as an important method for identifying the biosynthetic gene clusters (BGCs) responsible for producing them. Since 2010, antiSMASH (https://antismash.secondarymetabolites.org/) has gained significant traction in the community: the publicly available website has processed more than 350,000 jobs, and the antiSMASH publications in Nucleic Acids Research (NAR) have been cited more than 2,000 times. Different versions of antiSMASH are optimised for bacterial, fungal, and plant genomes. However, antiSMASH itself is limited to computing results de novo for user-submitted genomes and only partially connects these results with BGCs from other organisms. We therefore developed the antiSMASH database, available at https://antismash-db.secondarymetabolites.org/, as a new resource for browsing antiSMASH-annotated BGCs across many different organisms and for running advanced search queries for specific genes or enzymes. Its scope is wider than that of existing databases, which focus on a limited set of BGCs, organisms, or annotations. Moreover, users benefit from a rich set of contextual data because of the tight integration with the Minimum Information about a Biosynthetic Gene cluster (MIBiG) repository. The antiSMASH database contains pre-calculated antiSMASH results for all publicly available microbial genomes in the NCBI GenBank database that have an assembly status of "complete" and existing gene calls (currently 3,907 bacterial genomes). All entries are regenerated at regular intervals with the newest version of antiSMASH.
The MIBiG specification is a community standard for the annotation of BGCs and their products. The associated database at https://mibig.secondarymetabolites.org/ contains approximately 1,300 manually curated BGC annotations that follow the MIBiG specification. Integrating this dataset into antiSMASH allows easy dereplication of predicted gene clusters against known gene clusters. Together, antiSMASH, the antiSMASH database, and the MIBiG database provide a solid analysis platform for genome mining in bacteria, fungi, and plants.
Period: 21 Mar 2018
Event title: 12th CeBiTec Symposium: Big Data in Medicine and Biotechnology
Event type: Conference
Conference number: 12
Location: Bielefeld, North Rhine-Westphalia, Germany
Degree of Recognition: International