Dynamic cellular responses to environmental constraints are coordinated by the transcriptional regulatory network (TRN), which modulates gene expression. This network controls most fundamental cellular responses, including metabolism, motility, and stress responses. Here, we apply independent component analysis, an unsupervised machine learning approach, to 95 high-quality Sulfolobus acidocaldarius RNA-seq datasets and extract 45 independently modulated gene sets, or iModulons. Together, these iModulons contain 755 genes (32% of the genes identified on the genome) and explain over 70% of the variance in the expression compendium. We show that five modules represent the effects of known transcriptional regulators, and hypothesize that most of the remaining modules represent the effects of uncharacterized regulators. Further analysis of these gene sets results in: (1) the prediction of a DNA export system composed of five uncharacterized genes, (2) expansion of the LysM regulon, and (3) evidence for an as-yet-undiscovered global regulon. Our approach allows for a mechanistic, systems-level elucidation of an extremophile’s responses to biological perturbations, which could inform research on gene-regulator interactions and facilitate regulator discovery in S. acidocaldarius. We also provide the first global TRN for S. acidocaldarius. Collectively, these results provide a roadmap toward regulatory network discovery in archaea.
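To make the "explain over 70% of the variance" figure concrete, the sketch below shows one common way such a fraction can be computed once an expression matrix has been decomposed into gene weights and component activities. It is a toy illustration under stated assumptions, not the authors' pipeline: the matrices `M`, `A`, and `X`, the helper names `matmul` and `explained_variance`, and all numbers are hypothetical, and the ICA step itself is not reproduced here.

```python
# Toy illustration only: every name and number below is made up for this sketch.

def matmul(m, a):
    """Multiply a genes-x-components matrix by a components-x-samples matrix."""
    return [[sum(m[i][k] * a[k][j] for k in range(len(a)))
             for j in range(len(a[0]))]
            for i in range(len(m))]

def explained_variance(x, m, a):
    """Fraction of the total variance in x captured by the reconstruction m @ a.

    Each row of x (one gene) is centered on its own mean first, so the
    baseline for comparison is that gene's average expression.
    """
    centered = [[v - sum(row) / len(row) for v in row] for row in x]
    recon = matmul(m, a)
    ss_total = sum(v * v for row in centered for v in row)
    ss_resid = sum((c - r) ** 2
                   for crow, rrow in zip(centered, recon)
                   for c, r in zip(crow, rrow))
    return 1.0 - ss_resid / ss_total

# Hypothetical compendium: 3 genes x 4 samples, decomposed into one component.
M = [[1.0], [2.0], [0.0]]             # gene weights (assumed)
A = [[-1.0, 1.0, -1.0, 1.0]]          # component activity per sample (assumed)
X = [[-1.0, 1.0, -1.0, 1.0],
     [-2.0, 2.0, -2.0, 2.0],
     [0.5, -0.5, -0.5, 0.5]]          # third gene is not captured by the component

fraction = explained_variance(X, M, A)
print(f"explained variance: {fraction:.3f}")
```

In this toy case the single component reproduces the first two genes exactly and misses the third, so most but not all of the variance is explained. A real compendium would use many components and thousands of genes.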
Bibliographical note
Funding Information:
The authors would like to acknowledge Amitesh Anand, Sonja-Verena Albers, and Marleen van Wolferen for useful discussions.
BP gratefully acknowledges the support of the Y.C. Fung Endowed Chair in Bioengineering at University of California, San Diego.
Copyright © 2021 Chauhan, Poudel, Rychel, Lamoureux, Yoo, Al Bulushi, Yuan, Palsson and Sastry.
Keywords
- Independent component analysis (ICA)
- Machine learning
- Regulon discovery
- Systems biology
- Transcriptional regulation
- Transcriptional regulatory network (TRN)