TY - JOUR
T1 - A multi-scale expression and regulation knowledge base for Escherichia coli
AU - Lamoureux, Cameron R.
AU - Decker, Katherine T.
AU - Sastry, Anand V.
AU - Rychel, Kevin
AU - Gao, Ye
AU - McConn, John Luke
AU - Zielinski, Daniel C.
AU - Palsson, Bernhard O.
N1 - Publisher Copyright: © 2023 The Author(s). Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2023
Y1 - 2023
N2 - Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-sample, high-quality RNA-seq compendium consisting of data generated in our lab using a single experimental protocol. The compendium contains diverse growth conditions, including: 9 media; 39 supplements, including antibiotics; 42 heterologous proteins; and 76 gene knockouts. Using this resource, we elucidated global expression patterns. We used machine learning to extract 201 modules that account for 86% of known regulatory interactions, creating the regulatory component. With these modules, we identified two novel regulons and quantified systems-level regulatory responses. We also integrated 1675 curated, publicly-available transcriptomes into the resource. We demonstrated workflows for analyzing new data against this knowledge base via deconstruction of regulation during aerobic transition. This resource illuminates the E. coli transcriptome at scale and provides a blueprint for top-down transcriptomic analysis of non-model organisms.
U2 - 10.1093/nar/gkad750
DO - 10.1093/nar/gkad750
M3 - Journal article
C2 - 37713610
AN - SCOPUS:85175406026
SN - 0305-1048
VL - 51
SP - 10176
EP - 10193
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 19
M1 - gkad750
ER -