Abstract
Synthetic biology offers revolutionary possibilities for addressing critical global challenges within healthcare, sustainability, and environmental remediation by enabling the precise engineering of living organisms and systems. However, the increasing complexity of genetic designs and the iterative nature of synthetic biology’s design–build–test–learn (DBTL) cycle often result in inefficiencies, experimental bottlenecks, and reproducibility issues. This thesis explores the integration of literate programming, a paradigm combining human-readable documentation and executable code, into synthetic biology workflows, aiming to accelerate, automate, and standardize the DBTL cycle.
The thesis centers on three core contributions. First, in Paper no. 1, I introduce teemi, an open-source, Python-based platform that uses literate programming to unify and streamline all phases of the DBTL cycle for cell factory engineering. I demonstrate teemi's capabilities by simulating combinatorial genetic library assembly, automating the selection and integration of genetic parts, reducing experimental errors through precise in silico protocols, and employing automated machine learning for predictive genotype-phenotype modeling. These methodologies significantly accelerate iterative strain optimization, as exemplified by improving strictosidine production in yeast.
Second, in Paper no. 2, I introduce StreptoCAD, a comprehensive software toolbox that automates genome engineering workflows, specifically tailored for Streptomyces species. StreptoCAD automates the design of complex genome-editing strategies, including CRISPR-based methods (Cas9, Cas3, CRISPRi, and base editing) and gene overexpression libraries. I validate StreptoCAD by rapidly generating engineered Streptomyces strains with significantly reduced design time, increased reproducibility, and minimized human error. Its intuitive graphical interface and standardized outputs democratize advanced genome engineering techniques, promoting broader adoption across research communities.
Third, in Section 4 (manuscript in preparation), I apply literate programming to genetic engineering in filamentous fungi, developing a platform for high-throughput signal peptide (SP) design and evaluation in Aspergillus oryzae. The platform is used to generate novel SP sequences de novo and then prioritize, build, and validate them within a DBTL framework. Through automated workflows employing laboratory robotics, an SP library is generated with diverse sequence identities and increased secretion capabilities. This approach demonstrates how machine learning, literate programming workflows, and laboratory automation can accelerate fungal strain engineering while increasing protein secretion for enzyme production.
Collectively, these contributions demonstrate how literate programming has the potential to transform synthetic biology workflows into standardized, reproducible, and scalable processes. By embedding computational simulations, robotic automation, and machine learning directly within experimental protocols, literate programming facilitates rapid iteration, reduces bottlenecks, and promotes transparent sharing of data and methods. Future perspectives include leveraging artificial intelligence to automate literate protocol generation, establishing standardized genetic design formats to enhance interoperability, and applying these tools across diverse biological systems. Ultimately, this interdisciplinary approach not only accelerates synthetic biology but also lays the groundwork for a collaborative ecosystem capable of addressing increasingly complex biological challenges.
| Original language | English |
|---|---|
| Place of Publication | Kgs. Lyngby, Denmark |
| Publisher | DTU Bioengineering |
| Number of pages | 193 |
| Publication status | Published - 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Projects
- 1 Finished
- Literate Programming of Biology
Levassor, L. (PhD Student), Frandsen, R. J. N. (Main Supervisor), Madsen, J. (Supervisor), Weber, T. (Supervisor), Carbonell, P. (Examiner) & Lübeck, M. (Examiner)
01/04/2022 → 05/11/2025
Project: PhD