CaltechTHESIS
  A Caltech Library Service

Studies on Scaling Throughput in Protein Engineering

Citation

Schaus, Lucas Jean Nicolas (2025) Studies on Scaling Throughput in Protein Engineering. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/jqng-x012. https://resolver.caltech.edu/CaltechTHESIS:08022024-005547280

Abstract

In this work we present three studies in protein engineering. Although the three protein classes targeted for engineering are very different, all three studies focus on scaling up throughput in protein engineering.

The first study concerns machine learning (ML)-based antibody humanization techniques. The goal of antibody humanization is to reduce patient anti-drug antibody responses in clinical trials. Measuring this reduction, however, requires clearing significant scientific, bureaucratic, and financial hurdles, so it is very rarely done and never at scale. Most existing ML-based antibody humanization techniques claim to work without providing any experimental evidence. We developed Mousify, an in silico antibody humanization platform that places existing models into one framework for wet-laboratory validation. We demonstrate that even the best models share a fundamental flaw: they generate only a single antibody. Using Mousify and Markov chains, we show that applying ML-based antibody humanization models to library generation is not only feasible but also produces both stable and functional variants. Drawing on the lessons from our wet-laboratory experiments, we then developed a variational autoencoder model with properties that we hope will improve the outcomes of antibody humanization experiments.

In the second study, we outline our plans and initial results toward developing a bioelectrocatalytic system for the conversion of N2 to ammonia using nitrogenase. Most of the world’s ammonia is used for agricultural purposes and is produced via the environmentally damaging Haber-Bosch process. Engineering nitrogenase for the bioelectrocatalytic production of ammonia is not trivial, and high throughput is not guaranteed. We present preliminary results on how throughput can be increased through diazotrophic pre-selection of nitrogenase variants, as well as our search for the ideal starting point for engineering using a combination of ancestral sequence reconstruction and generative protein language models.

In the third and final study, we present a directed evolution campaign to evolve protoglobins for the enantioselective catalytic formation of cis-trifluoromethyl-substituted cyclopropanes, the first such reaction in both the chemical and biological world. Not only is the enzyme ApePgb LQ capable of efficiently performing carbene insertions into double bonds, but it also shows a much more diverse substrate scope than similar enantioselective formations of trans-trifluoromethyl-substituted cyclopropanes. After demonstrating that ApePgb LQ reactions can be scaled up to 1 mmol, we investigated the nature of protoglobin cis-selectivity using various computational methods.

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: Biochemistry, Protein Engineering, Machine Learning, Antibody Engineering, Nitrogenase, Protoglobins
Degree Grantor: California Institute of Technology
Division: Chemistry and Chemical Engineering
Major Option: Biochemistry and Molecular Biophysics
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Mayo, Stephen
Thesis Committee:
  • Rees, Douglas C. (chair)
  • Bjorkman, Pamela J.
  • Thomson, Matthew
  • Mayo, Stephen L.
Defense Date: 2 July 2024
Funders:
  • Fond National de la Recherche AFR PhD Grant (Grant Number: UNSPECIFIED)
  • Resnick Sustainability Institute Impact Grant (Grant Number: UNSPECIFIED)
  • Charlie Trimble (Grant Number: UNSPECIFIED)
Record Number: CaltechTHESIS:08022024-005547280
Persistent URL: https://resolver.caltech.edu/CaltechTHESIS:08022024-005547280
DOI: 10.7907/jqng-x012
Related URLs:
  • https://doi.org/10.1002/ange.202208936 (URL Type: DOI; Description: Article adapted for chapter 4)
ORCID:
  • Schaus, Lucas Jean Nicolas: 0000-0002-6094-7402
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 16607
Collection: CaltechTHESIS
Deposited By: Lucas Schaus
Deposited On: 13 Sep 2024 17:47
Last Modified: 20 Sep 2024 20:11

Thesis Files

PDF - Final Version (13MB)
See Usage Policy.
