CaltechTHESIS
  A Caltech Library Service

Statistical Models of the Protein Fitness Landscape: Applications to Protein Evolution and Engineering

Citation

Romero, Philip Anthony (2012) Statistical Models of the Protein Fitness Landscape: Applications to Protein Evolution and Engineering. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/7W9R-Y338. https://resolver.caltech.edu/CaltechTHESIS:03172012-160452929

Abstract

Understanding the protein fitness landscape is important for describing how natural proteins evolve and for engineering new proteins with useful properties. This mapping from protein sequence to protein function involves an extraordinarily complex balance of numerous physical interactions, many of which are still not well understood. Directed evolution circumvents our ignorance of how a protein’s sequence encodes its function by using iterative rounds of random mutation and artificial selection. The selection criteria are based on experimental measurements, which permits the optimization of protein sequence properties that are not understood. While directed evolution has been useful for exploring protein fitness landscapes, these searches have been relatively local in comparison to the vast space of possible protein sequences. Here, we present several classes of statistical models that map protein sequence space on a larger scale. We use these simple models to interpret data from SCHEMA recombination libraries, understand the evolutionary benefit of intragenic recombination, and design optimized protein sequences. By training directly on experimental data, these models implicitly capture the numerous and possibly unknown factors that shape the protein fitness landscape. This provides unrivaled quantitative accuracy across a massive number of protein sequences.

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: Statistical models, protein evolution, protein engineering, directed evolution, homologous recombination, SCHEMA
Degree Grantor: California Institute of Technology
Division: Chemistry and Chemical Engineering
Major Option: Biochemistry and Molecular Biophysics
Awards: Demetriades-Tsafka-Kokkalis Prize in Biotechnology or Related Fields, 2012
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Arnold, Frances Hamilton
Thesis Committee:
  • Mayo, Stephen L. (chair)
  • Rees, Douglas C.
  • Wang, Zhen-Gang
  • Arnold, Frances Hamilton
Defense Date: 16 December 2011
Record Number: CaltechTHESIS:03172012-160452929
Persistent URL: https://resolver.caltech.edu/CaltechTHESIS:03172012-160452929
DOI: 10.7907/7W9R-Y338
Related URLs:
  • https://doi.org/10.1038/nrm2805 (URL type: DOI). Article adapted for Chapter 1.
  • https://doi.org/10.1038/nbt.1609 (URL type: DOI). Article adapted for Chapter 2.
ORCID:
  • Romero, Philip Anthony: 0000-0002-2586-7263
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 6852
Collection: CaltechTHESIS
Deposited By: Philip Romero
Deposited On: 07 Jun 2012 21:56
Last Modified: 08 Nov 2023 00:11

Thesis Files

PDF (dissertation) - Final Version, 6MB
See Usage Policy.
