Design and Analysis of Combinatorial Protein Libraries Created by Site-Directed Recombination


Endelman, Jeffrey B. (2005) Design and Analysis of Combinatorial Protein Libraries Created by Site-Directed Recombination. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/GF35-Q269.


For many protein design problems, limited understanding of the relationship between sequence and function necessitates searching through a library of proteins to find the properties of interest. To accelerate this process, molecular models and optimization algorithms can be combined to design diverse libraries enriched in folded proteins. I apply this strategy to site-directed recombination, in which an alignment of p homologs is partitioned into f blocks, and the resulting gene fragments are combinatorially assembled to create a library with p^f chimeric sequences. To design the fragments, I present a dynamic programming algorithm that minimizes the average energy of the library, subject to constraints on fragment length. This algorithm works for any pairwise residue potential, several of which are compared for their ability to predict which chimeras retain the parental function and/or fold. The alignments of folded and unfolded chimeras are used to generate sequence-function relationships via logistic regression, a technique for fitting models to binary data. Compared to methods developed for alignments of naturally occurring proteins, logistic regression more readily distinguishes true interactions from correlations between strongly stabilizing but non-interacting residues.
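As a rough illustration of the fragment-design step described above, the following is a minimal sketch of a dynamic program that splits an alignment of n positions into f contiguous blocks under length constraints. It is not the thesis's implementation. The function name `design_blocks`, the pair-weight matrix `w`, and the cost model are assumptions made for this example: `w[i][j]` stands in for the average energy penalty incurred when aligned positions i and j fall in different blocks, so the library cost is taken to be the sum of `w[i][j]` over all cross-block pairs. Because each such pair is charged once, to the block containing the earlier position, the cost is additive over blocks and a standard partition recurrence applies.

```python
def design_blocks(w, f, min_len, max_len):
    """Split positions 0..n-1 into f contiguous blocks minimizing cross-block cost.

    w        -- hypothetical n x n matrix; only entries with i < j are read.
    f        -- number of blocks.
    min_len, max_len -- allowed block lengths (inclusive).
    Returns (cost, bounds) where bounds = [0, b1, ..., n] are block edges.
    """
    n = len(w)

    def cross_cost(a, b):
        # Pairs (i, j) with i inside block [a, b) and j in any later block.
        return sum(w[i][j] for i in range(a, b) for j in range(b, n))

    INF = float("inf")
    # best[k][b]: minimum cost of covering positions [0, b) with k blocks.
    best = [[INF] * (n + 1) for _ in range(f + 1)]
    prev = [[None] * (n + 1) for _ in range(f + 1)]
    best[0][0] = 0.0

    for k in range(1, f + 1):
        for b in range(1, n + 1):
            # The last block is [a, b); its length b - a must be allowed.
            for a in range(max(0, b - max_len), b - min_len + 1):
                if best[k - 1][a] == INF:
                    continue
                cost = best[k - 1][a] + cross_cost(a, b)
                if cost < best[k][b]:
                    best[k][b] = cost
                    prev[k][b] = a

    if best[f][n] == INF:
        raise ValueError("no partition satisfies the length constraints")

    # Walk the predecessor table backwards to recover the block edges.
    bounds = [n]
    b = n
    for k in range(f, 0, -1):
        b = prev[k][b]
        bounds.append(b)
    bounds.reverse()
    return best[f][n], bounds


# Toy input: strong pairs (0,1), (2,3), (4,5); weak pairs (1,2), (3,4).
n = 6
w = [[0] * n for _ in range(n)]
for i, j, weight in [(0, 1, 5), (2, 3, 5), (4, 5, 5), (1, 2, 1), (3, 4, 1)]:
    w[i][j] = weight

cost, bounds = design_blocks(w, f=3, min_len=1, max_len=6)
```

On the toy input the crossovers land between the weakly coupled positions, keeping each strongly coupled pair inside one block. With p parents, each of the f blocks can come from any parent, which gives the p^f chimeras mentioned in the abstract. The direct summation in `cross_cost` is chosen for readability; prefix sums would be the natural optimization for real alignment lengths.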

Item Type:Thesis (Dissertation (Ph.D.))
Subject Keywords:directed evolution; logistic regression; optimization; protein design
Degree Grantor:California Institute of Technology
Division:Engineering and Applied Science
Major Option:Bioengineering
Awards:Demetriades-Tsafka-Kokkalis Prize in Bioengineering or Related Fields, 2005.
Thesis Availability:Public (worldwide access)
Research Advisor(s):
  • Arnold, Frances Hamilton
Thesis Committee:
  • Arnold, Frances Hamilton (chair)
  • Pierce, Niles A.
  • Mayo, Stephen L.
  • Wang, Zhen-Gang
Defense Date:12 May 2005
Record Number:CaltechETD:etd-06022005-192548
Persistent URL:
ORCID:Endelman, Jeffrey B. (0000-0003-0957-4337)
Default Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:2395
Deposited By: Imported from ETD-db
Deposited On:03 Jun 2005
Last Modified:08 Nov 2023 00:11

Thesis Files

PDF - Final Version
See Usage Policy.

