
Misfolding Dominates Protein Evolution


Drummond, David Allan (2006) Misfolding Dominates Protein Evolution. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/DH8E-2N10.


The diverse array of protein functions depends upon these molecules' reliable ability to fold into the native structures determined by their amino-acid sequences. Because mutations that alter a protein's sequence frequently disrupt its folding, protein evolution explores protein sequence space conservatively, either by point mutations or by recombination between related sequences. Attempts to engineer proteins by co-opting the evolutionary algorithm have also largely proceeded by the stepwise accumulation of beneficial mutations. Other strategies for directed evolution have focused on introducing many mutations at once as a way to increase the likelihood of finding improved variants, attempting to balance higher mutational diversity against lower retention of folding. Using simple models, I explore this tradeoff and find that protein misfolding is the dominant factor in whether higher mutation levels increase the number of improved variants. I analyze results of a popular mutagenesis protocol, error-prone PCR, for evidence that coupling between mutations might favor higher mutation levels, as claimed by several groups. A comparison of high-mutation-rate mutagenesis with recombination between distantly related proteins reveals qualitative differences in protein tolerance for the sequence changes introduced by each method. Mutational tolerance may also be reflected in the rate at which proteins accumulate sequence changes over evolutionary time; why proteins evolve at different rates remains a major open question in biology. An analysis of rate determinants suggests that one major variable, linked to how highly expressed the encoding gene is, dominates the rate of yeast protein evolution. To explain this trend, I hypothesize that proteins are selected to fold properly despite mistranslation, a property I call translational robustness, and test this hypothesis using genomic data.
To examine protein evolution at a higher level of detail, I construct a large-scale simulation in which simulated organisms, whose genomes contain genes expressing computationally foldable proteins at different levels, evolve over millions of generations with protein misfolding imposing the only fitness cost. The results suggest that protein misfolding suffices to explain many significant trends in genome evolution observed across taxa, predict a novel genomic trend that is then identified in yeast, and provide insight into the causes of evolutionary rate variation in proteins.

Item Type:Thesis (Dissertation (Ph.D.))
Subject Keywords:directed evolution; error-prone PCR; evolutionary rate; protein misfolding; recombination; translational robustness
Degree Grantor:California Institute of Technology
Division:Engineering and Applied Science
Major Option:Computation and Neural Systems
Awards:Milton and Francis Clauser Doctoral Prize, 2006. Demetriades-Tsafka-Kokkalis Prize in Bioengineering or Related Fields, 2006. Everhart Distinguished Graduate Student Lecturer Award, 2006.
Thesis Availability:Public (worldwide access)
Research Advisor(s):
  • Arnold, Frances Hamilton
Thesis Committee:
  • Arnold, Frances Hamilton (chair)
  • Adami, Christoph Carl
  • Winfree, Erik
  • Bruck, Jehoshua
  • Elowitz, Michael B.
Defense Date:15 May 2006
Record Number:CaltechETD:etd-06022006-154329
Persistent URL:
ORCID:
  • Drummond, David Allan: 0000-0001-7018-7059
Default Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:2404
Deposited By: Imported from ETD-db
Deposited On:05 Jun 2006
Last Modified:08 Nov 2023 00:11

Thesis Files

PDF (drummond-thesis.pdf) - Final Version
See Usage Policy.

