Protein design automation : principles and practice


Dahiyat, Bassil I. (1998) Protein design automation : principles and practice. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/08j5-w532.


We have conceived and implemented a cyclical protein design strategy that couples theory, computation, and experimental testing. Our goal is an objective, quantitative design algorithm that is based on the physical properties that determine protein structure and stability and that is not limited to specific folds or motifs. Such a method should escape the lack of generality that has resulted from design approaches based on system-specific heuristics or subjective considerations. A critical component of the development of our methods has been their experimental testing and validation. The use of a design cycle coupling theory, computation, and experiment has improved our understanding of the physical chemistry governing protein design and hence enhanced the performance of the design algorithm.

Our protein design automation algorithm objectively predicts protein sequences likely to achieve a desired fold by using a side-chain selection algorithm that explicitly and quantitatively considers specific side-chain-to-backbone and side-chain-to-side-chain interactions. Using a rotamer description of the side chains, we implemented a fast discrete search algorithm based on the Dead End Elimination Theorem to rapidly find the globally optimal sequence in its optimal geometry. We subdivided the sequence selection problem into regions of proteins expected to be dominated by different factors: the tightly packed buried core, the solvent-exposed surface, and the boundary between core and surface. We assessed the accuracy of a scoring function or combination of scoring functions by experimentally testing their sequence predictions. Improvements to the scoring function were derived from the experimental data and incorporated into the design algorithm. In this manner, we developed a scoring function for the core of a protein that considers packing interactions and hydrophobic solvation energy. To design boundary residues effectively, we addressed the usually neglected effect of exposed hydrophobic surface area. Scoring functions for the design of surface residues were developed that account for hydrogen bonding interactions and secondary structure propensities of amino acids. These potential functions were used to successfully redesign several proteins. The integration of these scoring functions was tested by designing the sequence for an entire protein and solving the NMR solution structure of the designed protein. This work reports the first successful automated design and experimental validation of a novel sequence for an entire protein.
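
To illustrate the kind of discrete search the abstract refers to, here is a minimal sketch of rotamer pruning with the original (single-rotamer) Dead End Elimination criterion. It is not taken from the thesis: the function names, the toy problem size, and the random energies are all assumptions made for illustration, and the thesis's scoring functions and any refinements to the criterion are not represented. The criterion used is that rotamer r at position i can be discarded if some other rotamer t at i has a worst-case energy below the best-case energy of r.

```python
import itertools
import random


def make_toy_problem(seed=0, n_pos=4, n_rot=3):
    """Hypothetical toy energies: random numbers, not a physical scoring function."""
    rng = random.Random(seed)
    # self_e[i][r]: energy of rotamer r at position i with the fixed backbone
    self_e = [[rng.uniform(-5, 5) for _ in range(n_rot)] for _ in range(n_pos)]
    # pair_e[(i, j)][r][s]: interaction of rotamer r at i with rotamer s at j (i < j)
    pair_e = {
        (i, j): [[rng.uniform(-0.5, 0.5) for _ in range(n_rot)] for _ in range(n_rot)]
        for i in range(n_pos) for j in range(i + 1, n_pos)
    }
    return self_e, pair_e


def pair(pair_e, i, r, j, s):
    """Look up the symmetric pair energy regardless of index order."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def dee_prune(self_e, pair_e):
    """Repeatedly apply the single-rotamer DEE criterion until nothing changes."""
    n = len(self_e)
    alive = [set(range(len(e))) for e in self_e]

    def bound(i, r, pick):
        # pick=min gives the best case for rotamer r, pick=max the worst case
        return self_e[i][r] + sum(
            pick(pair(pair_e, i, r, j, s) for s in alive[j])
            for j in range(n) if j != i
        )

    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                best_r = bound(i, r, min)
                # r is a dead end if some competitor t is always better
                if any(best_r > bound(i, t, max) for t in alive[i] if t != r):
                    alive[i].discard(r)
                    changed = True
    return alive


def brute_force_min(self_e, pair_e, candidates):
    """Exhaustively find the lowest-energy assignment among candidate rotamers."""
    n = len(self_e)

    def energy(assign):
        e = sum(self_e[i][assign[i]] for i in range(n))
        e += sum(pair_e[(i, j)][assign[i]][assign[j]]
                 for i in range(n) for j in range(i + 1, n))
        return e

    return min((energy(a), a)
               for a in itertools.product(*[sorted(c) for c in candidates]))


if __name__ == "__main__":
    self_e, pair_e = make_toy_problem()
    alive = dee_prune(self_e, pair_e)
    print("surviving rotamers per position:", [sorted(a) for a in alive])
    print("global minimum:", brute_force_min(self_e, pair_e, alive))
```

Because a rotamer is removed only when it provably cannot be part of the lowest-energy combination, the exhaustive search over the survivors returns the same minimum as a search over the full set, usually over a much smaller space.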

Item Type: Thesis (Dissertation (Ph.D.))
Degree Grantor: California Institute of Technology
Division: Chemistry and Chemical Engineering
Major Option: Chemistry
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Mayo, Stephen L.
Thesis Committee:
  • Mayo, Stephen L. (chair)
  • Goddard, William A., III
  • Rees, Douglas C.
  • Chan, Sunney I.
Defense Date: 5 August 1997
Record Number: CaltechETD:etd-05132009-113001
Persistent URL:
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 1785
Deposited By: Imported from ETD-db
Deposited On: 13 May 2009
Last Modified: 16 Apr 2021 22:24

Thesis Files

PDF (Dahiyat_bi_1998.pdf) - Final Version
See Usage Policy.