CaltechTHESIS
  A Caltech Library Service

Computational Methods for Simulating and Parameterizing Nucleic Acid Secondary Structure Thermodynamics and Kinetics

Citation

Fornace, Mark Evan (2022) Computational Methods for Simulating and Parameterizing Nucleic Acid Secondary Structure Thermodynamics and Kinetics. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/ayeg-at42. https://resolver.caltech.edu/CaltechTHESIS:12212021-202554260

Abstract

Nucleic acid secondary structure models offer a simplified but powerful lens through which to view, analyze, and design nucleic acid chemistry. Computational approaches based on such models are central to current research directions across molecular programming, synthetic biology, and the life sciences more broadly. Our dynamic programming framework for thermodynamic analysis combines three ingredients. First, we develop new recursions to include contributions from coaxial and dangle stacking in an efficient and principled way. Second, we formulate the concept of an evaluation algebra, which defines the mathematical form of each subproblem in the dynamic program. Whereas previous modeling efforts have relied on case-by-case handling of different thermodynamic quantities, we use evaluation algebras to elegantly and efficiently compute a variety of physical quantities using the same recursions. Third, we develop efficient operation orders for a variety of physical quantities of experimental interest. Combining these advances, we achieve speedups of 20-120x and scalable calculations of complexes of up to 30,000 nucleotides. These advances promise to dramatically expand the scope and utility of computational analysis and design of nucleic acid thermodynamics.
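The evaluation algebra idea can be illustrated with a minimal sketch, which is not the thesis's implementation. It assumes a toy model: a flat, hypothetical energy of -1 per base pair in units where RT = 1, a minimum hairpin loop of 3 unpaired bases, and no stacking terms. The names `Algebra`, `evaluate`, `COUNT`, `PARTITION`, and `MFE` are invented for this illustration. One recursion is written once, and swapping the algebra changes which quantity it computes.

```python
import math
from collections import namedtuple

# An "algebra" here is a bundle of: the value of an empty subsequence (one),
# how to combine alternatives (plus), how to combine independent parts (times),
# and how to turn an energy into an element of the algebra (lift).
Algebra = namedtuple("Algebra", "one plus times lift")

# Toy assumptions (hypothetical, not the thesis's energy model).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3        # minimum number of unpaired bases enclosed by a pair
PAIR_ENERGY = -1.0  # flat energy per base pair, in units where RT = 1

COUNT = Algebra(1, lambda a, b: a + b, lambda a, b: a * b, lambda e: 1)
PARTITION = Algebra(1.0, lambda a, b: a + b, lambda a, b: a * b,
                    lambda e: math.exp(-e))
MFE = Algebra(0.0, min, lambda a, b: a + b, lambda e: e)


def evaluate(seq, alg):
    """Run one unambiguous recursion over all non-crossing structures of seq."""
    n = len(seq)
    # table[i][j] holds the value for the half-open subsequence seq[i:j];
    # empty subsequences keep the default value alg.one.
    table = [[alg.one] * (n + 1) for _ in range(n + 2)]
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            total = table[i + 1][j]  # case 1: base i is unpaired
            for k in range(i + MIN_LOOP + 1, j):  # case 2: base i pairs with k
                if (seq[i], seq[k]) in PAIRS:
                    inner = alg.times(table[i + 1][k], table[k + 1][j])
                    total = alg.plus(total, alg.times(alg.lift(PAIR_ENERGY), inner))
            table[i][j] = total
    return table[0][n]
```

For example, `evaluate("GAAAC", COUNT)` counts the structures in this toy ensemble, `evaluate("GAAAC", PARTITION)` gives the partition function, and `evaluate("GAAAC", MFE)` gives the minimum free energy, all from the same loop.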

While current dynamic programming algorithms achieve efficient computation of thermodynamic quantities for a given nucleic acid sequence, they do not provide kinetic information. Therefore, investigations of secondary structure kinetics rely on stochastic simulations of trajectories in secondary structure space. We improve upon these simulation methodologies to achieve lower computational complexities and large empirical speedups. We extend our algorithms to an ensemble which fully includes coaxial and dangle stacking states, expanding the scope of the kinetic analysis that is currently possible.
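As a point of reference, a stochastic trajectory simulation of this kind can be sketched with the standard Gillespie algorithm on a continuous-time Markov chain. This is a generic baseline under stated assumptions, not the improved method developed in the thesis. The two-state space (dot-bracket structures of a 5-nucleotide strand), the energies, the Metropolis rate model, and the names `ENERGY`, `NEIGHBORS`, `metropolis_rate`, and `simulate` are all hypothetical.

```python
import math
import random

# Hypothetical toy state space: two dot-bracket structures of a 5-nt strand,
# with energies in units where RT = 1.
ENERGY = {".....": 0.0, "(...)": -1.0}
NEIGHBORS = {".....": ["(...)"], "(...)": ["....."]}


def metropolis_rate(e_from, e_to, k0=1.0, rt=1.0):
    """Metropolis rate: downhill moves at rate k0, uphill moves Boltzmann-suppressed."""
    return k0 * min(1.0, math.exp(-(e_to - e_from) / rt))


def simulate(start, t_max, rng):
    """Sample one trajectory as a list of (time, state) pairs up to time t_max."""
    t, state = 0.0, start
    trajectory = [(t, state)]
    while True:
        moves = NEIGHBORS[state]
        rates = [metropolis_rate(ENERGY[state], ENERGY[m]) for m in moves]
        total = sum(rates)
        # The waiting time in the current state is exponential with the total exit rate.
        t += rng.expovariate(total)
        if t > t_max:
            return trajectory
        # Choose the next state with probability proportional to its rate.
        state = rng.choices(moves, weights=rates)[0]
        trajectory.append((t, state))
```

Calling `simulate(".....", 50.0, random.Random(0))` returns one sampled trajectory. In a realistic secondary structure space, each state has many neighbors, which is where the per-step cost of enumerating moves and rates becomes the bottleneck.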

Current secondary structure models are parametrized using thermodynamic information gleaned from decades of melt experiments of RNA and DNA in specific experimental conditions. Only rough kinetic information is currently available from past experiments, and information on solvent and material dependence is lacking. We develop a fully computational approach based on Gaussian processes and molecular dynamics in order to provide a generic method for estimating thermodynamic and kinetic parameters, applicable conceptually to any nucleic acid material and experimental setting of interest. Our methodology offers an atomistic view of nucleic acid base pairing and faithfully reproduces most experimental data. It thus provides a powerful black-box approach for extensibly calculating the kinetic and thermodynamic parameters that secondary structure models require.

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: RNA, DNA, secondary structure, dynamic programming, molecular dynamics, nucleic acid, parametrization, nearest-neighbor, Markov chain, simulation
Degree Grantor: California Institute of Technology
Division: Chemistry and Chemical Engineering
Major Option: Chemistry
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Pierce, Niles A.
Thesis Committee:
  • Miller, Thomas F. (chair)
  • Chan, Garnet K.
  • Wang, Zhen-Gang
  • Winfree, Erik
  • Pierce, Niles A.
Defense Date: 7 December 2021
Record Number: CaltechTHESIS:12212021-202554260
Persistent URL: https://resolver.caltech.edu/CaltechTHESIS:12212021-202554260
DOI: 10.7907/ayeg-at42
Related URLs:
  • https://doi.org/10.1021/acssynbio.9b00523 (URL type: DOI): Article adapted for Chapter II and Appendix A
ORCID:
  • Fornace, Mark Evan: 0000-0002-5829-5839
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 14456
Collection: CaltechTHESIS
Deposited By: Mark Fornace
Deposited On: 28 Jan 2022 17:03
Last Modified: 19 May 2022 00:04

Thesis Files

PDF (Partial thesis. Chapters 3-5 temporarily embargoed) - Final Version, 14MB
See Usage Policy.
