Citation
Levine, Matthew Emanuel (2023) Machine Learning and Data Assimilation for Blending Incomplete Models and Noisy Data. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/b82h-ye78. https://resolver.caltech.edu/CaltechTHESIS:06012023-213052258
Abstract
The prediction and inference of dynamical systems is of widespread interest across scientific and engineering disciplines. Data assimilation (DA) offers a well-established and successful paradigm for blending models of such systems with noisy observational data. However, traditional DA-based inference often fails when available data are insufficiently informative. Chapter 2 copes with this challenge by introducing constraints into Ensemble Kalman Filtering, which is shown to improve forecasting of glucose dynamics in real patient-level clinical data. Chapter 3 addresses this identifiability challenge by instead developing a simplified, reduced-order stochastic model for glucose dynamics that is more easily identified from patient data. Despite these successes, the forecasting performance of these methods is fundamentally limited by the fidelity of the employed model, which is often not fully understood a priori.
Chapter 4 presents a general picture of how noisy, partially observed time-series data can be used to learn flexible (e.g., neural network-based) corrections to a pre-specified mechanistic model. In Chapter 5, the proposed methodology is then validated in simulated settings for glucose-insulin models. Chapter 6 provides further perspective on learning flexible model corrections, comparing approaches that use i) gradient-based or gradient-free optimization, ii) temporal or time-averaged data, iii) different model parameterizations, iv) deterministic or stochastic corrections, and v) physical conservation laws to constrain inference.
Chapter 7 studies how these perspectives on machine learning and dynamical systems can help us understand the roles of biochemical networks. In particular, it considers protein dimerization networks through the lens of approximation theory and evaluates how the equilibria of these networks can be fine-tuned to perform a variety of biological computations.
Item Type: | Thesis (Dissertation (Ph.D.))
---|---
Subject Keywords: | Machine Learning, Data Assimilation, Dynamical Systems
Degree Grantor: | California Institute of Technology
Division: | Engineering and Applied Science
Major Option: | Computing and Mathematical Sciences
Thesis Availability: | Public (worldwide access)
Research Advisor(s): |
Thesis Committee: |
Defense Date: | 3 May 2023
Funders: |
Record Number: | CaltechTHESIS:06012023-213052258
Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:06012023-213052258
DOI: | 10.7907/b82h-ye78
Related URLs: |
ORCID: |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 15264
Collection: | CaltechTHESIS
Deposited By: | Matthew Levine
Deposited On: | 02 Jun 2023 15:24
Last Modified: | 09 Jun 2023 18:50
Thesis Files
PDF - Final Version (18 MB)
See Usage Policy.