CaltechTHESIS
  A Caltech Library Service

Statistical Foundations of Operator Learning

Citation

Nelsen, Nicholas Hao (2024) Statistical Foundations of Operator Learning. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/0246-7574. https://resolver.caltech.edu/CaltechTHESIS:05142024-215222393

Abstract

This thesis studies operator learning from a statistical perspective. Operator learning uses observed data to estimate mappings between infinite-dimensional spaces. It does so conceptually at the continuum level, which leads to machine learning methods that are discretization-independent when implemented in practice. Although this framework shows promise for accelerating and discovering physical models, the mathematical theory of operator learning lags behind its empirical success. Motivated by scientific computing and inverse problems, where the available data are often scarce, this thesis develops scalable algorithms for operator learning and theoretical insights into their data efficiency.

The thesis begins by introducing a convergent operator learning algorithm that is implementable on a computer with controlled complexity. The method is based on linear combinations of function-valued random features, enjoys efficient training via convex optimization, and accurately approximates nonlinear solution operators of parametric partial differential equations. A statistical analysis derives state-of-the-art error bounds for the method and establishes its robustness to errors stemming from noisy observations and model misspecification.

Next, the thesis tackles fundamental statistical questions about how problem structure, data quality, and prior information influence learning accuracy. Specializing to a linear setting, a sharp Bayesian nonparametric analysis shows that continuum linear operators, such as the integration or differentiation of spatially varying functions, are provably learnable from noisy input-output pairs. The theory reveals that smoothing operators are easier to learn than unbounded ones and that training with rough or high-frequency input data improves sample complexity. When only specific linear functionals of the operator’s output are the primary quantities of interest, the final part of the thesis proves that the smoothness of the functionals determines whether learning directly from these finite-dimensional observations carries a statistical advantage over plug-in estimators based on learning the entire operator.

To validate the findings beyond linear problems, the thesis develops practical deep operator learning architectures for nonlinear mappings that send functions to vectors, or vice versa, and shows their corresponding universal approximation properties. Altogether, this thesis advances the reliability and efficiency of operator learning for continuum problems in the physical and data sciences.
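The abstract's first contribution combines random features with convex (ridge) optimization. The thesis works with function-valued random features; as a loose, finite-dimensional illustration of the same underlying idea only, the following sketch fits a scalar function with classical random Fourier features. All names, parameter values, and the toy target here are illustrative assumptions, not the thesis's actual method or implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: recover f(x) = sin(2*pi*x) from noisy samples (illustrative only).
n, m, lam = 200, 300, 1e-6            # samples, random features, ridge penalty
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(2.0 * np.pi * x) + 0.01 * rng.standard_normal(n)

# Random Fourier features: phi_j(x) = cos(w_j * x + b_j), with random w_j, b_j.
w = 8.0 * rng.standard_normal(m)
b = rng.uniform(0.0, 2.0 * np.pi, size=m)
Phi = np.cos(np.outer(x, w) + b)      # n x m feature matrix

# Training is convex: minimize ||Phi c - y||^2 + lam * m * ||c||^2,
# a linear least-squares problem solved by one symmetric linear solve.
A = Phi.T @ Phi + lam * m * np.eye(m)
c = np.linalg.solve(A, Phi.T @ y)

def predict(x_new):
    """Evaluate the learned random feature expansion at new inputs."""
    return np.cos(np.outer(np.atleast_1d(x_new), w) + b) @ c
```

Only the coefficients `c` are trained; the random features themselves stay fixed, which is what keeps the optimization convex. The function-valued setting of Ch. 2 replaces the scalar features `cos(w_j x + b_j)` with randomly drawn functions between Banach spaces.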

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: scientific machine learning; learning theory; numerical analysis; inverse problems; functional data analysis; Bayesian nonparametric statistics; sample complexity; random features; ridge regression; parameter-to-observable maps
Degree Grantor: California Institute of Technology
Division: Engineering and Applied Science
Major Option: Mechanical Engineering
Minor Option: Applied and Computational Mathematics
Awards: The W.P. Carey and Co. Prize in Applied Mathematics, 2024. Centennial Prize for the Best Thesis in Mechanical and Civil Engineering, 2024. SIAM Review SIGEST Award (Ch. 2), 2024. NeurIPS Spotlight (Ch. 3), 2023.
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Stuart, Andrew M.
Thesis Committee:
  • Bhattacharya, Kaushik (chair)
  • Colonius, Tim
  • Owhadi, Houman
  • Stuart, Andrew M.
Defense Date: 29 April 2024
Funders:
  • Amazon AI4Science Fellowship: UNSPECIFIED
  • Caltech Guggenheim Graduate Fellowship: UNSPECIFIED
  • NSF Graduate Research Fellowship: DGE-1745301
  • Office of Naval Research (ONR): N00014-22-1-2790
  • Office of Naval Research (ONR): N00014-19-1-2408
  • NSF: AGS-1835860
  • NSF: DMS-1818977
  • Air Force Office of Scientific Research (AFOSR): FA9550-20-1-0358
  • Army Research Office (ARO): W911NF-12-2-0022
Record Number: CaltechTHESIS:05142024-215222393
Persistent URL: https://resolver.caltech.edu/CaltechTHESIS:05142024-215222393
DOI: 10.7907/0246-7574
Related URLs:
  • https://www.nicholashnelsen.com/ (Author): Personal website
  • https://doi.org/10.1137/20M133957X (DOI): Article adapted for Ch. 2
  • https://doi.org/10.1137/24M1648703 (DOI): Excerpts of this article are adapted for Ch. 1 and Ch. 2
  • https://doi.org/10.48550/arXiv.2305.17170 (DOI): Article adapted for Ch. 3
  • https://doi.org/10.1137/21M1442942 (DOI): Article adapted for Ch. 4
  • https://doi.org/10.48550/arXiv.2402.06031 (DOI): Article adapted for Ch. 5
  • https://doi.org/10.22002/55tdh-hda68 (Related Item): Data for Ch. 2
  • https://doi.org/10.22002/r5ga1-55d06 (Related Item): Data for Ch. 5
  • https://github.com/nickhnelsen/random-features-banach (Author): Code for Ch. 2
  • https://github.com/nickhnelsen/error-bounds-for-vvRF (Author): Code for Ch. 3
  • https://github.com/nickhnelsen/fourier-neural-mappings (Author): Code for Ch. 5
  • https://github.com/CliMA/RandomFeatures.jl (Related Item): Code related to Ch. 2 and Ch. 3
  • https://neurips.cc/virtual/2023/poster/70274 (Related Item): Poster related to Ch. 3
ORCID:
  • Nelsen, Nicholas Hao: 0000-0002-8328-1199
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 16383
Collection: CaltechTHESIS
Deposited By: Nicholas Nelsen
Deposited On: 20 May 2024 23:40
Last Modified: 17 Jun 2024 18:42

Thesis Files

PDF - Final Version (3 MB). See Usage Policy.
