CaltechTHESIS
  A Caltech Library Service

Statistical Foundations of Operator Learning

Citation

Nelsen, Nicholas Hao (2024) Statistical Foundations of Operator Learning. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/0246-7574. https://resolver.caltech.edu/CaltechTHESIS:05142024-215222393

Abstract

This thesis studies operator learning from a statistical perspective. Operator learning uses observed data to estimate mappings between infinite-dimensional spaces. It does so conceptually at the continuum level, which leads to machine learning methods that are discretization-independent when implemented in practice. Although this framework shows promise for accelerating and discovering physical models, the mathematical theory of operator learning lags behind its empirical success. Motivated by scientific computing and inverse problems, where the available data are often scarce, this thesis develops scalable algorithms for operator learning and theoretical insights into their data efficiency.

The thesis begins by introducing a convergent operator learning algorithm that is implementable on a computer with controlled complexity. The method is based on linear combinations of function-valued random features, enjoys efficient training via convex optimization, and accurately approximates nonlinear solution operators of parametric partial differential equations. A statistical analysis derives state-of-the-art error bounds for the method and establishes its robustness to errors stemming from noisy observations and model misspecification.

Next, the thesis tackles fundamental statistical questions about how problem structure, data quality, and prior information influence learning accuracy. Specializing to a linear setting, a sharp Bayesian nonparametric analysis shows that continuum linear operators, such as the integration or differentiation of spatially varying functions, are provably learnable from noisy input-output pairs. The theory reveals that smoothing operators are easier to learn than unbounded ones and that training with rough or high-frequency input data improves sample complexity. When only specific linear functionals of the operator’s output are the primary quantities of interest, the final part of the thesis proves that the smoothness of the functionals determines whether learning directly from these finite-dimensional observations carries a statistical advantage over plug-in estimators based on learning the entire operator.

To validate the findings beyond linear problems, the thesis develops practical deep operator learning architectures for nonlinear mappings that send functions to vectors, or vice versa, and shows their corresponding universal approximation properties. Altogether, this thesis advances the reliability and efficiency of operator learning for continuum problems in the physical and data sciences.
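The abstract's first contribution combines random features with convex (ridge) optimization. The thesis works with function-valued random features; as a loose, finite-dimensional illustration of the same underlying idea only, the following sketch fits a scalar function with classical random Fourier features. All names, parameter values, and the toy target here are illustrative assumptions, not the thesis's actual method or implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: recover f(x) = sin(2*pi*x) from noisy samples (illustrative only).
n, m, lam = 200, 300, 1e-6            # samples, random features, ridge penalty
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(2.0 * np.pi * x) + 0.01 * rng.standard_normal(n)

# Random Fourier features: phi_j(x) = cos(w_j * x + b_j), with random w_j, b_j.
w = 8.0 * rng.standard_normal(m)
b = rng.uniform(0.0, 2.0 * np.pi, size=m)
Phi = np.cos(np.outer(x, w) + b)      # n x m feature matrix

# Training is convex: minimize ||Phi c - y||^2 + lam * m * ||c||^2,
# a linear least-squares problem solved by one symmetric linear solve.
A = Phi.T @ Phi + lam * m * np.eye(m)
c = np.linalg.solve(A, Phi.T @ y)

def predict(x_new):
    """Evaluate the learned random feature expansion at new inputs."""
    return np.cos(np.outer(np.atleast_1d(x_new), w) + b) @ c
```

Only the coefficients `c` are trained; the random features themselves stay fixed, which is what keeps the optimization convex. The function-valued setting of Ch. 2 replaces the scalar features `cos(w_j x + b_j)` with randomly drawn functions between Banach spaces.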

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: scientific machine learning; learning theory; numerical analysis; inverse problems; functional data analysis; Bayesian nonparametric statistics; sample complexity; random features; ridge regression; parameter-to-observable maps
Degree Grantor: California Institute of Technology
Division: Engineering and Applied Science
Major Option: Mechanical Engineering
Minor Option: Applied and Computational Mathematics
Awards: The W.P. Carey and Co. Prize in Applied Mathematics, 2024. Centennial Prize for the Best Thesis in Mechanical and Civil Engineering, 2024. SIAM Review SIGEST Award (Ch. 2), 2024. NeurIPS Spotlight (Ch. 3), 2023.
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Stuart, Andrew M.
Thesis Committee:
  • Bhattacharya, Kaushik (chair)
  • Colonius, Tim
  • Owhadi, Houman
  • Stuart, Andrew M.
Defense Date: 29 April 2024
Funders:
  • Amazon AI4Science Fellowship: UNSPECIFIED
  • Caltech Guggenheim Graduate Fellowship: UNSPECIFIED
  • NSF Graduate Research Fellowship: DGE-1745301
  • Office of Naval Research (ONR): N00014-22-1-2790
  • Office of Naval Research (ONR): N00014-19-1-2408
  • NSF: AGS-1835860
  • NSF: DMS-1818977
  • Air Force Office of Scientific Research (AFOSR): FA9550-20-1-0358
  • Army Research Office (ARO): W911NF-12-2-0022
Record Number: CaltechTHESIS:05142024-215222393
Persistent URL: https://resolver.caltech.edu/CaltechTHESIS:05142024-215222393
DOI: 10.7907/0246-7574
Related URLs:
  • https://www.nicholashnelsen.com/ (Author): Personal website
  • https://doi.org/10.1137/20M133957X (DOI): Article adapted for Ch. 2
  • https://doi.org/10.1137/24M1648703 (DOI): Excerpts of this article are adapted for Ch. 1 and Ch. 2
  • https://doi.org/10.48550/arXiv.2305.17170 (DOI): Article adapted for Ch. 3
  • https://doi.org/10.1137/21M1442942 (DOI): Article adapted for Ch. 4
  • https://doi.org/10.48550/arXiv.2402.06031 (DOI): Article adapted for Ch. 5
  • https://doi.org/10.22002/55tdh-hda68 (Related Item): Data for Ch. 2
  • https://doi.org/10.22002/r5ga1-55d06 (Related Item): Data for Ch. 5
  • https://github.com/nickhnelsen/random-features-banach (Author): Code for Ch. 2
  • https://github.com/nickhnelsen/error-bounds-for-vvRF (Author): Code for Ch. 3
  • https://github.com/nickhnelsen/fourier-neural-mappings (Author): Code for Ch. 5
  • https://github.com/CliMA/RandomFeatures.jl (Related Item): Code related to Ch. 2 and Ch. 3
  • https://neurips.cc/virtual/2023/poster/70274 (Related Item): Poster related to Ch. 3
ORCID:
  • Nelsen, Nicholas Hao: 0000-0002-8328-1199
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 16383
Collection: CaltechTHESIS
Deposited By: Nicholas Nelsen
Deposited On: 20 May 2024 23:40
Last Modified: 17 Jun 2024 18:42

Thesis Files

PDF - Final Version (3 MB). See Usage Policy.
