
Inference, Computation, and Games


Schäfer, Florian Tobias (2021) Inference, Computation, and Games. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/esyv-2181.


In this thesis, we use statistical inference and competitive games to design algorithms for computational mathematics.

In the first part, comprising chapters two through six, we use ideas from Gaussian process statistics to obtain fast solvers for differential and integral equations. We begin by observing the equivalence between the conditional (near-)independence of a Gaussian process and the (near-)sparsity of the Cholesky factors of its precision and covariance matrices. This implies the existence of a large class of dense matrices with almost sparse Cholesky factors, thereby greatly increasing the scope of application of sparse Cholesky factorization. Using an elimination ordering and sparsity pattern motivated by the screening effect in spatial statistics, we can compute approximate Cholesky factors of the covariance matrices of Gaussian processes admitting a screening effect in near-linear computational complexity. These include many popular smoothness priors, such as the Matérn class of covariance functions. In the special case of Green's matrices of elliptic boundary value problems (with possibly unknown elliptic operators of arbitrarily high order and possibly rough coefficients), we can use tools from numerical homogenization to prove the exponential accuracy of our method. This result improves the state of the art for solving general elliptic integral equations and provides the first proof of an exponential screening effect. We also derive a fast solver for elliptic partial differential equations, with accuracy-vs-complexity guarantees that improve upon the state of the art. Furthermore, the resulting solver is performant in practice, frequently beating established algebraic multigrid libraries such as AMGCL and Trilinos on a series of challenging problems in two and three dimensions. Finally, for any given covariance matrix, we obtain a closed-form expression for its optimal (in terms of Kullback-Leibler divergence) approximate inverse-Cholesky factorization subject to a sparsity constraint, recovering the Vecchia approximation and factorized sparse approximate inverses.
Our method is highly robust, embarrassingly parallel, and further improves our asymptotic results on the solution of elliptic integral equations. We also provide a way to apply our techniques to sums of independent Gaussian processes, resolving a major limitation of existing methods based on the screening effect. As a result, we obtain fast algorithms for large-scale Gaussian process regression problems with possibly noisy measurements.
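The closed-form KL-optimal column update described above admits a compact column-by-column implementation. The following NumPy sketch is illustrative rather than the thesis's optimized code; it assumes a lower-triangular sparsity pattern in which each column's index set begins with its own diagonal index, so that the optimal column is the first column of the inverse of the covariance submatrix, normalized by the square root of its first entry:

```python
import numpy as np

def kl_optimal_inverse_cholesky(Theta, sparsity):
    """Sparse factor L with L @ L.T approximating inv(Theta), optimal in
    KL divergence among factors with the given lower-triangular pattern.

    sparsity[i] lists the row indices allowed in column i, sorted with
    sparsity[i][0] == i. Each column only requires solving a small linear
    system in the corresponding covariance submatrix:
        L[s, i] = inv(Theta[s, s]) e_1 / sqrt(e_1^T inv(Theta[s, s]) e_1).
    """
    n = Theta.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        s = np.asarray(sparsity[i])
        e1 = np.zeros(len(s))
        e1[0] = 1.0  # i itself is the first index of s
        col = np.linalg.solve(Theta[np.ix_(s, s)], e1)
        L[s, i] = col / np.sqrt(col[0])
    return L

# With the full lower-triangular pattern, the KL divergence vanishes and
# the factorization recovers inv(Theta) exactly:
Theta = np.array([[2.0, 1.0, 0.5],
                  [1.0, 2.0, 1.0],
                  [0.5, 1.0, 2.0]])
full_pattern = [[0, 1, 2], [1, 2], [2]]
L = kl_optimal_inverse_cholesky(Theta, full_pattern)
```

Each column is computed independently from the others, which is what makes the scheme embarrassingly parallel: restricting the index sets to small screening-effect neighborhoods replaces the dense solve by many small local solves.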

In the second part of this thesis, comprising chapters seven through nine, we study continuous optimization through the lens of competitive games. In particular, we consider competitive optimization, where multiple agents attempt to minimize conflicting objectives. In the single-agent case, the updates of gradient descent are minimizers of quadratically regularized linearizations of the loss function. We propose to generalize this idea by using the Nash equilibria of quadratically regularized linearizations of the competitive game as updates ("linearize the game"). We provide fundamental reasons why the natural notion of linearization for competitive optimization problems is given by the multilinear (as opposed to linear) approximation of the agents' loss functions. The resulting algorithm, which we call competitive gradient descent (CGD), thus provides a natural generalization of gradient descent to competitive optimization. Using ideas from information geometry, we extend CGD to competitive mirror descent (CMD), which can be applied to a vast range of constrained competitive optimization problems. CGD and CMD resolve the cycling problem of simultaneous gradient descent and show promising results on problems arising in constrained optimization, robust control theory, and generative adversarial networks (GANs). Finally, we point out the GAN-dilemma, which refutes the common interpretation of GANs as approximate minimizers of a divergence obtained in the limit of a fully trained discriminator. Instead, we argue that GAN performance relies on the implicit competitive regularization (ICR) arising from the simultaneous optimization of generator and discriminator, and we support this hypothesis with results on low-dimensional model problems and GANs on CIFAR10.
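The cycling problem and its resolution are easiest to see on the scalar bilinear zero-sum game f(x, y) = xy, where simultaneous gradient descent spirals away from the equilibrium at the origin while the CGD update contracts toward it. The sketch below implements the zero-sum CGD update, i.e., the Nash equilibrium of the quadratically regularized bilinear approximation of the local game; the variable names and step size are illustrative:

```python
import numpy as np

def cgd_step(x, y, grad_x, grad_y, Dxy, eta):
    """One competitive gradient descent step for min_x max_y f(x, y).

    grad_x, grad_y are the current gradients of f with respect to each
    player, and Dxy is the mixed-derivative block D^2_{xy} f. The update
    solves the regularized bilinear local game in closed form.
    """
    Dyx = Dxy.T
    Ix = np.eye(Dxy.shape[0])
    Iy = np.eye(Dyx.shape[0])
    dx = -eta * np.linalg.solve(Ix + eta**2 * Dxy @ Dyx,
                                grad_x + eta * Dxy @ grad_y)
    dy = eta * np.linalg.solve(Iy + eta**2 * Dyx @ Dxy,
                               grad_y - eta * Dyx @ grad_x)
    return x + dx, y + dy

eta = 0.2

# Simultaneous gradient descent on f(x, y) = x * y cycles outward:
u, v = 1.0, 1.0
for _ in range(100):
    u, v = u - eta * v, v + eta * u

# CGD on the same game contracts toward the equilibrium (0, 0).
# For f = x * y: grad_x f = y, grad_y f = x, D^2_{xy} f = 1.
x, y = np.array([1.0]), np.array([1.0])
for _ in range(100):
    x, y = cgd_step(x, y, grad_x=y, grad_y=x,
                    Dxy=np.array([[1.0]]), eta=eta)
```

A short calculation shows the simultaneous iteration multiplies the squared distance to the origin by (1 + eta^2) per step, while the CGD iteration divides it by the same factor; the anticipation of the opponent's response built into the update is exactly what damps the rotation.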

Item Type:Thesis (Dissertation (Ph.D.))
Subject Keywords:Gaussian process regression; numerical analysis; numerical homogenization; partial differential equations; numerical linear algebra; Cholesky factorization; screening effect; competitive optimization; constrained optimization; information geometry; generative adversarial networks
Degree Grantor:California Institute of Technology
Division:Engineering and Applied Science
Major Option:Applied and Computational Mathematics
Awards:The W.P. Carey and Co. Prize in Applied Mathematics, 2021.
Thesis Availability:Public (worldwide access)
Research Advisor(s):
  • Owhadi, Houman
Thesis Committee:
  • Schroeder, Peter (chair)
  • Anandkumar, Anima
  • Desbrun, Mathieu
  • Owhadi, Houman
  • Tropp, Joel A.
Defense Date:19 May 2021
Non-Caltech Author Email:flotosch (AT)
Funding Agency: Grant Number
  • DARPA EQUiPS Program: FA9550-16-1-0054
  • Air Force Office of Scientific Research (AFOSR): FA9550-18-1-0271
  • Linde Institute Research Grant: UNSPECIFIED
  • Amazon AI4Science Fellowship: UNSPECIFIED
Record Number:CaltechTHESIS:06082021-005706263
Persistent URL:
ORCID: Schäfer, Florian Tobias (0000-0002-4891-0172)
Default Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:14261
Deposited By: Florian Schaefer
Deposited On:08 Jun 2021 19:03
Last Modified:03 Nov 2021 18:39

Thesis Files

PDF - Final Version (see Usage Policy).

