
Guaranteed Policy Performance in Reinforcement Learning


Voloshin, Cameron (2024) Guaranteed Policy Performance in Reinforcement Learning. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/n2fg-e554.


Decision-making is ubiquitous in everyday life. Increasingly, researchers are seeking ways to solve sequential decision-making tasks optimally. Thanks to the recent availability of computation, advances in deep learning, and the release of open-source code, it has become easy to train a computational agent to make decisions in many domains. Nevertheless, in realistic scenarios where the cost of failure is high, running a trained computational agent in the wild poses substantial risk.

The goal of this thesis is to develop and advance techniques that guarantee a learned agent does what we expect it to do. The thesis tackles two central questions:

1) Given an agent, how can we predict if it will perform desirably?

2) Can we structure the learning process to guarantee desirable post-learning performance?

On the former question, this thesis proposes multiple algorithms to evaluate such agents, finds factors that have high influence on the success of agent evaluation, and open-sources benchmarks for further development in the space.

On the latter question, this thesis formulates desirable agent behavior as a constrained optimization with varying types of constraints depending on the structure afforded to the practitioner. Constraining the search space over the learning process ensures post-learning behaviors will, by definition, perform as desired.

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: Reinforcement Learning, Policy Learning, Off-policy Evaluation
Degree Grantor: California Institute of Technology
Division: Engineering and Applied Science
Major Option: Computing and Mathematical Sciences
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Yue, Yisong
Thesis Committee:
  • Wierman, Adam C. (chair)
  • Yue, Yisong
  • Bouman, Katherine L.
  • Chaudhuri, Swarat
Defense Date: 13 June 2023
Non-Caltech Author Email: clvoloshin (AT)
Record Number: CaltechTHESIS:06062024-061120491
Persistent URL:
Related URLs:
  • adapted for Chapter 3
  • adapted for Chapter 4
  • adapted for Chapter 5 and 6
  • adapted for Chapter 7
  • adapted for Chapter 8
ORCID: Voloshin, Cameron, 0009-0007-7725-6660
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 16508
Deposited By: Cameron Voloshin
Deposited On: 06 Jun 2024 23:04
Last Modified: 14 Jun 2024 21:18

Thesis Files

PDF - Final Version
See Usage Policy.

