Citation
Novoseller, Ellen Rachel (2021) Online Learning from Human Feedback with Applications to Exoskeleton Gait Optimization. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/gvtx-1586. https://resolver.caltech.edu/CaltechTHESIS:12092020-162149429
Abstract
Systems that intelligently interact with humans could improve people's lives in numerous ways and in numerous settings, such as households, hospitals, and workplaces. Yet, developing algorithms that reliably and efficiently personalize their interactions with people in real-world environments remains challenging. In particular, one major difficulty lies in adapting to human-in-the-loop feedback, in which an algorithm makes sequential decisions while receiving online feedback from humans; throughout this interaction, the algorithm seeks to optimize its decision-making quality, as measured by the utility of its performance to the human users. Such algorithms must balance between exploration and exploitation: on one hand, the algorithm must select uncertain strategies to fully explore the environment and the interacting human's preferences, while on the other hand, it must exploit the empirically-best-performing strategies to maximize its cumulative performance.
Learning from human feedback can be difficult, as people are often unreliable in specifying numerical scores. In contrast, humans can often more accurately provide various types of qualitative feedback, for instance pairwise preferences. Yet, sample efficiency is a significant concern in human-in-the-loop settings, as qualitative feedback is less informative than absolute metrics, and algorithms can typically pose only limited queries to human users. Thus, there is a need to create theoretically-grounded online learning algorithms that efficiently, reliably, and robustly optimize their interactions with humans while learning from online qualitative feedback.
This dissertation makes several contributions to algorithm design for human-in-the-loop learning. Firstly, this work develops the Dueling Posterior Sampling (DPS) algorithmic framework, a model-based, Bayesian approach for online learning in the settings of preference-based reinforcement learning and generalized linear dueling bandits. DPS is developed together with a theoretical regret analysis framework, and yields competitive empirical performance in a range of simulations. Additionally, this thesis presents the CoSpar and LineCoSpar algorithms for sample-efficient, mixed-initiative learning from pairwise preferences and coactive feedback. CoSpar and LineCoSpar are both deployed in human subject experiments with a lower-body exoskeleton to identify optimal, user-preferred exoskeleton walking gaits. This work presents the first demonstration of preference-based learning for optimizing dynamic crutchless exoskeleton walking for user comfort, and makes progress toward customizing exoskeletons and other assistive devices for individual users.
Item Type: | Thesis (Dissertation (Ph.D.))
---|---
Subject Keywords: | Human-in-the-loop learning; online learning; bandits; reinforcement learning; exoskeleton
Degree Grantor: | California Institute of Technology
Division: | Engineering and Applied Science
Major Option: | Control and Dynamical Systems
Awards: | Thomas A. Tisch Prize for Graduate Teaching in Computing and Mathematical Sciences, 2018.
Thesis Availability: | Public (worldwide access)
Research Advisor(s): |
Thesis Committee: |
Defense Date: | 30 November 2020
Funders: |
Record Number: | CaltechTHESIS:12092020-162149429
Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:12092020-162149429
DOI: | 10.7907/gvtx-1586
Related URLs: |
ORCID: |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 14021
Collection: | CaltechTHESIS
Deposited By: | Ellen Novoseller
Deposited On: | 18 Dec 2020 17:42
Last Modified: | 03 Nov 2021 20:21
Thesis Files
- PDF, Final Version. Creative Commons Attribution Non-commercial. 12MB