Citation
Ding, Weilun (2025) Core Reinforcement Learning Computations Underlying Distinct Behavioral Strategies and their Implications in Psychiatry. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/v41g-an22. https://resolver.caltech.edu/CaltechTHESIS:01222025-071958641
Abstract
Reinforcement learning (RL) models have proven powerful for characterizing learning and decision-making in the real world. The dual systems of model-free (MF) and model-based (MB) algorithms have been proposed to describe the computational mechanisms underlying reflexive habitual control and cognitive goal-directed control, respectively. Given these dual controllers, a natural question is how the choice of which system to use is made as the environmental statistics of rewards and states change. In Chapter 2, three types of prediction error signals from the dual systems are found to guide the arbitration process within a reliability-based RL framework. Moreover, an exploratory analysis was conducted to test alternative arbitration theories that apply a cost-benefit analysis to the goal-directed (MB) system. Understanding learning and decision-making would not be complete without knowing how our neural machinery implements these RL computations when a given system is engaged. The robustness and replicability of the neural encoding of learning and decision signals from the MF and MB systems are essential for setting a reassuring path for future neurocomputational work on dual systems. In Chapter 3, we address recent concerns over the existence of the MF system and its neural computations in a widely used Markov decision task (the two-step task). By applying a model-based functional magnetic resonance imaging (fMRI) approach to a large number of participants, we found both MF and MB learning signals in the human striatum; neural patterns of decision utility across different RL-strategy groups further add to the evidence for ubiquitous MF computations in Markov decisions. The dual-systems framework can thus not only account for normal learning behavior but also inform us of what actually goes wrong in mental disorders.
In Chapter 4, we show that, via the reliability-based arbitration framework, the MF behavioral bias observed in participants with high obsessive-compulsive tendency can be attributed to enhanced encoding of the MB reward prediction error in the anterior cingulate cortex, a region previously implicated in error monitoring. Chapter 1 introduces basic concepts and example algorithms in RL; we also review relevant theoretical and neuroscientific work to build the knowledge base for the subsequent chapters. Chapter 5 discusses the significance of the empirical findings in this thesis, the value of adopting the methodologies used herein, potential future research directions on the dual systems, and implications for computational psychiatry.
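To make the reliability-based arbitration idea from the abstract concrete, the following is a minimal, self-contained Python sketch: an MF system learns via temporal-difference reward prediction errors (RPEs), each system's "reliability" is tracked as a running function of its prediction-error magnitude, and a sigmoidal weight assigns control to whichever system is more reliable. All parameter names, learning rates, and the stand-in MB prediction error are illustrative assumptions, not the actual model fitted in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 2, 2
q_mf = np.zeros((n_states, n_actions))   # model-free action values
rel_mf, rel_mb = 0.5, 0.5                # running reliability estimates
alpha, eta, beta = 0.1, 0.2, 5.0         # value, reliability, arbitration rates


def arbitration_weight(rel_mb, rel_mf, beta=5.0):
    """Probability of assigning control to the MB system,
    a sigmoid of the reliability difference (illustrative form)."""
    return 1.0 / (1.0 + np.exp(-beta * (rel_mb - rel_mf)))


for t in range(200):
    s = rng.integers(n_states)
    a = rng.integers(n_actions)
    r = float(rng.random() < 0.7)        # stochastic binary reward

    # MF reward prediction error and TD update
    rpe = r - q_mf[s, a]
    q_mf[s, a] += alpha * rpe

    # Stand-in for an MB-side prediction error (e.g., a state
    # prediction error); drawn randomly here purely for illustration.
    spe = 0.5 * rng.random()

    # A system is deemed reliable when its prediction errors are small.
    rel_mf += eta * ((1.0 - abs(rpe)) - rel_mf)
    rel_mb += eta * ((1.0 - abs(spe)) - rel_mb)

w_mb = arbitration_weight(rel_mb, rel_mf, beta)
```

Because the stand-in MB errors are kept small while MF errors stay large under stochastic reward, the sketch typically hands control to the MB system; in the actual framework both error signals are computed from the task structure rather than drawn at random.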
| Item Type: | Thesis (Dissertation (Ph.D.)) |
|---|---|
| Subject Keywords: | reinforcement learning; individual differences; functional MRI; obsessive-compulsive tendency |
| Degree Grantor: | California Institute of Technology |
| Division: | Humanities and Social Sciences |
| Major Option: | Social and Decision Neuroscience |
| Thesis Availability: | Public (worldwide access) |
| Research Advisor(s): | |
| Thesis Committee: | |
| Defense Date: | 25 November 2024 |
| Non-Caltech Author Email: | weilunding1994 (AT) gmail.com |
| Record Number: | CaltechTHESIS:01222025-071958641 |
| Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:01222025-071958641 |
| DOI: | 10.7907/v41g-an22 |
| ORCID: | |
| Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 16957 |
| Collection: | CaltechTHESIS |
| Deposited By: | Weilun Ding |
| Deposited On: | 10 Feb 2025 18:03 |
| Last Modified: | 18 Feb 2025 18:02 |
Thesis Files
PDF (Final Version), 10MB. See Usage Policy.