
AI for Scientists: Accelerating Discovery Through Knowledge, Data, and Learning


Sun, Jennifer Jianing (2024) AI for Scientists: Accelerating Discovery Through Knowledge, Data, and Learning. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/d6y8-4590.


With rapidly growing amounts of experimental data, machine learning is increasingly crucial for automating scientific data analysis. However, many real-world workflows demand expert-in-the-loop attention and require models that interface not only with data, but also with experts and domain knowledge. My research develops full-stack solutions that enable scientists to scalably extract insights from diverse and messy experimental data with minimal supervision. My approaches learn from both data and expert knowledge, while exploiting the right level of domain knowledge for generalization. This thesis presents progress towards developing automated scientist-in-the-loop solutions, including methods that automatically discover meaningful structure from data, such as self-supervised keypoints from videos of diverse behaving organisms. It then discusses methods that use these interpretable structures to inject domain knowledge into the learning process, such as guiding representation learning using symbolic programs of behavioral features computed from keypoints. This work is the result of close collaborations with domain experts, such as behavioral neuroscientists, to identify bottlenecks and integrate these methods into real-world workflows. My aim is to enable AI that collaborates with scientists to accelerate the scientific process.

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: machine learning; computer vision; AI for science; neurosymbolic models
Degree Grantor: California Institute of Technology
Division: Engineering and Applied Science
Major Option: Computing and Mathematical Sciences
Awards: Ben P.C. Chou Doctoral Prize in IST, 2023.
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Perona, Pietro (co-advisor)
  • Yue, Yisong (co-advisor)
Thesis Committee:
  • Bouman, Katherine L. (chair)
  • Perona, Pietro
  • Yue, Yisong
  • Chaudhuri, Swarat
  • Kennedy, Ann
Defense Date: 15 September 2023
Funding Agency (Grant Number):
  • Natural Sciences and Engineering Research Council of Canada (UNSPECIFIED)
  • Amazon AI4Science Fellowship (UNSPECIFIED)
  • Kortchak Scholarship (UNSPECIFIED)
Record Number: CaltechTHESIS:11162023-054627670
Persistent URL:
Related URLs (by description):
  • for Chapter 4 (two entries)
  • for Chapter 5 (two entries)
  • for Chapter 6
  • for Chapter 7
  • for Chapter 9
  • for Chapter 10 (two entries)
  • for Chapter 11
  • for Chapter 12
ORCID: Sun, Jennifer Jianing (0000-0002-0906-6589)
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 16247
Deposited By: Jennifer Sun
Deposited On: 02 Dec 2023 00:58
Last Modified: 17 Jun 2024 18:26

Thesis Files

PDF - Final Version
See Usage Policy.

