Citation
Sun, Jennifer Jianing (2024) AI for Scientists: Accelerating Discovery Through Knowledge, Data, and Learning. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/d6y8-4590. https://resolver.caltech.edu/CaltechTHESIS:11162023-054627670
Abstract
With rapidly growing amounts of experimental data, machine learning is increasingly crucial for automating scientific data analysis. However, many real-world workflows demand expert-in-the-loop attention and require models that interface not only with data, but also with experts and domain knowledge. My research develops full-stack solutions that enable scientists to scalably extract insights from diverse and messy experimental data with minimal supervision. My approaches learn from both data and expert knowledge, while exploiting the right level of domain knowledge for generalization. This thesis presents progress towards developing automated scientist-in-the-loop solutions, including methods that automatically discover meaningful structure from data, such as self-supervised keypoints from videos of diverse behaving organisms. It then discusses methods that use these interpretable structures to inject domain knowledge into the learning process, such as guiding representation learning using symbolic programs of behavioral features computed from keypoints. This work is the result of close collaborations with domain experts, such as behavioral neuroscientists, to identify bottlenecks and integrate these methods into real-world workflows. My aim is to enable AI that collaborates with scientists to accelerate the scientific process.
| Item Type: | Thesis (Dissertation (Ph.D.)) |
|---|---|
| Subject Keywords: | machine learning; computer vision; AI for science; neurosymbolic models |
| Degree Grantor: | California Institute of Technology |
| Division: | Engineering and Applied Science |
| Major Option: | Computing and Mathematical Sciences |
| Awards: | Ben P.C. Chou Doctoral Prize in IST, 2023. |
| Thesis Availability: | Public (worldwide access) |
| Research Advisor(s): | |
| Thesis Committee: | |
| Defense Date: | 15 September 2023 |
| Funders: | |
| Record Number: | CaltechTHESIS:11162023-054627670 |
| Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:11162023-054627670 |
| DOI: | 10.7907/d6y8-4590 |
| Related URLs: | |
| ORCID: | |
| Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 16247 |
| Collection: | CaltechTHESIS |
| Deposited By: | Jennifer Sun |
| Deposited On: | 02 Dec 2023 00:58 |
| Last Modified: | 17 Jun 2024 18:26 |
Thesis Files
PDF
- Final Version, 43MB. See Usage Policy.