Learning-Based Perception for Robotics in Suboptimal Data Landscapes


Lee, Connor Tinghan (2024) Learning-Based Perception for Robotics in Suboptimal Data Landscapes. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/v4yf-pj25.


Autonomous robots are increasingly present in the world today and are used across a variety of settings and applications. To interact with their surroundings, robots typically use cameras to see the world, employing computer vision algorithms to comprehend rich visual information. While contemporary, learning-based computer vision models provide robots with an accurate and robust understanding of their surroundings, most off-the-shelf methods rely on supervised deep learning techniques, which require abundant labeled data to train and to prevent overfitting. However, in many robotic applications and settings, the data landscape is characterized by data scarcity and/or the lack of apparent supervisory signals. Since custom perception solutions are often required for robotic applications, direct adoption of common computer vision methods proves challenging.

In this thesis, we develop robotic perception approaches across three different applications that overcome the challenges of such data landscapes. First, we develop learning-based visual terrain-relative navigation (VTRN) approaches for high-altitude aerial vehicles. This is a problem for which relevant data is available, but made difficult by the lack of obvious supervisory signals related to the high-level navigation objective. In the first chapters of the thesis, we show the power of self-supervised learning approaches to increase VTRN robustness to seasonal and temporal variations that would otherwise debilitate such systems.

Next, we address the challenge of developing thermal semantic perception algorithms for aerial field robotics. Due to the specialized nature of field environments and the sensing modality, development of thermal vision algorithms under these conditions is often characterized by the lack of relevant data. We show how we develop various thermal semantic segmentation approaches in response to the evolving data constraints inherent in field robotic projects. In the final part of the thesis, we develop data-efficient, multispectral deep learning algorithms for autonomous driving applications, where the lack of data arises from the need for custom multispectral datasets that are synchronized and coregistered.

Item Type:Thesis (Dissertation (Ph.D.))
Subject Keywords:Machine learning, Computer Vision, Robotics
Degree Grantor:California Institute of Technology
Division:Engineering and Applied Science
Major Option:Space Engineering
Thesis Availability:Public (worldwide access)
Research Advisor(s):
  • Chung, Soon-Jo
Thesis Committee:
  • Watkins, Michael M. (chair)
  • Gkioxari, Georgia
  • Hadaegh, Fred Y.
  • Chung, Soon-Jo
Defense Date:22 May 2024
Record Number:CaltechTHESIS:06032024-181240783
Persistent URL:
Related URLs:Material used in Chapters 2, 3, 4, 5, 6, 7, and 8
ORCID:Lee, Connor Tinghan (0000-0002-5008-4092)
Default Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:16485
Deposited By: Connor Lee
Deposited On:06 Jun 2024 22:04
Last Modified:08 Jul 2024 19:10

Thesis Files

PDF - Final Version
See Usage Policy.

