Citation
Liu, Yang (2020) From Restoring Human Vision to Enhancing Computer Vision. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/sq58-z682. https://resolver.caltech.edu/CaltechTHESIS:06092020-120629159
Abstract
The central theme of this work is enabling vision, which includes two subtopics: restoring vision for blind humans, and enhancing computer vision models in visual recognition. Chapter 1 provides a gentle introduction to the relevant high-level principles of human visual computation and summarizes two fundamental questions that vision answers: "what" and "where." Chapters 2, 3, and 4 contain three published projects that are anchored by those two fundamental questions.
Chapter 2 introduces a cognitive assistant to restore visual function for blind humans by focusing on an interface powered by audio augmented reality. The assistant communicates the "what" and "where" aspects of visual scenes through a combination of natural language and spatialized sound. We experimentally demonstrated that the assistant enables many aspects of visual function for naive blind users.
Chapters 3 and 4 develop data augmentation methods to address the data inefficiency problem in neural-network-based visual recognition models. In Chapter 3, a 3D-simulation-based data augmentation method is developed for improving the generalization of visual classification models for rare classes. In Chapter 4, a fast and efficient data augmentation method is developed for the newly formulated panoptic segmentation task. The method improves the performance of state-of-the-art panoptic segmentation models and generalizes across dataset domains, sizes, model architectures, and backbones.
Item Type: | Thesis (Dissertation (Ph.D.))
---|---
Subject Keywords: | Vision, computer vision, blind
Degree Grantor: | California Institute of Technology
Division: | Engineering and Applied Science
Major Option: | Computation and Neural Systems
Thesis Availability: | Public (worldwide access)
Research Advisor(s): |
Thesis Committee: |
Defense Date: | 2 June 2020
Non-Caltech Author Email: | youngleoel (AT) outlook.com
Record Number: | CaltechTHESIS:06092020-120629159
Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:06092020-120629159
DOI: | 10.7907/sq58-z682
Related URLs: |
ORCID: |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 13808
Collection: | CaltechTHESIS
Deposited By: | Yang Liu
Deposited On: | 09 Jun 2020 21:17
Last Modified: | 17 Jun 2020 19:34
Thesis Files
PDF (Final Version), 19MB. See Usage Policy.