Citation
Chen, Bo (2016) Quantum of Vision. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/Z9057CWS. https://resolver.caltech.edu/CaltechTHESIS:05252016-140422108
Abstract
Visual inputs to artificial and biological visual systems are often quantized: cameras accumulate photons from the visual world, and the brain receives action potentials from visual sensory neurons. Collecting more information quanta leads to a longer acquisition time and better performance. In many visual tasks, collecting a small number of quanta is sufficient to solve the task well. The ability to determine the right number of quanta is pivotal in situations where visual information is costly to obtain, such as photon-starved or time-critical environments. In these situations, conventional vision systems that always collect a fixed and large amount of information are infeasible. I develop a framework that judiciously determines the number of information quanta to observe based on the cost of observation and the requirement for accuracy. The framework implements the optimal speed versus accuracy tradeoff when two assumptions are met, namely that the task is fully specified probabilistically and constant over time. I also extend the framework to address scenarios that violate the assumptions. I deploy the framework to three recognition tasks: visual search (where both assumptions are satisfied), scotopic visual recognition (where the model is not specified), and visual discrimination with unknown stimulus onset (where the model is dynamic over time). Scotopic classification experiments suggest that the framework leads to dramatic improvement in photon-efficiency compared to conventional computer vision algorithms. Human psychophysics experiments confirmed that the framework provides a parsimonious and versatile explanation for human behavior under time pressure in both static and dynamic environments.
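As a rough illustration of the idea of observing only as many quanta as needed, the sketch below applies Wald's sequential probability ratio test (SPRT, which appears among the record's subject keywords) to a stream of photon counts. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function name `sprt_poisson`, the Poisson count model, the two candidate rates, and the error targets `alpha` and `beta` are all hypothetical choices made for this example.

```python
import math


def sprt_poisson(counts, rate0, rate1, alpha=0.05, beta=0.05):
    """Wald's SPRT between two Poisson rates, fed one photon count per time bin.

    All names and the Poisson model are illustrative assumptions.
    Returns (decision, bins_used); decision is 0 or 1, or None if the
    counts run out before either threshold is crossed.
    """
    upper = math.log((1 - beta) / alpha)  # crossing above accepts rate1
    lower = math.log(beta / (1 - alpha))  # crossing below accepts rate0
    llr = 0.0  # accumulated log-likelihood ratio
    used = 0   # number of bins observed so far
    for used, k in enumerate(counts, start=1):
        # Log-likelihood ratio of one Poisson count k under rate1 vs rate0:
        # k * log(rate1 / rate0) - (rate1 - rate0); the k! terms cancel.
        llr += k * math.log(rate1 / rate0) - (rate1 - rate0)
        if llr >= upper:
            return 1, used
        if llr <= lower:
            return 0, used
    return None, used
```

With candidate rates 1.0 and 3.0 photons per bin, a stream of three-photon bins stops after three bins in favor of the higher rate, while a stream of empty bins stops after two in favor of the lower rate. Loosening `alpha` and `beta` moves the thresholds closer to zero, so the test stops sooner at the price of more errors, which is the speed versus accuracy tradeoff the abstract refers to.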
Item Type: | Thesis (Dissertation (Ph.D.))
---|---
Subject Keywords: | quantized vision, deep learning, scotopic vision, visual search, SPRT
Degree Grantor: | California Institute of Technology
Division: | Engineering and Applied Science
Major Option: | Computation and Neural Systems
Thesis Availability: | Public (worldwide access)
Research Advisor(s): |
Thesis Committee: |
Defense Date: | 12 May 2016
Non-Caltech Author Email: | bochen.caltech (AT) gmail.com
Funders: |
Record Number: | CaltechTHESIS:05252016-140422108
Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:05252016-140422108
DOI: | 10.7907/Z9057CWS
Related URLs: |
ORCID: |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 9754
Collection: | CaltechTHESIS
Deposited By: | Bo Chen
Deposited On: | 26 May 2016 19:19
Last Modified: | 08 Nov 2023 00:44
Thesis Files
- PDF (Full thesis), Final Version. See Usage Policy. 4MB
- PDF (Front-matter + Chapter 1-3), Final Version. See Usage Policy. 2MB
- PDF (Chapter 4), Final Version. See Usage Policy. 1MB
- PDF (Chapter 5), Final Version. See Usage Policy. 1MB
- PDF (Chapter 6-7 + Appendix), Final Version. See Usage Policy. 646kB