Discriminative vs. Generative Object Recognition: Objects, Faces, and the Web

Citation

Holub, Alex David (2007) Discriminative vs. Generative Object Recognition: Objects, Faces, and the Web. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/2HC2-K923. https://resolver.caltech.edu/CaltechETD:etd-05312007-204007

Abstract

The ability to automatically identify and recognize objects in images remains one of the most challenging and potentially useful problems in computer vision. Despite significant progress over the past decade, computers are not yet close to matching human performance. This thesis develops various machine learning approaches for improving the ability of computers to recognize object categories. In particular, it focuses on approaches that can distinguish between object categories which are visually similar to one another. Examples of visually similar object categories are motorcycles and bicycles, and lions and cougars. Distinguishing between similar object categories may require different algorithms than distinguishing between dissimilar categories. We explore two common machine learning paradigms, generative and discriminative learning, and analyze their respective abilities to distinguish between different sets of object categories. One set of object categories to which we are exposed on a daily basis is faces, and a significant portion of this thesis is spent analyzing different methods for accurately representing and discriminating between faces. We also address a key issue related to the discriminative learning paradigm, namely how to collect the large training set of images necessary to accurately learn discriminative models. In particular, we suggest a novel active learning approach which intelligently chooses the most informative image to label and thus drastically reduces (up to 10x) the time required to collect a training set. We validate and analyze our algorithms on large data-sets collected from the web and show how using hybrid generative-discriminative techniques can drastically outperform previous algorithms.
In addition, we show how to use our techniques in practical applications such as finding similar-looking individuals within large data-sets of faces, discriminating between large sets of visual categories, and increasing the efficiency and speed of web-image search.

Item Type:	Thesis (Dissertation (Ph.D.))
Subject Keywords:	computer vision; machine learning; object recognition; statistical learning
Degree Grantor:	California Institute of Technology
Division:	Engineering and Applied Science
Major Option:	Computation and Neural Systems
Thesis Availability:	Public (worldwide access)
Research Advisor(s):	Perona, Pietro
Thesis Committee:	Perona, Pietro (chair); Welling, Max; Burl, Michael C.; Shimojo, Shinsuke; Abu-Mostafa, Yaser S.
Defense Date:	30 April 2007
Record Number:	CaltechETD:etd-05312007-204007
Persistent URL:	https://resolver.caltech.edu/CaltechETD:etd-05312007-204007
DOI:	10.7907/2HC2-K923
Default Usage Policy:	No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:	2344
Collection:	CaltechTHESIS
Deposited By:	Imported from ETD-db
Deposited On:	04 Jun 2007
Last Modified:	08 Nov 2023 00:44

Thesis Files

PDF (holub_thesis.pdf) - Final Version
See Usage Policy.
17MB
