Multiscale Methods, Parallel Computation, and Neural Networks for Real-Time Computer Vision

Citation

Battiti, Roberto (1990) Multiscale Methods, Parallel Computation, and Neural Networks for Real-Time Computer Vision. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/4q6t-2b57. https://resolver.caltech.edu/CaltechETD:etd-06072007-074441

Abstract

This thesis presents new algorithms for low- and intermediate-level computer vision.

The guiding ideas of the approach are hierarchical and adaptive processing, concurrent computation, and supervised learning.

Processing the visual data at different resolutions is used not only to reduce the amount of computation necessary to reach the fixed point, but also to produce a more accurate estimate of the desired parameters. The presented adaptive multiple-scale technique is applied to the problem of motion field estimation. Different parts of the image are analyzed at a resolution chosen to minimize the error in the coefficients of the differential equations to be solved. Tests with video-acquired images show that, over a wide range of motion, velocity estimation is more accurate than with the homogeneous scheme. In some cases, explicit discontinuities coupled to the continuous variables can be introduced to avoid propagation of visual information between areas corresponding to objects with different physical and/or kinematic properties.

The human visual system uses concurrent computation to process the vast amount of visual data in "real time." Although the technological constraints are different, parallel computation can also be used efficiently for computer vision. All the presented algorithms have been implemented on medium-grain distributed-memory multicomputers with a speed-up approximately proportional to the number of processors used. A simple two-dimensional domain decomposition assigns regions of the multiresolution pyramid to the different processors. The inter-processor communication needed during the solution process is proportional to the linear dimension of the assigned domain, so that efficiency is close to 100% if a large region is assigned to each processor.
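The scaling argument in the preceding paragraph can be illustrated with a rough cost model. This is only a sketch under stated assumptions, not the model used in the thesis: each processor is assumed to hold a square region of n x n pixels, to do work proportional to the area n*n, and to exchange data along its four edges (4*n values). The function name `parallel_efficiency` and the parameter `comm_to_calc_ratio` (cost of communicating one boundary value relative to updating one pixel) are hypothetical.

```python
def parallel_efficiency(n, comm_to_calc_ratio=1.0):
    """Estimated efficiency for a square n x n region per processor.

    Assumed model (not taken from the thesis):
      computation time   ~ n * n                        (area of the region)
      communication time ~ 4 * n * comm_to_calc_ratio   (perimeter of the region)
    Efficiency is the fraction of the total time spent computing.
    """
    calc = n * n
    comm = 4 * n * comm_to_calc_ratio
    return calc / (calc + comm)


# Efficiency approaches 1 as the region assigned to each processor grows.
for side in (16, 64, 256):
    print(side, parallel_efficiency(side))
```

Because the communication term grows with the perimeter while the computation term grows with the area, the ratio of the two falls as 1/n, which is why large regions per processor give efficiency close to 100% in this model.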

Finally, learning algorithms are shown to be a viable technique for engineering computer vision systems for different applications, starting from multiple-purpose modules. In the last part of the thesis, a well-known optimization method (the Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton method) is applied to simple classification problems and shown to be superior to the "error back-propagation" algorithm in numerical stability, automatic selection of parameters, and convergence properties.
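For readers unfamiliar with the optimization method named above, the following is a minimal sketch of a memoryless BFGS iteration in its textbook form. It is not the implementation from the thesis. The backtracking line search, the safeguards, and the names `memoryless_bfgs`, `quad`, and `quad_grad` are assumptions made for illustration, and a small quadratic stands in for a network's error function. The key property is that the search direction is built from only the latest step s and gradient change y, so no matrix is stored, and the step length comes from the line search rather than from a hand-tuned learning rate.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def memoryless_bfgs(f, grad, x0, tol=1e-6, max_iter=500):
    """Minimize f with a memoryless BFGS direction and Armijo backtracking."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]  # first direction: steepest descent
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            break
        slope = dot(g, d)
        if slope >= 0.0:  # safeguard: restart if d is not a descent direction
            d = [-gi for gi in g]
            slope = dot(g, d)
        # Backtracking line search: halve the step until sufficient decrease.
        # If no step qualifies after 60 halvings, the last tiny step is taken.
        t, fx = 1.0, f(x)
        for _ in range(60):
            x_new = [xi + t * di for xi, di in zip(x, d)]
            if f(x_new) <= fx + 1e-4 * t * slope:
                break
            t *= 0.5
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # step taken
        y = [a - b for a, b in zip(g_new, g)]  # change in gradient
        ys = dot(y, s)
        if ys > 1e-10 * math.sqrt(dot(s, s) * dot(y, y)):
            # d = -H g_new, with H the BFGS update of the identity matrix:
            # H = I - (s y^T + y s^T)/(y.s) + (1 + y.y/y.s) s s^T/(y.s)
            sg, yg, yy = dot(s, g_new), dot(y, g_new), dot(y, y)
            coef_s = yg / ys - (1.0 + yy / ys) * sg / ys
            coef_y = sg / ys
            d = [-gi + coef_s * si + coef_y * yi
                 for gi, si, yi in zip(g_new, s, y)]
        else:
            d = [-gi for gi in g_new]  # curvature test failed: restart
        x, g = x_new, g_new
    return x


# Illustrative objective (hypothetical): a poorly scaled quadratic bowl.
def quad(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def quad_grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min = memoryless_bfgs(quad, quad_grad, [0.0, 0.0])
print(x_min)
```

To train a classifier with this routine, `f` would return the total error over the training set as a function of the flattened weight vector and `grad` would return its gradient, which back-propagation can supply.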

Item Type:

Thesis (Dissertation (Ph.D.))

Subject Keywords:

(Computation and Neural Systems)

Degree Grantor:

California Institute of Technology

Division:

Biology

Major Option:

Computation and Neural Systems

Thesis Availability:

Public (worldwide access)

Research Advisor(s):

Fox, Geoffrey C.

Thesis Committee:

Hopfield, John J. (chair)
Allman, John Morgan
Fox, Geoffrey C.
Furmanski, Wojtek
Koch, Christof

Defense Date:

30 November 1989

Funders:

Funding Agency	Grant Number
Department of Energy (DOE)	DE-FG-03-85ER25009
NSF	IST-8700064
IBM	UNSPECIFIED

Record Number:

CaltechETD:etd-06072007-074441

Persistent URL:

https://resolver.caltech.edu/CaltechETD:etd-06072007-074441

DOI:

10.7907/4q6t-2b57

Default Usage Policy:

No commercial reproduction, distribution, display or performance rights in this work are provided.

ID Code:

2496

Collection:

CaltechTHESIS

Deposited By:

Imported from ETD-db

Deposited On:

13 Jun 2007

Last Modified:

31 Aug 2022 00:24

Thesis Files

PDF - Final Version
See Usage Policy.
5MB
