Battiti, Roberto (1990) Multiscale methods, parallel computation, and neural networks for real-time computer vision. Dissertation (Ph.D.), California Institute of Technology. http://resolver.caltech.edu/CaltechETD:etd-06072007-074441
This thesis presents new algorithms for low and intermediate level computer vision.
The guiding ideas in the presented approach are those of hierarchical and adaptive processing, concurrent computation, and supervised learning.
Processing of the visual data at different resolutions is used not only to reduce the amount of computation necessary to reach the fixed point, but also to produce a more accurate estimation of the desired parameters. The presented adaptive multiple scale technique is applied to the problem of motion field estimation. Different parts of the image are analyzed at a resolution that is chosen in order to minimize the error in the coefficients of the differential equations to be solved. Tests with video-acquired images show that velocity estimation is more accurate over a wide range of motion with respect to the homogeneous scheme. In some cases introduction of explicit discontinuities coupled to the continuous variables can be used to avoid propagation of visual information from areas corresponding to objects with different physical and/or kinematic properties.
The human visual system uses concurrent computation in order to process the vast amount of visual data in "real-time." Although with different technological constraints, parallel computation can be used efficiently for computer vision. All the presented algorithms have been implemented on medium grain distributed memory multicomputers with a speed-up approximately proportional to the number of processors used. A simple two-dimensional domain decomposition assigns regions of the multiresolution pyramid to the different processors. The inter-processor communication needed during the solution process is proportional to the linear dimension of the assigned domain, so that efficiency is close to 100% if a large region is assigned to each processor.
Finally, learning algorithms are shown to be a viable technique to engineer computer vision systems for different applications starting from multiple-purpose modules. In the last part of the thesis a well known optimization method (the Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton method) is applied to simple classification problems and shown to be superior to the "error back-propagation" algorithm for numerical stability, automatic selection of parameters, and convergence properties.
|Item Type:||Thesis (Dissertation (Ph.D.))|
|Degree Grantor:||California Institute of Technology|
|Division:||Engineering and Applied Science|
|Major Option:||Computation and Neural Systems|
|Thesis Availability:||Restricted to Caltech community only|
|Defense Date:||30 November 1989|
|Default Usage Policy:||No commercial reproduction, distribution, display or performance rights in this work are provided.|
|Deposited By:||Imported from ETD-db|
|Deposited On:||13 Jun 2007|
|Last Modified:||26 Dec 2012 02:52|
- Final Version
Restricted to Caltech community only
See Usage Policy.
Repository Staff Only: item control page