Citation
Greenspan, Hayit (1994) Multi-resolution image processing and learning for texture recognition and image enhancement. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/KR83-J714. https://resolver.caltech.edu/CaltechETD:etd-10192005-104013
Abstract
A general recognition framework is presented that consists of multi-resolution pyramidal feature-extraction and learning paradigms for classification. The system is presented in the context of the texture recognition task. In the feature extraction part of the system, an oriented Laplacian pyramid is used as an efficient filtering scheme to transform the input image to a more robust representation in the frequency and orientation space. An optimal technique is presented for computing a steerable representation of the pyramid. Steerability is used to generate a rotation-invariant input representation. In the learning stage of the system we focus on a rule-based probabilistic learning scheme. This information-theoretic technique is utilized to find the most informative correlations between the attributes and the output classes while producing probability estimates for the outputs. Both unsupervised and supervised learning are utilized. Apart from the rule-based approach we experiment with other non-parametric classifiers, such as the k-nearest neighbor classifier and the Backprop neural-network. We demonstrate experimentally that our scheme improves significantly upon the state-of-the-art both in rotation-invariant classification and in orientation estimation. A variety of applications are presented, including autonomous navigation scenarios and remote-sensing, as possible extensions for the texture recognition system. A generalization of the system to face-recognition is discussed.

In the latter part of the thesis, a procedure for creating images with higher resolution than the sampling rate would allow is described. The enhancement algorithm augments the frequency content of the image by using a non-linearity that generates phase-coherent higher harmonics. The procedure utilizes the Laplacian pyramid image representation. Results are presented depicting the power-spectra augmentation and the visual enhancement of several images. Simplicity of computations and ease of implementation allow for real-time applications such as high-definition television (HDTV). An initial investigation is pursued to combine the enhancement scheme with pyramid coding schemes.
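Both parts of the abstract rest on the Laplacian pyramid representation. As a rough illustration only, the sketch below builds and inverts a plain, non-oriented Laplacian pyramid on a 1-D signal in pure Python. It is not the thesis's oriented or steerable pyramid, and the 5-tap binomial kernel, the border clamping, and all function names are assumptions made for this example.

```python
# Minimal 1-D Laplacian pyramid sketch (illustrative assumptions throughout;
# the thesis works on 2-D images with an oriented pyramid).

KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # assumed binomial low-pass filter


def blur(signal):
    """Low-pass filter a list of floats, clamping indices at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def reduce_level(signal):
    """Blur, then keep every second sample."""
    return blur(signal)[::2]


def expand_level(signal, size):
    """Upsample to `size` samples by zero insertion, then blur (gain of 2)."""
    up = [0.0] * size
    for i, v in enumerate(signal):
        if 2 * i < size:
            up[2 * i] = 2.0 * v
    return blur(up)


def build_laplacian_pyramid(signal, levels):
    """Return `levels` band-pass layers followed by the low-pass residual."""
    pyramid = []
    current = [float(v) for v in signal]
    for _ in range(levels):
        smaller = reduce_level(current)
        predicted = expand_level(smaller, len(current))
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert the pyramid by expanding and adding the layers back, coarse to fine."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        predicted = expand_level(current, len(band))
        current = [b + p for b, p in zip(band, predicted)]
    return current


signal = [float(i * i % 7) for i in range(16)]
pyramid = build_laplacian_pyramid(signal, levels=3)
restored = reconstruct(pyramid)
```

Each band-pass layer stores the difference between a level and the expanded version of the next coarser level, so adding the layers back in reverse order recovers the input up to floating-point rounding.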
Item Type: | Thesis (Dissertation (Ph.D.)) |
---|---|
Subject Keywords: | (Electrical Engineering) |
Degree Grantor: | California Institute of Technology |
Division: | Engineering and Applied Science |
Major Option: | Electrical Engineering |
Thesis Availability: | Public (worldwide access) |
Research Advisor(s): | |
Thesis Committee: | |
Defense Date: | 20 May 1994 |
Record Number: | CaltechETD:etd-10192005-104013 |
Persistent URL: | https://resolver.caltech.edu/CaltechETD:etd-10192005-104013 |
DOI: | 10.7907/KR83-J714 |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
ID Code: | 4174 |
Collection: | CaltechTHESIS |
Deposited By: | Imported from ETD-db |
Deposited On: | 19 Oct 2005 |
Last Modified: | 31 Aug 2022 00:16 |
Thesis Files
PDF (Greenspan_h_1994.pdf) - Final Version. See Usage Policy. 15MB