A Caltech Library Service

Methods for the Analysis of Visual Motion


Feng, Xiaolin (2002) Methods for the Analysis of Visual Motion. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/SAVS-VJ68.


Vision is a primary sense that allows human beings to interact with their environment, and motion is one of the most important cues that vision can explore and utilize. In this thesis, we present computational approaches to two problems: inferring three-dimensional motion information and perceiving two-dimensional human motion from a sequence of images captured by a camera. The three-dimensional structure of the world can be represented by distinguishable features, such as points. If all the features move under the same rigid motion in space, this motion can be recovered from the projections of the features in three views by solving a set of trilinear constraints. The trilinear constraints have previously been treated only as algebraic equations, so their satisfactory performance in motion estimation has been hard to explain. This thesis resolves the puzzle by discovering a geometric interpretation of the trilinear constraints. It is shown that these algebraic equations correspond to depth errors appropriately weighted by a function of the relative reliability of the corresponding measurements. When the assumption is relaxed to allow features to move under different rigid motions, this thesis proposes a three-dimensional motion-based expectation-maximization algorithm, combined with the modified separation matrix scheme, that clusters the features undergoing the same motion into a group and simultaneously estimates the motion of every group. The problem of detecting and recognizing human motion arises in many computer vision applications. This thesis describes an algorithm that detects human bodies from their motion patterns in a pair of frames, based on learning an approximate probabilistic model of the positions and velocities of body joints. It then presents a scheme for recognizing human actions in a sequence of frames, assuming the human body has been detected. This scheme enables us to simultaneously recognize both the action and the body poses in the observed sequence. All our theoretical work is supported by experimental results.

Item Type: Thesis (Dissertation (Ph.D.))
Subject Keywords: Electrical Engineering
Degree Grantor: California Institute of Technology
Division: Engineering and Applied Science
Major Option: Electrical Engineering
Thesis Availability: Public (worldwide access)
Research Advisor(s):
  • Perona, Pietro
Thesis Committee:
  • Perona, Pietro (chair)
  • Burdick, Joel Wakeman
  • Koch, Christof
  • Psaltis, Demetri
Defense Date: 8 May 2002
Record Number: CaltechTHESIS:10072010-140426638
Persistent URL:
Default Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 6118
Deposited By: Dan Anguka
Deposited On: 07 Oct 2010 21:52
Last Modified: 08 Nov 2023 00:44


