Citation
Magdon-Ismail, Malik (1998) Supervised learning in probabilistic environments. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/6Y8S-4442. https://resolver.caltech.edu/CaltechETD:etd-09232005-143548
Abstract
NOTE: Text or symbols not renderable in plain ASCII are indicated by [...].

For a wide class of learning systems and different noise models, we bound the test performance in terms of the noise level and the number of data points. We obtain O(1/N) convergence to the best hypothesis, with the rate of convergence depending on the noise level and on the target complexity with respect to the learning model. Our results can be applied to estimate the model limitation, which we illustrate in the financial markets; changes in model limitation can be used to track changes in volatility.

We analyze regularization in generalized linear models, focusing on weight decay. For a well-specified linear model, the optimal regularization parameter decreases as [...]. When the data is noiseless, regularization is harmful. For a misspecified linear model, the "degree" of misspecification has an effect analogous to noise. For more general learning systems we develop EXPLOVA (explanation of variance), which also enables us to derive a condition on the learning model for regularization to help. We emphasize the necessity of prior information for effective regularization.

By counting functions on a discretized grid, we develop a framework for incorporating prior knowledge about the target function into the learning process. Using this framework, we derive a direct connection between smoothness priors and Tikhonov regularization, in addition to the regularization terms implied by other priors. We prove a No Free Lunch result for noise prediction: when the prior over target functions is uniform, the data set conveys no information about the noise distribution. We then consider using maximum likelihood to predict non-stationary noise variance in time series. Maximum likelihood leads to systematic errors that favor lower variance; we discuss the systematic correction of these errors.

We develop stochastic and deterministic techniques for density estimation based on approximating the distribution function, thus placing density estimation within the supervised learning framework. We prove consistency of the estimators and obtain convergence rates in L1 and L2. We also develop approaches to random variate generation based on "inverting" the density estimation procedure and based on a control formulation. Throughout, we use multilayer neural networks to illustrate our methods.
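The abstract's remark that maximum likelihood "favors lower variance" can be illustrated with a minimal numerical sketch. This is not code from the thesis: it assumes the simplest stationary case of i.i.d. Gaussian samples (rather than the non-stationary time-series setting the thesis studies), where the maximum-likelihood variance estimator divides by N and so satisfies E[σ̂²] = (1 − 1/N)σ²; multiplying by N/(N − 1) removes the systematic error.

```python
import numpy as np

# Illustrative assumption (not from the thesis): i.i.d. Gaussian data with a
# fixed true variance, small sample size N, averaged over many trials.
rng = np.random.default_rng(0)
sigma2_true = 4.0
N = 10
trials = 100_000

samples = rng.normal(0.0, np.sqrt(sigma2_true), size=(trials, N))

# Maximum-likelihood estimate divides by N and is biased low by (N-1)/N.
ml_est = samples.var(axis=1, ddof=0)
# Dividing by N-1 instead applies the systematic correction.
corrected = samples.var(axis=1, ddof=1)

print(f"true variance:       {sigma2_true:.3f}")
print(f"mean ML estimate:    {ml_est.mean():.3f}  (close to (N-1)/N * sigma^2 = {sigma2_true * (N - 1) / N:.3f})")
print(f"mean corrected:      {corrected.mean():.3f}")
```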
Item Type: | Thesis (Dissertation (Ph.D.)) |
---|---|
Subject Keywords: | electrical engineering |
Degree Grantor: | California Institute of Technology |
Division: | Engineering and Applied Science |
Major Option: | Electrical Engineering |
Awards: | Charles and Ellen Wilts Prize, 1998 |
Thesis Availability: | Public (worldwide access) |
Research Advisor(s): | |
Thesis Committee: | |
Defense Date: | 19 May 1998 |
Record Number: | CaltechETD:etd-09232005-143548 |
Persistent URL: | https://resolver.caltech.edu/CaltechETD:etd-09232005-143548 |
DOI: | 10.7907/6Y8S-4442 |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
ID Code: | 3728 |
Collection: | CaltechTHESIS |
Deposited By: | Imported from ETD-db |
Deposited On: | 26 Sep 2005 |
Last Modified: | 21 Dec 2019 04:07 |
Thesis Files
PDF (Magdon_m_1998.pdf) - Final Version | See Usage Policy. 8MB |