Incorporating Input Information into Learning and Augmented Objective Functions

Citation

Çataltepe, Zehra Kök (1998) Incorporating Input Information into Learning and Augmented Objective Functions. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/82JV-3D67. https://resolver.caltech.edu/CaltechETD:etd-10042005-104636

Abstract

In many applications, some form of input information, such as test inputs or extra inputs, is available. We incorporate input information into learning by an augmented error function, which is an estimator of the out-of-sample error. The augmented error consists of the training error plus an additional term scaled by the augmentation parameter. For general linear models, we analytically show that the augmented solution has smaller out-of-sample error than the least squares solution. For nonlinear models, we devise an algorithm to minimize the augmented error by gradient descent, determining the augmentation parameter using cross validation.
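As a rough illustration of the augmented error described above, the following Python sketch uses a one-parameter linear model g(x) = w * x. The abstract does not spell out the additional term, so the form used here (the model's mean squared output on the test inputs minus the same quantity on the training inputs) is an assumption made for illustration only. The function names, the data, and the value of the augmentation parameter are likewise hypothetical.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def augmented_error(w, train, test_inputs, alpha):
    """Training error plus alpha times an ASSUMED input-dependent term."""
    train_err = mean((w * x - y) ** 2 for x, y in train)
    # Assumed additional term (not taken from the abstract): mean squared
    # model output on the test inputs minus that on the training inputs.
    extra = (mean((w * z) ** 2 for z in test_inputs)
             - mean((w * x) ** 2 for x, _ in train))
    return train_err + alpha * extra

def augmented_solution(train, test_inputs, alpha):
    """Closed-form minimizer of augmented_error for g(x) = w * x."""
    # Setting the derivative with respect to w to zero gives
    #   w = mean(x*y) / ((1 - alpha) * mean(x^2) + alpha * mean(z^2)).
    sxy = mean(x * y for x, y in train)
    sxx = mean(x * x for x, _ in train)
    szz = mean(z * z for z in test_inputs)
    return sxy / ((1 - alpha) * sxx + alpha * szz)

# Hypothetical data: labeled training pairs and unlabeled test inputs.
train = [(0.5, 0.6), (1.0, 0.9), (1.5, 1.7)]
test_inputs = [2.0, 2.5, 3.0]

w_ls = augmented_solution(train, test_inputs, alpha=0.0)   # least squares
w_aug = augmented_solution(train, test_inputs, alpha=0.2)  # augmented
```

With alpha set to 0 the expression reduces to the ordinary least squares solution. In this sketch alpha is fixed by hand, whereas the abstract describes determining it by cross validation in the nonlinear case.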

Augmented objective functions also arise when hints are incorporated into learning. We first show that using the invariance hints to estimate the test error, and early stopping on this estimator, results in better solutions than the minimization of the training error. We also extend our algorithm for incorporating input information to the case of learning from hints.
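The idea of early stopping on a hint-based estimator can be sketched in Python as follows. The abstract does not say how the test error estimate is built from the invariance hints, so this sketch assumes a simple stand-in: the training error plus the mean squared violation of an evenness invariance g(x) = g(-x) on unlabeled inputs. The model, the data, the learning rate, and all function names are hypothetical.

```python
def model(w, x):
    # Hypothetical two-parameter model g(x) = w[0]*x + w[1]*x^2.
    return w[0] * x + w[1] * x * x

def train_error(w, data):
    return sum((model(w, x) - y) ** 2 for x, y in data) / len(data)

def hint_error(w, inputs):
    # Mean squared violation of the ASSUMED evenness invariance g(x) = g(-x).
    return sum((model(w, x) - model(w, -x)) ** 2 for x in inputs) / len(inputs)

def gradient_step(w, data, lr):
    # One gradient descent step on the training error only.
    n = len(data)
    g0 = sum(2 * (model(w, x) - y) * x for x, y in data) / n
    g1 = sum(2 * (model(w, x) - y) * x * x for x, y in data) / n
    return (w[0] - lr * g0, w[1] - lr * g1)

def early_stop(w0, step, estimate, n_steps):
    """Run n_steps updates and return the iterate with the lowest estimate."""
    w, best_w, best_val = w0, w0, estimate(w0)
    for _ in range(n_steps):
        w = step(w)
        val = estimate(w)
        if val < best_val:
            best_w, best_val = w, val
    return best_w, best_val

# Hypothetical noisy samples of an even target, plus unlabeled inputs.
data = [(-1.0, 1.2), (0.5, 0.1), (1.0, 0.9), (2.0, 4.3)]
unlabeled = [-1.5, 0.3, 0.8, 1.7]

def estimate(w):
    # Assumed estimator: training error plus hint error.
    return train_error(w, data) + hint_error(w, unlabeled)

best_w, best_val = early_stop(
    (0.5, 0.0), lambda w: gradient_step(w, data, 0.05), estimate, 200
)
```

Training still descends on the training error alone; the estimator is used only to choose which iterate to keep, which is the sense of "early stopping on this estimator" assumed here.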

Input information or hints are additional information about the target function. When the only available information is the training set, all the models with the same training error are equally likely to be the target. In that case, we show that early stopping of training at any training error level above the minimum cannot decrease the out-of-sample error. Our results are nonasymptotic for general linear models and the bin model, and asymptotic for nonlinear models. When additional information is available, early stopping can help.

Item Type:

Thesis (Dissertation (Ph.D.))

Subject Keywords:

augmented error function; early stopping of training; generalization error estimate; hints; learning from hints; NFL; no free lunch theorem; test error estimate

Degree Grantor:

California Institute of Technology

Division:

Engineering and Applied Science

Major Option:

Computer Science

Thesis Availability:

Public (worldwide access)

Research Advisor(s):

Abu-Mostafa, Yaser S.

Defense Date:

18 May 1998

Non-Caltech Author Email:

cataltepe (AT) itu.edu.tr

Record Number:

CaltechETD:etd-10042005-104636

Persistent URL:

https://resolver.caltech.edu/CaltechETD:etd-10042005-104636

DOI:

10.7907/82JV-3D67

ORCID:

Author	ORCID
Çataltepe, Zehra Kök	0000-0002-9742-5907

Default Usage Policy:

No commercial reproduction, distribution, display or performance rights in this work are provided.

ID Code:

3913

Collection:

CaltechTHESIS

Deposited By:

Imported from ETD-db

Deposited On:

04 Oct 2005

Last Modified:

10 Nov 2020 22:56

Thesis Files

PDF (Cataltepe_z_1998.pdf) - Final Version
See Usage Policy.
8MB
