Citation
Çataltepe, Zehra Kök (1998) Incorporating Input Information into Learning and Augmented Objective Functions. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/82JV-3D67. https://resolver.caltech.edu/CaltechETD:etd-10042005-104636
Abstract
In many applications, some form of input information, such as test inputs or extra inputs, is available. We incorporate input information into learning by an augmented error function, which is an estimator of the out-of-sample error. The augmented error consists of the training error plus an additional term scaled by the augmentation parameter. For general linear models, we analytically show that the augmented solution has smaller out-of-sample error than the least squares solution. For nonlinear models, we devise an algorithm to minimize the augmented error by gradient descent, determining the augmentation parameter using cross validation.
Augmented objective functions also arise when hints are incorporated into learning. We first show that using the invariance hints to estimate the test error, and early stopping on this estimator, results in better solutions than the minimization of the training error. We also extend our algorithm for incorporating input information to the case of learning from hints.
Input information or hints are additional information about the target function. When the only available information is the training set, all the models with the same training error are equally likely to be the target. In that case, we show that early stopping of training at any training error level above the minimum cannot decrease the out-of-sample error. Our results are nonasymptotic for general linear models and the bin model, and asymptotic for nonlinear models. When additional information is available, early stopping can help.
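The augmented error described in the first paragraph of the abstract (training error plus an additional term scaled by the augmentation parameter, minimized by gradient descent, with the parameter chosen by cross validation) can be illustrated with a minimal sketch. The abstract does not state the form of the additional term, so the one used here (the gap between the mean squared model output on extra inputs and on training inputs) is an assumption made for illustration only, as are the linear model, the function names, and the parameters `alpha`, `lr`, `steps`, and `folds`. This is not presented as the thesis's own algorithm.

```python
import numpy as np

def augmented_error(w, X, y, X_extra, alpha):
    """Training MSE plus alpha times an illustrative input-only term."""
    train_err = np.mean((X @ w - y) ** 2)
    # ASSUMED additional term (not taken from the abstract): gap between the
    # mean squared output on extra inputs and on training inputs.
    extra = np.mean((X_extra @ w) ** 2) - np.mean((X @ w) ** 2)
    return train_err + alpha * extra

def augmented_gradient(w, X, y, X_extra, alpha):
    """Gradient of augmented_error with respect to the weights w."""
    n, m = len(X), len(X_extra)
    grad_train = 2.0 / n * X.T @ (X @ w - y)
    grad_extra = 2.0 / m * X_extra.T @ (X_extra @ w) - 2.0 / n * X.T @ (X @ w)
    return grad_train + alpha * grad_extra

def minimize_augmented(X, y, X_extra, alpha, lr=0.05, steps=2000):
    """Plain gradient descent on the augmented error, starting from zero weights."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - lr * augmented_gradient(w, X, y, X_extra, alpha)
    return w

def choose_alpha(X, y, X_extra, alphas, folds=5):
    """Pick the alpha with the lowest mean validation MSE over k folds."""
    idx = np.arange(len(X))
    splits = np.array_split(idx, folds)
    scores = []
    for alpha in alphas:
        fold_errs = []
        for val in splits:
            train = np.setdiff1d(idx, val)
            w = minimize_augmented(X[train], y[train], X_extra, alpha)
            fold_errs.append(np.mean((X[val] @ w - y[val]) ** 2))
        scores.append(np.mean(fold_errs))
    return alphas[int(np.argmin(scores))]
```

With `alpha = 0` the sketch reduces to ordinary least squares by gradient descent, which gives a baseline against which an augmented solution can be compared. For this assumed term the objective stays convex for `alpha` between 0 and 1, which is why the usage below would restrict candidates to that range, for example `choose_alpha(X, y, X_extra, [0.0, 0.5, 1.0])`.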
Item Type: | Thesis (Dissertation (Ph.D.))
---|---
Subject Keywords: | augmented error function; early stopping of training; generalization error estimate; hints; learning from hints; NFL; no free lunch theorem; test error estimate
Degree Grantor: | California Institute of Technology
Division: | Engineering and Applied Science
Major Option: | Computer Science
Thesis Availability: | Public (worldwide access)
Research Advisor(s): |
Thesis Committee: |
Defense Date: | 18 May 1998
Non-Caltech Author Email: | cataltepe (AT) itu.edu.tr
Record Number: | CaltechETD:etd-10042005-104636
Persistent URL: | https://resolver.caltech.edu/CaltechETD:etd-10042005-104636
DOI: | 10.7907/82JV-3D67
ORCID: |
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 3913
Collection: | CaltechTHESIS
Deposited By: | Imported from ETD-db
Deposited On: | 04 Oct 2005
Last Modified: | 10 Nov 2020 22:56
Thesis Files
PDF (Cataltepe_z_1998.pdf) - Final Version. See Usage Policy. 8MB