Citation
MacKay, David J.C. (1992) Bayesian methods for adaptive models. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/H3A1-WM07. https://resolver.caltech.edu/CaltechETD:etd-01042007-131447
Abstract
The Bayesian framework for model comparison and regularisation is demonstrated by studying interpolation and classification problems modelled with both linear and non-linear models. This framework quantitatively embodies 'Occam's razor'. Over-complex and under-regularised models are automatically inferred to be less probable, even though their flexibility allows them to fit the data better.
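To make the evidence-based Occam's razor concrete, here is a minimal sketch assuming the linear-Gaussian setting of the thesis's interpolation chapters: a model t = Φw + noise with prior w ~ N(0, α⁻¹I) and noise precision β, for which the log marginal likelihood ("evidence") is available in closed form. The polynomial basis, the synthetic data, and the fixed hyperparameter values below are illustrative choices, not taken from the thesis.

```python
import numpy as np

def log_evidence(Phi, t, alpha, beta):
    """Log marginal likelihood of a linear model t = Phi w + noise,
    with prior w ~ N(0, alpha^{-1} I) and noise precision beta."""
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi          # posterior precision over w
    m = beta * np.linalg.solve(A, Phi.T @ t)            # most probable weights
    misfit = beta / 2 * np.sum((t - Phi @ m) ** 2) + alpha / 2 * (m @ m)
    _, logdetA = np.linalg.slogdet(A)
    return (M * np.log(alpha) + N * np.log(beta)
            - 2 * misfit - logdetA - N * np.log(2 * np.pi)) / 2

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
t = np.sin(np.pi * x) + 0.1 * rng.standard_normal(x.size)  # smooth curve plus noise

for degree in (1, 3, 9):                                   # candidate model complexities
    Phi = np.vander(x, degree + 1, increasing=True)
    print(f"degree {degree}: log evidence = {log_evidence(Phi, t, 1.0, 100.0):.1f}")
```

Typically the intermediate model wins: the high-degree polynomial achieves a smaller misfit, but the determinant term charges it for the prior volume it wastes, so its evidence is lower.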
When applied to 'neural networks', the Bayesian framework makes possible (1) objective comparison of solutions using alternative network architectures; (2) objective stopping rules for network pruning or growing procedures; (3) objective choice of type of weight decay terms (or regularisers); (4) on-line techniques for optimising weight decay (or regularisation constant) magnitude; (5) a measure of the effective number of well-determined parameters in a model; (6) quantified estimates of the error bars on network parameters and on network output. In the case of classification models, it is shown that the careful incorporation of error bar information into a classifier's predictions yields improved performance.
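Several of the quantities listed above have closed forms in the linear-Gaussian case. The sketch below, with illustrative names of my own, shows the fixed-point re-estimation of the weight-decay constant α (item 4), the effective number of well-determined parameters γ (item 5), and a "moderated" classifier output in which the logistic probability is shrunk towards 1/2 according to the error bar s² on the activation, in the spirit of the thesis's classification chapter.

```python
import numpy as np

def reestimate(Phi, t, alpha=1.0, beta=1.0, iters=100):
    """Evidence-framework fixed-point updates for the weight-decay
    constant alpha and noise precision beta of a linear model;
    gamma counts the well-determined parameters."""
    N, M = Phi.shape
    H = Phi.T @ Phi                           # unscaled data Hessian
    eig = np.linalg.eigvalsh(H)
    for _ in range(iters):
        lam = beta * eig
        gamma = np.sum(lam / (lam + alpha))   # effective number of parameters
        A = alpha * np.eye(M) + beta * H      # posterior precision over weights
        m = beta * np.linalg.solve(A, Phi.T @ t)
        alpha = gamma / (m @ m)               # alpha <- gamma / 2 E_W
        beta = (N - gamma) / np.sum((t - Phi @ m) ** 2)
    return alpha, beta, gamma

def moderated_output(a, s2):
    """Logistic output moderated by the error bar s2 on the activation a:
    kappa = (1 + pi s2 / 8)^(-1/2) shrinks confident predictions toward 1/2."""
    kappa = 1.0 / np.sqrt(1.0 + np.pi * s2 / 8.0)
    return 1.0 / (1.0 + np.exp(-kappa * a))
```

With these updates no validation set is needed to set the weight decay: γ measures how many parameter directions the data actually determine, and the remaining M − γ are governed by the prior.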
Comparisons of the inferences of the Bayesian framework with more traditional cross-validation methods help detect poor underlying assumptions in learning models.
The relationship of the Bayesian learning framework to 'active learning' is examined. Objective functions are discussed which measure the expected informativeness of data measurements, in the context of both interpolation and classification problems.
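For the linear-Gaussian model sketched above, the expected information gain about the weights from one extra noisy measurement at input x has the closed form ½ ln(1 + β φ(x)ᵀA⁻¹φ(x)), which is largest exactly where the error bars on the interpolant are widest. A minimal sketch, with assumed variable names:

```python
import numpy as np

def expected_info_gain(Phi_cand, A, beta):
    """Expected information gain (nats) about the weights from one new
    measurement at each candidate input: 1/2 ln(1 + beta phi^T A^{-1} phi)."""
    S = np.linalg.solve(A, Phi_cand.T)           # A^{-1} phi(x), one column per candidate
    var_w = np.einsum('nm,mn->n', Phi_cand, S)   # weight-uncertainty part of the error bar
    return 0.5 * np.log1p(beta * var_w)

# choose the next measurement where the expected gain is largest, e.g.:
# x_next = x_cand[np.argmax(expected_info_gain(Phi_cand, A, beta))]
```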
The concepts and methods described in this thesis are quite general and will be applicable to other data modelling problems whether they involve regression, classification or density estimation.
| Item Type: | Thesis (Dissertation (Ph.D.)) |
| --- | --- |
| Subject Keywords: | active learning; classification; complexity control; evidence; hyperparameters; hypothesis testing; interpolation; Laplace's method; marginal likelihood; model comparison; multi-layer perceptron; neural networks; Occam's razor; prediction; regression; regularization; supervised learning |
| Degree Grantor: | California Institute of Technology |
| Major Option: | Computation and Neural Systems |
| Thesis Availability: | Public (worldwide access) |
| Thesis Committee: | |
| Defense Date: | 10 December 1991 |
| Non-Caltech Author Email: | djcm1 (AT) cam.ac.uk |
| Record Number: | CaltechETD:etd-01042007-131447 |
| Persistent URL: | https://resolver.caltech.edu/CaltechETD:etd-01042007-131447 |
| DOI: | 10.7907/H3A1-WM07 |
| Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 25 |
| Collection: | CaltechTHESIS |
| Deposited By: | Imported from ETD-db |
| Deposited On: | 04 Jan 2007 |
| Last Modified: | 21 Dec 2019 04:09 |
Thesis Files
- PDF (MacKay_djc_1992.pdf) - Final Version. See Usage Policy. 9MB
- PDF (MacKay_djc_1992_revised_by_author.pdf) - Final Version. See Usage Policy. 1MB