Citation
MacKay, David J.C. (1992) Bayesian methods for adaptive models. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/H3A1-WM07. https://resolver.caltech.edu/CaltechETD:etd-01042007-131447
Abstract
The Bayesian framework for model comparison and regularisation is demonstrated by studying interpolation and classification problems modelled with both linear and non-linear models. This framework quantitatively embodies 'Occam's razor'. Over-complex and under-regularised models are automatically inferred to be less probable, even though their flexibility allows them to fit the data better.
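To make the evidence-based Occam's razor concrete, here is a minimal sketch assuming the linear-Gaussian setting of the thesis's interpolation chapters: a model t = Φw + noise with prior w ~ N(0, α⁻¹I) and noise precision β, for which the log marginal likelihood ("evidence") is available in closed form. The polynomial basis, the synthetic data, and the fixed hyperparameter values below are illustrative choices, not taken from the thesis.

```python
import numpy as np

def log_evidence(Phi, t, alpha, beta):
    """Log marginal likelihood of a linear model t = Phi w + noise,
    with prior w ~ N(0, alpha^{-1} I) and noise precision beta."""
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi          # posterior precision over w
    m = beta * np.linalg.solve(A, Phi.T @ t)            # most probable weights
    misfit = beta / 2 * np.sum((t - Phi @ m) ** 2) + alpha / 2 * (m @ m)
    _, logdetA = np.linalg.slogdet(A)
    return (M * np.log(alpha) + N * np.log(beta)
            - 2 * misfit - logdetA - N * np.log(2 * np.pi)) / 2

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
t = np.sin(np.pi * x) + 0.1 * rng.standard_normal(x.size)  # smooth curve plus noise

for degree in (1, 3, 9):                                   # candidate model complexities
    Phi = np.vander(x, degree + 1, increasing=True)
    print(f"degree {degree}: log evidence = {log_evidence(Phi, t, 1.0, 100.0):.1f}")
```

Typically the intermediate model wins: the high-degree polynomial achieves a smaller misfit, but the determinant term charges it for the prior volume it wastes, so its evidence is lower.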
When applied to 'neural networks', the Bayesian framework makes possible (1) objective comparison of solutions using alternative network architectures; (2) objective stopping rules for network pruning or growing procedures; (3) objective choice of type of weight decay terms (or regularisers); (4) on-line techniques for optimising weight decay (or regularisation constant) magnitude; (5) a measure of the effective number of well-determined parameters in a model; (6) quantified estimates of the error bars on network parameters and on network output. In the case of classification models, it is shown that the careful incorporation of error bar information into a classifier's predictions yields improved performance.
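Several of the quantities listed above have closed forms in the linear-Gaussian case. The sketch below, with illustrative names of my own, shows the fixed-point re-estimation of the weight-decay constant α (item 4), the effective number of well-determined parameters γ (item 5), and a "moderated" classifier output in which the logistic probability is shrunk towards 1/2 according to the error bar s² on the activation, in the spirit of the thesis's classification chapter.

```python
import numpy as np

def reestimate(Phi, t, alpha=1.0, beta=1.0, iters=100):
    """Evidence-framework fixed-point updates for the weight-decay
    constant alpha and noise precision beta of a linear model;
    gamma counts the well-determined parameters."""
    N, M = Phi.shape
    H = Phi.T @ Phi                           # unscaled data Hessian
    eig = np.linalg.eigvalsh(H)
    for _ in range(iters):
        lam = beta * eig
        gamma = np.sum(lam / (lam + alpha))   # effective number of parameters
        A = alpha * np.eye(M) + beta * H      # posterior precision over weights
        m = beta * np.linalg.solve(A, Phi.T @ t)
        alpha = gamma / (m @ m)               # alpha <- gamma / 2 E_W
        beta = (N - gamma) / np.sum((t - Phi @ m) ** 2)
    return alpha, beta, gamma

def moderated_output(a, s2):
    """Logistic output moderated by the error bar s2 on the activation a:
    kappa = (1 + pi s2 / 8)^(-1/2) shrinks confident predictions toward 1/2."""
    kappa = 1.0 / np.sqrt(1.0 + np.pi * s2 / 8.0)
    return 1.0 / (1.0 + np.exp(-kappa * a))
```

With these updates no validation set is needed to set the weight decay: γ measures how many parameter directions the data actually determine, and the remaining M − γ are governed by the prior.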
Comparisons of the inferences of the Bayesian framework with more traditional cross-validation methods help detect poor underlying assumptions in learning models.
The relationship of the Bayesian learning framework to 'active learning' is examined. Objective functions are discussed which measure the expected informativeness of data measurements, in the context of both interpolation and classification problems.
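For the linear-Gaussian model sketched above, the expected information gain about the weights from one extra noisy measurement at input x has the closed form ½ ln(1 + β φ(x)ᵀA⁻¹φ(x)), which is largest exactly where the error bars on the interpolant are widest. A minimal sketch, with assumed variable names:

```python
import numpy as np

def expected_info_gain(Phi_cand, A, beta):
    """Expected information gain (nats) about the weights from one new
    measurement at each candidate input: 1/2 ln(1 + beta phi^T A^{-1} phi)."""
    S = np.linalg.solve(A, Phi_cand.T)           # A^{-1} phi(x), one column per candidate
    var_w = np.einsum('nm,mn->n', Phi_cand, S)   # weight-uncertainty part of the error bar
    return 0.5 * np.log1p(beta * var_w)

# choose the next measurement where the expected gain is largest, e.g.:
# x_next = x_cand[np.argmax(expected_info_gain(Phi_cand, A, beta))]
```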
The concepts and methods described in this thesis are quite general and will be applicable to other data modelling problems whether they involve regression, classification or density estimation.
| Item Type: | Thesis (Dissertation (Ph.D.)) |
| --- | --- |
| Subject Keywords: | active learning; classification; complexity control; evidence; hyperparameters; hypothesis testing; interpolation; Laplace's method; marginal likelihood; model comparison; multi-layer perceptron; neural networks; Occam's razor; prediction; regression; regularization; supervised learning |
| Degree Grantor: | California Institute of Technology |
| Major Option: | Computation and Neural Systems |
| Thesis Availability: | Public (worldwide access) |
| Thesis Committee: | |
| Defense Date: | 10 December 1991 |
| Non-Caltech Author Email: | djcm1 (AT) cam.ac.uk |
| Record Number: | CaltechETD:etd-01042007-131447 |
| Persistent URL: | https://resolver.caltech.edu/CaltechETD:etd-01042007-131447 |
| DOI: | 10.7907/H3A1-WM07 |
| Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 25 |
| Collection: | CaltechTHESIS |
| Deposited By: | Imported from ETD-db |
| Deposited On: | 04 Jan 2007 |
| Last Modified: | 21 Dec 2019 04:09 |
Thesis Files
- PDF (MacKay_djc_1992.pdf) - Final Version. See Usage Policy. 9MB
- PDF (MacKay_djc_1992_revised_by_author.pdf) - Final Version. See Usage Policy. 1MB