Fritz Albregtsen (Institutt for informatikk): Parameter selection and error estimation in high dimensional spaces - From Occam through “Optimal Brain Damage” to low dimensionality spaces

Friday Colloquium

Abstract

Parameters are extracted from signals, images, time sequences of images, spectra, etc., in order to characterize objects or phenomena, or to differentiate between classes of objects or structures. Often, a multitude of parameters may be extracted, resulting in very high-dimensional parameter spaces.

We address the problem of parameter selection to obtain a robust and reliable model, and the problem of unbiased error estimation, when the number of possible parameter candidates is large and the number of observations (samples) is limited.

Simulations demonstrate that in order to find the parameters that actually differ between classes, the necessary ratio of training samples to parameter candidates depends on the number of candidates, the number of training samples, and the distance between the classes. The effect of the number of candidates is mostly ignored, but is actually critical for small data sets. Furthermore, the error estimate may be optimistically biased when parameter selection is performed on the same data as the error estimation. We illustrate with some suggested solutions from practical image analysis of typical data sets.
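The selection bias described above can be reproduced with a small simulation. The following is a minimal sketch and not the simulations referred to in the abstract. Everything in it is an illustrative assumption: the data are pure Gaussian noise (so no candidate truly differs between the two classes), candidates are ranked by the absolute difference of class means, the classifier is nearest-centroid, and the sizes (40 samples, 500 candidates, 5 selected) and all function names are hypothetical choices. It compares a leave-one-out error estimate where selection is done once on all data with one where selection is repeated inside every fold.

```python
import random


def class_means(X, y, features):
    """Per-class mean of each listed feature: {label: {feature: mean}}."""
    means = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = {j: sum(r[j] for r in rows) / len(rows) for j in features}
    return means


def select_features(X, y, k):
    """Rank all candidates by absolute class-mean difference; keep the top k."""
    p = len(X[0])
    m = class_means(X, y, range(p))
    ranked = sorted(range(p), key=lambda j: abs(m[0][j] - m[1][j]), reverse=True)
    return ranked[:k]


def predict(X_train, y_train, x, features):
    """Nearest-centroid classification using only the chosen features."""
    m = class_means(X_train, y_train, features)
    dist = {c: sum((x[j] - m[c][j]) ** 2 for j in features) for c in (0, 1)}
    return 0 if dist[0] <= dist[1] else 1


def loo_error(X, y, k, select_inside):
    """Leave-one-out error rate.

    select_inside=False: features are chosen once, using all samples,
    including the one that is later held out (the biased procedure).
    select_inside=True: features are re-chosen in every fold from the
    training part only.
    """
    n = len(X)
    fixed = None if select_inside else select_features(X, y, k)
    errors = 0
    for i in range(n):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = select_features(X_tr, y_tr, k) if select_inside else fixed
        errors += int(predict(X_tr, y_tr, X[i], feats) != y[i])
    return errors / n


# Assumed, illustrative sizes: few samples, many candidates, no real signal.
n_samples, n_candidates, n_selected, n_repeats = 40, 500, 5, 5
rng = random.Random(0)
biased, honest = [], []
for _ in range(n_repeats):
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_candidates)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # balanced, arbitrary labels
    biased.append(loo_error(X, y, n_selected, select_inside=False))
    honest.append(loo_error(X, y, n_selected, select_inside=True))

mean_biased = sum(biased) / n_repeats
mean_honest = sum(honest) / n_repeats
print(f"selection on all data:   mean LOO error = {mean_biased:.2f}")
print(f"selection inside folds:  mean LOO error = {mean_honest:.2f}")
```

Because the classes are identical by construction, no classifier can truly do better than chance. An estimate well below 50% in the first line of output is therefore produced by the selection step alone, while the second estimate should stay near chance level. Increasing `n_candidates` while keeping `n_samples` fixed should widen the gap, which is the effect of the number of candidates that the abstract points to.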

Published Aug. 10, 2009 16:50 - Last modified June 15, 2011 13:49