- S. Hochreiter and K. Obermayer. Feature Selection and Classification on
Matrix Data: From Large Margins to Small Covering Numbers.
In Advances in Neural Information Processing Systems 15, pages
913-920, Cambridge, Massachusetts, 2003. MIT Press.
(FTP Gzipped PostScript, 8 pages, 164 kb)
We investigate the problem of learning a classification task for
datasets which are described by matrices. Rows and columns of these matrices
correspond to objects, where row and column objects may belong to different
sets, and the entries in the matrix express the relationships between them.
We interpret the matrix elements as being produced by an unknown kernel which
operates on object pairs, and we show that, under mild assumptions, these
kernels correspond to dot products in some (unknown) feature space.
By minimizing a bound on the generalization error of a linear classifier,
obtained using covering numbers, we derive an objective function for
model selection according to the principle of structural risk minimization.
The new objective function has the advantage that it allows the analysis of
matrices which are not positive definite, and not even symmetric or square.
We then consider the case in which row objects are interpreted as features. We
suggest an additional constraint which imposes sparseness on the row objects,
and show that the method can then be used for feature selection. Finally, we
apply this method to data obtained from DNA microarrays, where
"column objects" correspond to samples, "row objects"
correspond to genes, and matrix elements correspond to expression levels.
We benchmark against standard one-gene classification, and against support
vector machines and K-nearest neighbors applied after standard feature selection. Our
new method extracts a sparse set of genes and provides superior
classification results.
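
The matrix setting described in the abstract can be illustrated with a small sketch. It is not the authors' method: it only shows how a non-square, non-symmetric matrix can arise from dot products between row objects and column objects, and how a sparse weight vector over row objects yields a linear score for each column object. The feature vectors, the dimensions, and the weights `alpha` are made up for illustration, and the scoring form (a weighted sum of matrix rows) is an assumption rather than something stated in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature vectors in a shared 4-dimensional space:
# 5 row objects (e.g. genes) and 3 column objects (e.g. samples).
row_features = rng.normal(size=(5, 4))
col_features = rng.normal(size=(3, 4))

# K[i, j] is the dot product between row object i and column object j.
# K is 5 x 3, so it is neither square nor symmetric.
K = row_features @ col_features.T
print(K.shape)  # (5, 3)

# Assumed sparse weights over row objects (chosen by hand, not learned).
alpha = np.array([0.0, 1.5, 0.0, -0.7, 0.0])

# Linear score for each column object: a weighted sum of the rows of K.
scores = alpha @ K
labels = np.sign(scores)

# Row objects with nonzero weight play the role of selected features.
selected = np.flatnonzero(alpha)
print(selected)  # [1 3]
```

In this sketch only two of the five row objects contribute to the scores, which is the sense in which sparseness over row objects acts as feature selection.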