Neuronale Informationsverarbeitung (NI)


  • S. Hochreiter and K. Obermayer. Gene Selection for Microarray Data. In B. Schölkopf, K. Tsuda, and J.-P. Vert, editors, Kernel Methods in Computational Biology, pages 319-356. MIT Press, Cambridge, Massachusetts, 2004.
    (FTP Gzipped PostScript, 52 pages, 374 kb)
    In this chapter we discuss methods for gene selection on data obtained with the microarray technique. Gene selection matters for microarray data for three reasons: (a) as a preprocessing step to improve the performance of classifiers or other predictors of sample attributes, (b) to discover relevant genes, that is, genes that show specific expression patterns across the given set of samples, and (c) to save costs, for example if the microarray technique is used for diagnostic purposes. We introduce a new feature selection method based on the support vector machine technique. The method extracts a sparse set of genes whose expression levels are important for predicting the class of a sample (for example, "positive" vs. "negative" therapy outcome for tumor samples from patients). For this purpose the support vector technique is used in a novel way: instead of constructing a classifier from a minimal set of most informative samples (the so-called support vectors), the classifier is constructed from a minimal set of most informative features. In contrast to previously proposed methods, features rather than samples now formally assume the role of support vectors. We introduce a protocol for preprocessing, feature selection, and evaluation of microarray data. Using this protocol we demonstrate the superior performance of our feature selection method on data sets obtained from patients with certain types of cancer (brain tumor, lymphoma, and breast cancer), where the outcome of chemotherapy or radiation therapy must be predicted from the gene expression profile. The feature selection method extracts genes (the so-called support genes) that are correlated with therapy outcome. For classifiers based on these genes, generalization performance is improved compared to previously proposed methods.