Generalized spike-and-slab priors for Bayesian group feature selection using expectation propagation

Hernández Lobato, Daniel; Hernández-Lobato, José Miguel; Dupont, Pierre

UAM_Biblioteca

Author

Hernández Lobato, Daniel; Hernández-Lobato, José Miguel; Dupont, Pierre

Entity

UAM. Departamento de Ingeniería Informática

Publisher

MIT Press

Date

2013

Citation

Journal of Machine Learning Research 14 (2013): 1891-1945

ISSN

1533-7928 (online); 1532-4435 (print)

Funded by

Daniel Hernández-Lobato and Pierre Dupont acknowledge support from the Spanish Dirección General de Investigación, project ALLS (TIN2010-21575-C02-02).

Editor's Version

http://www.jmlr.org/papers/v14/hernandez-lobato13a.html

Subjects

Group feature selection; Generalized spike-and-slab priors; Expectation propagation; Sparse linear model; Approximate inference; Sequential experimental design; Signal reconstruction; Computer science

URI

http://hdl.handle.net/10486/664106

Rights

© 2013 Daniel Hernández-Lobato, José Miguel Hernández-Lobato and Pierre Dupont

Abstract

We describe a Bayesian method for group feature selection in linear regression problems. The method is based on a generalized version of the standard spike-and-slab prior distribution, which is often used for individual feature selection. Exact Bayesian inference under this prior is infeasible for typical regression problems. However, approximate inference can be carried out efficiently using Expectation Propagation (EP). A detailed analysis of the generalized spike-and-slab prior shows that it is well suited for regression problems that are sparse at the group level. Furthermore, this prior can be used to introduce prior knowledge about specific groups of features that are a priori believed to be more relevant. An experimental evaluation compares the performance of the proposed method with that of group LASSO, Bayesian group LASSO, automatic relevance determination and additional variants used for group feature selection. The results of these experiments show that a model based on the generalized spike-and-slab prior and the EP algorithm has state-of-the-art prediction performance in the problems analyzed. Furthermore, this model is also useful for sequential experimental design (also known as active learning), where the most informative data instances are iteratively included in the training set, reducing the number of instances needed to obtain a particular level of prediction accuracy.
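
The group-level sparsity described in the abstract can be illustrated with a minimal sketch that draws coefficients from a group spike-and-slab prior. This is not the authors' code, and the parametrization is an assumption: each group gets one Bernoulli indicator, coefficients in an active group come from a zero-mean Gaussian slab, and coefficients in an inactive group are exactly zero. The names sample_group_spike_and_slab, p_active and slab_var are hypothetical, and the sketch covers only sampling from the prior, not EP inference.

```python
import numpy as np

def sample_group_spike_and_slab(groups, p_active, slab_var, rng):
    """Draw one coefficient vector from an assumed group spike-and-slab prior.

    groups   : integer array, groups[j] is the group index of feature j
    p_active : prior probability that a group is active (scalar or one per group)
    slab_var : variance of the Gaussian slab (assumed shared by all features)
    """
    groups = np.asarray(groups)
    n_groups = groups.max() + 1
    # One Bernoulli indicator per group: True means the whole group is active.
    z = rng.random(n_groups) < p_active
    # Slab draws for every feature, then zero out features of inactive groups.
    slab = rng.normal(0.0, np.sqrt(slab_var), size=groups.shape[0])
    w = np.where(z[groups], slab, 0.0)
    return w, z

rng = np.random.default_rng(0)
groups = np.repeat(np.arange(5), 4)  # 5 groups of 4 features each
w, z = sample_group_spike_and_slab(groups, p_active=0.3, slab_var=1.0, rng=rng)
```

Passing an array for p_active, one value per group, is one way to express the prior knowledge mentioned in the abstract: groups believed to be more relevant a priori receive a larger activation probability.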

Files in this item

Name

generalized_hernandez-lobato_JMLR_2013.pdf

Size

1.580 MB

Format

PDF

This item appears in the following Collection(s)

Producción científica en acceso abierto de la UAM [20411]
