A double pruning algorithm for classification ensembles
Entity
UAM. Departamento de Ingeniería Informática
Publisher
Springer Berlin Heidelberg
Date
2010
Citation
10.1007/978-3-642-12127-2_11
Multiple Classifier Systems: 9th International Workshop, MCS 2010, Cairo, Egypt, April 7-9, 2010. Proceedings. Lecture Notes in Computer Science, Volume 5997. Springer 2010. 104-113.
ISSN
0302-9743 (print); 1611-3349 (online)
ISBN
978-3-642-12126-5 (print); 978-3-642-12127-2 (online)
DOI
10.1007/978-3-642-12127-2_11
Editor's Version
http://dx.doi.org/10.1007/978-3-642-12127-2_11
Subjects
Ensemble pruning; Instance-based pruning; Ensemble learning; Decision trees; Computer science
Note
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-12127-2_11
Proceedings of the 9th International Workshop, MCS 2010, Cairo, Egypt, April 7-9, 2010.
Rights
© Springer-Verlag Berlin Heidelberg 2010
Abstract
This article introduces a double pruning algorithm that can be used to reduce the storage requirements, speed up the classification process, and improve the performance of parallel ensembles. A key element in the design of the algorithm is the estimation of the class label that the ensemble assigns to a given test instance by polling only a fraction of its classifiers. Instead of applying this form of dynamic (instance-based) pruning to the original ensemble, we propose to apply it to a subset of classifiers selected using standard ensemble pruning techniques. The pruned subensemble is built by first modifying the order in which classifiers are aggregated in the ensemble and then selecting the first classifiers in the ordered sequence. Experiments on benchmark problems illustrate the improvements that can be obtained with this technique. Specifically, using a bagging ensemble of 101 CART trees as a starting point, only the 21 trees of the pruned ordered ensemble need to be stored in memory. Depending on the classification task, on average, only 5 to 12 of these 21 classifiers are queried to compute the predictions. The generalization performance achieved by this double pruning algorithm is similar to that of pruned ordered bagging and significantly better than that of standard bagging.
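The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `prune_ordered` and `predict_with_early_stopping` are hypothetical, the ordering of the classifiers is taken as a given input rather than computed, and the stopping rule is a simple deterministic one (stop when the remaining votes can no longer change the winner), which is an assumption swapped in for whatever estimation procedure the paper actually uses.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

# A classifier is modelled as any callable mapping an instance to a class label.
Classifier = Callable[[object], Hashable]


def prune_ordered(classifiers: Sequence[Classifier],
                  order: Sequence[int],
                  keep: int) -> List[Classifier]:
    """Static pruning: keep the first `keep` classifiers of a given ordering.

    `order` is assumed to come from some ordered-aggregation heuristic;
    computing it is outside the scope of this sketch.
    """
    return [classifiers[i] for i in order[:keep]]


def predict_with_early_stopping(subensemble: Sequence[Classifier],
                                x: object) -> Tuple[Hashable, int]:
    """Dynamic (instance-based) pruning by majority vote with early stopping.

    Classifiers are polled one at a time. Polling stops as soon as the
    leading class is ahead by more votes than there are classifiers left,
    so the final majority vote cannot change.
    Returns the predicted label and the number of classifiers queried.
    """
    votes: Counter = Counter()
    total = len(subensemble)
    for queried, clf in enumerate(subensemble, start=1):
        votes[clf(x)] += 1
        ranked = votes.most_common(2)
        lead = ranked[0][1]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        remaining = total - queried
        if lead - runner_up > remaining:
            return ranked[0][0], queried
    # No early stop was possible: every classifier has been queried.
    return votes.most_common(1)[0][0], total


# Toy usage with constant "classifiers" standing in for trained trees.
toy = [lambda x: "a", lambda x: "b", lambda x: "a", lambda x: "a", lambda x: "b"]
sub = prune_ordered(toy, order=[0, 2, 3, 1, 4], keep=5)
label, n_queried = predict_with_early_stopping(sub, x=None)
```

In this toy run the first three polled classifiers all vote for the same class, so the two remaining ones are never queried. The same idea applied to a pruned subensemble of 21 trees would stop after at most 21 queries and, when the votes are lopsided, after as few as 11.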
Google Scholar: Soto Martínez, Víctor; Martínez Muñoz, Gonzalo; Hernández Lobato, Daniel; Suárez González, Alberto