A Fisher consistent multiclass loss function with variable margin on positive examples
Entity
UAM. Departamento de Ingeniería Informática
Publisher
Institute of Mathematical Statistics
Date
2015-08-19
Citation
Electronic Journal of Statistics 9.2 (2015): 2255-2292
ISSN
1935-7524
DOI
10.1214/15-EJS1073
Funded by
The authors acknowledge the referees' comments and suggestions that helped to improve the manuscript. This research is based upon work supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via the Federal Bureau of Investigation, Finance Division. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. I.R.-L. acknowledges partial support by Spain's grants TIN2013-42351-P (MINECO) and S2013/ICE-2845 CASI-CAM-CM (Comunidad de Madrid). The authors gratefully acknowledge the use of the facilities of the Centro de Computación Científica (CCC) at Universidad Autónoma de Madrid.
Gobierno de España. TIN2013-42351-P; Comunidad de Madrid. S2013/ICE-2845/CASI-CAM
Editor's Version
http://dx.doi.org/10.1214/15-EJS1073
Subjects
Bayes consistency; Classification calibration; Fisher consistency; Hinge loss functions; Multiclass classification; Support vector machine; Informática
Abstract
The concept of pointwise Fisher consistency (or classification calibration) gives necessary and sufficient conditions for Bayes consistency when a classifier minimizes a surrogate loss function instead of the 0-1 loss. We present a family of multiclass hinge loss functions defined by a continuous control parameter λ representing the margin of the positive points of a given class. The parameter λ allows shifting from classification-uncalibrated to classification-calibrated loss functions. Although previous results suggest that increasing the margin of positive points benefits the classification model, other approaches have failed to give increasing weight to the positive examples without losing the classification calibration property. Our λ-based loss function can give unlimited weight to the positive examples without breaking classification calibration. Moreover, when these loss functions are embedded into the Support Vector Machine framework (λ-SVM), the parameter λ defines different regions for the Karush-Kuhn-Tucker conditions. A large margin on positive points also speeds up convergence of the Sequential Minimal Optimization algorithm, leading to lower training times than other classification-calibrated methods. λ-SVM is easy to implement, and its practical use on different datasets not only supports our theoretical analysis but also yields good classification performance and fast training times.
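The record reproduces only the abstract, so the exact λ-parameterized loss is not shown here. As a purely illustrative sketch of the idea (the function name and the one-vs-all form below are assumptions, not the authors' definition), a multiclass hinge loss in which λ sets the margin required of the true-class score might look like:

```python
from typing import Sequence

def lambda_multiclass_hinge(scores: Sequence[float], y: int, lam: float = 1.0) -> float:
    """Illustrative multiclass hinge loss with a variable margin `lam`
    on the positive (true) class.

    Hypothetical sketch of the idea in the abstract, NOT the exact loss
    defined in the paper: `lam` is the margin the true-class score must
    clear, while wrong-class scores keep a fixed unit margin.
    """
    # Penalty on the positive example: grows with `lam`, so a larger
    # positive margin weights the positive example more heavily.
    pos = max(0.0, lam - scores[y])
    # One-vs-all style unit-margin hinge penalty on every other class.
    neg = sum(max(0.0, 1.0 + s) for k, s in enumerate(scores) if k != y)
    return pos + neg

# Example with 3 classes, true class 0: raising `lam` from 1 to 2
# increases the demanded positive margin and hence the loss.
print(lambda_multiclass_hinge([0.8, -1.2, -0.5], y=0, lam=1.0))  # 0.2 + 0.5 = 0.7
print(lambda_multiclass_hinge([0.8, -1.2, -0.5], y=0, lam=2.0))  # 1.2 + 0.5 = 1.7
```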
Authors
Rodríguez-Luján, Irene; Huerta, Ramón