dc.contributor.author | González Domínguez, Javier | |
dc.contributor.author | López Moreno, Ignacio | |
dc.contributor.author | Franco-Pedroso, Javier | |
dc.contributor.author | Ramos Castro, Daniel | |
dc.contributor.author | Toledano, Doroteo T. | |
dc.contributor.author | González Rodríguez, Joaquín | |
dc.contributor.other | UAM. Departamento de Tecnología Electrónica y de las Comunicaciones | es_ES |
dc.date.accessioned | 2015-05-07T11:23:13Z | |
dc.date.available | 2015-05-07T11:23:13Z | |
dc.date.issued | 2010-12-01 | |
dc.identifier.citation | IEEE Journal of Selected Topics in Signal Processing 4.6 (2010): 1084 – 1093 | en_US |
dc.identifier.issn | 1932-4553 (print) | en_US |
dc.identifier.issn | 1941-0484 (online) | en_US |
dc.identifier.uri | http://hdl.handle.net/10486/666039 | |
dc.description | Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. J. Gonzalez-Dominguez, I. Lopez-Moreno, J. Franco-Pedroso, D. Ramos, D. T. Toledano, and J. Gonzalez-Rodriguez, "Multilevel and Session Variability Compensated Language Recognition: ATVS-UAM Systems at NIST LRE 2009" IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 6, pp. 1084 – 1093, December 2010 | en_US |
dc.description.abstract | This work presents the systems submitted by the
ATVS Biometric Recognition Group to the 2009 Language Recognition
Evaluation (LRE’09), organized by NIST. New challenges
included in this LRE edition can be summarized by three main
differences with respect to past evaluations. Firstly, the number
of languages to be recognized expanded to 23 languages from 14
in 2007, and 7 in 2005. Secondly, the data variability has been
increased by including telephone speech excerpts extracted from
Voice of America (VOA) radio broadcasts over the Internet in
addition to Conversational Telephone Speech (CTS). The third
difference was the volume of data: this evaluation involved
up to 2 terabytes of speech data for development, an
order of magnitude more than in past evaluations. LRE’09 thus
required participants to develop robust systems able not only to
successfully face the session variability problem but also to do
it with reasonable computational resources. ATVS participation
consisted of state-of-the-art acoustic and high-level systems focussing
on these issues. Furthermore, the problem of finding a
proper combination and calibration of the information obtained
at different levels of the speech signal was widely explored in this
submission. In this work, two original contributions were developed.
The first contribution was applying a session variability
compensation scheme based on Factor Analysis (FA) within the
statistics domain to an SVM-supervector (SVM-SV) approach.
The second contribution was the employment of a novel backend
based on anchor models in order to fuse individual systems
prior to one-vs-all calibration via logistic regression. Results both
in development and evaluation corpora show the robustness and
excellent performance of the submitted systems, exemplified by
our system ranking 2nd in the 30-second open-set condition while
using remarkably limited computational resources. | en_US |
dc.description.sponsorship | This work has been supported by the Spanish Ministry of Education under project TEC2006-13170-C02-01. Javier
Gonzalez-Dominguez also thanks the Spanish Ministry of Education for supporting his doctoral research under project
TEC2006-13141-C03-03. Special thanks are given to Dr. David Van Leeuwen from TNO Human Factors (Utrecht, The
Netherlands) for his strong collaboration, valuable discussions and ideas. The authors also thank Dr. Patrick Lucey for his
final support with the (non-target) Australian English review of the manuscript. | en_US |
dc.format.extent | 11 pages | en |
dc.format.mimetype | application/pdf | en |
dc.language.iso | eng | en |
dc.publisher | IEEE | en_US |
dc.relation.ispartof | IEEE Journal of Selected Topics in Signal Processing | en_US |
dc.rights | © 2010 IEEE | en_US |
dc.subject.other | Anchor models | en_US |
dc.subject.other | Calibration | en_US |
dc.subject.other | Factor analysis (FA) | en_US |
dc.subject.other | Language recognition | en_US |
dc.subject.other | Linear scoring | en_US |
dc.subject.other | Sufficient statistics | en_US |
dc.title | Multilevel and session variability compensated language recognition: ATVS-UAM systems at NIST LRE 2009 | en_US |
dc.type | article | en_US |
dc.subject.eciencia | Telecomunicaciones | es_ES |
dc.relation.publisherversion | http://dx.doi.org/10.1109/JSTSP.2010.2076071 | |
dc.identifier.doi | 10.1109/JSTSP.2010.2076071 | |
dc.identifier.publicationfirstpage | 1084 | |
dc.identifier.publicationissue | 6 | |
dc.identifier.publicationlastpage | 1093 | |
dc.identifier.publicationvolume | 4 | |
dc.type.version | info:eu-repo/semantics/acceptedVersion | en |
dc.contributor.group | Análisis y Tratamiento de Voz y Señales Biométricas (ING EPS-002) | es_ES |
dc.rights.accessRights | openAccess | en |
dc.authorUAM | González Domínguez, Javier (261826) | |
dc.facultadUAM | Escuela Politécnica Superior | |