Formant trajectories in linguistic units for text-independent speaker recognition

Franco-Pedroso, Javier; Espinoza Cuadros, Fernando Manuel; González Rodríguez, Joaquín

UAM_Biblioteca

Author

Franco-Pedroso, Javier; Espinoza Cuadros, Fernando Manuel; González Rodríguez, Joaquín

Entity

UAM. Departamento de Tecnología Electrónica y de las Comunicaciones

Publisher

IEEE

Date

2013

Citation

2013 International Conference on Biometrics (ICB). IEEE, 2013, 1-6

ISBN

978-1-4799-0310-8

DOI

10.1109/ICB.2013.6613001

Funded by

Supported by MEC grant PR-2010-123, MICINN project TEC09-14179, ForBayes project CCG10-UAM/TIC-5792 and Cátedra UAM-Telefónica.

Editor's Version

http://dx.doi.org/10.1109/ICB.2013.6613001

Subjects

Feature extraction; Natural language processing; Speaker recognition; Vectors; Telecommunications

URI

http://hdl.handle.net/10486/664796

Note

Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. J. Franco-Pedroso, F. Espinoza-Cuadros, J. González-Rodríguez, "Formant trajectories in linguistic units for text-independent speaker recognition" in International Conference on Biometrics (ICB), Madrid (Spain), 2013, 1-6

Abstract

Inspired by successful work in forensic speaker identification, this work presents a higher-level system for text-independent speaker recognition based on the temporal trajectories of formant frequencies in linguistic units. Extracting features from unit-dependent trajectories yields a very flexible system that can be applied in different scenarios. At a fine-grained level, it can provide a calibrated likelihood ratio for each linguistic unit under analysis (extremely useful in applications such as forensics); at a coarse-grained level, the individual contributions of different units can be combined to obtain a single, more discriminative system with high potential for combination with short-term spectral systems. With development data drawn from the NIST SRE 2004 and 2005 datasets, the approach was tested on the NIST SRE 2006 1side-1side task, English-only male trials, comprising 9,720 trials from 219 speakers. Remarkable results were obtained for some single units from extremely short segments of speech, and the combination of several units leads to a relative improvement of 17.2% in EER when fused with an i-vector system.
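
The abstract reports its headline result as a relative improvement in equal error rate (EER). As a rough illustration of how such a figure is usually computed, the sketch below estimates the EER from two lists of trial scores and then the relative improvement between two systems. The function names, the threshold sweep, and the toy scores are assumptions made for this example only; they are not taken from the paper and do not reproduce its data or evaluation code.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: fraction of target trials scoring below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials at or above the threshold.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(miss - false_alarm))
    return (miss[idx] + false_alarm[idx]) / 2.0


def relative_improvement(baseline_eer, fused_eer):
    """Relative EER reduction, in percent, of a fused system over a baseline."""
    return 100.0 * (baseline_eer - fused_eer) / baseline_eer


if __name__ == "__main__":
    # Toy scores, invented for this example only.
    toy_target = [1.0, 3.0]
    toy_nontarget = [0.0, 2.0]
    print(equal_error_rate(toy_target, toy_nontarget))
    # Hypothetical EER values (in percent) for a baseline and a fused system.
    print(relative_improvement(5.0, 4.0))
```

A threshold sweep over the observed scores is the simplest choice; evaluations on large trial lists often interpolate between operating points instead, which this sketch does not attempt.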

Files in this item

Name

formant_franco-pedroso_ICB_2013_ps.pdf

Size

494.8Kb

Format

PDF

This item appears in the following Collection(s)

UAM open-access scientific output [20370]
