Speaker recognition using temporal contours in linguistic units: the case of formant and formant-bandwidth trajectories

dc.contributor.author González-Rodríguez, Joaquín
dc.contributor.other UAM. Departamento de Tecnología Electrónica y de las Comunicaciones es_ES
dc.date.accessioned 2015-01-29T17:57:50Z
dc.date.available 2015-01-29T17:57:50Z
dc.date.issued 2011-08
dc.identifier.citation 12th Annual Conference of the International Speech Communication Association. Ed. by Piero Cosi, Renato De Mori, Giuseppe Di Fabbrizio, and Roberto Pieraccini. August 27-31, 2011 en_US
dc.identifier.issn 2308-457X
dc.identifier.uri http://hdl.handle.net/10486/663470
dc.description Proceedings of Interspeech 2011, Florence (Italy) en_US
dc.description.abstract We describe a new approach to automatic speaker recognition based on explicit modeling of temporal contours in linguistic units (TCLU). Inspired by successful work in forensic speaker identification, we extend the approach to design a fully automatic system with high potential for combination with spectral systems. Using SRI's Decipher phone, word and syllabic labels, we have tested up to 468 unit-based subsystems from 6 groups of lexically-determined units, namely phones, diphones, triphones, center phone in triphones, syllables and words, with subsystems combined at the score level. Evaluated on NIST SRE04 English-only 1s1s, their hierarchical fusion gives an EER of 4.20% (minDCF=0.018) from automatic formant tracking of conversational telephone speech. The approach combines extremely well with a Joint Factor Analysis system (JFA EER from 4.25% to 2.47%, minDCF from 0.020 to 0.012), and extensions such as more robust prosodic or spectral features are likely to improve it further. en_US
dc.description.sponsorship This work has been supported by MEC research stay grant PR-2010-123, MICINN project TEC09-14179, ForBayes project CCG10-UAM/TIC-5792 and Catedra UAM-Telefonica. en_US
dc.format.extent 4 pages es_ES
dc.format.mimetype application/pdf en
dc.language.iso eng en
dc.publisher International Speech Communication Association en_US
dc.relation.ispartof Interspeech en_US
dc.rights © 2011 ISCA en_US
dc.subject.other speaker recognition en_US
dc.subject.other linguistic units en_US
dc.subject.other temporal trajectories en_US
dc.subject.other formants en_US
dc.subject.other bandwidths en_US
dc.title Speaker recognition using temporal contours in linguistic units: the case of formant and formant-bandwidth trajectories en_US
dc.type conferenceObject en
dc.subject.eciencia Informática es_ES
dc.subject.eciencia Telecomunicaciones es_ES
dc.relation.publisherversion http://www.isca-speech.org/archive/interspeech_2011/i11_0133.html
dc.identifier.publicationfirstpage 133
dc.identifier.publicationlastpage 136
dc.relation.eventdate August 27-31, 2011 en_US
dc.relation.eventnumber 12
dc.relation.eventplace Florence (Italy) en_US
dc.relation.eventtitle 12th Annual Conference of the International Speech Communication Association (Interspeech 2011) en_US
dc.type.version info:eu-repo/semantics/publishedVersion en
dc.contributor.group Análisis y Tratamiento de Voz y Señales Biométricas (ING EPS-002) es_ES
dc.rights.accessRights openAccess en