dc.contributor.author | Liu, Chao | |
dc.contributor.author | Wang, Dong | |
dc.contributor.author | Tejedor Noguerales, Javier | |
dc.contributor.other | UAM. Departamento de Tecnología Electrónica y de las Comunicaciones | es_ES |
dc.date.accessioned | 2015-05-28T10:17:13Z | |
dc.date.available | 2015-05-28T10:17:13Z | |
dc.date.issued | 2012 | |
dc.identifier.citation | INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association, ISCA, 2012. 2093-2096 | en_US |
dc.identifier.issn | 1990-9772 | |
dc.identifier.uri | http://hdl.handle.net/10486/666456 | |
dc.description.abstract | An efficient indexing scheme is essentially important for spoken term detection (STD) on large databases, particularly for phone-based systems that have been widely adopted to achieve vocabulary-independent detection. While the finite state transducer (FST) composition provides a standard indexing approach, the n-gram reverse indexing is more flexible in connectivity representation and confidence measuring and therefore may result in better performance than searching within the original lattices or the equivalent FSTs. In this paper we present an n-gram FST indexing approach which combines the flexibility of n-gram indexing and the efficiency of FST indexing. Specifically, we employ the n-gram indexing to relax connectivity in original lattices and then formalize the indices into an FST for online search. We demonstrate this approach with a phone-based STD task where the lattice is sparse due to strong language models. The results show that n-gram FST indexing provides not only better detection performance than lattice search, but also a faster detection than both conventional n-gram and FST indexing. Index Terms: spoken term indexing, finite state transducer, spoken term detection, speech recognition | en_US |
dc.format.extent | 4 pages | en_US |
dc.format.mimetype | application/pdf | en |
dc.language.iso | eng | en |
dc.publisher | International Speech Communication Association | en_US |
dc.relation.ispartof | Interspeech | en_US |
dc.rights | © 2012 ISCA | en_US |
dc.subject.other | Spoken term indexing | en_US |
dc.subject.other | Finite state transducer | en_US |
dc.subject.other | Spoken term detection | en_US |
dc.subject.other | Speech recognition | en_US |
dc.title | N-gram FST indexing for spoken term detection | en_US |
dc.type | conferenceObject | en |
dc.subject.eciencia | Computer Science | en_US |
dc.subject.eciencia | Telecommunications | en_US |
dc.relation.publisherversion | http://www.isca-speech.org/archive/interspeech_2012/i12_2093.html | |
dc.identifier.publicationfirstpage | 2093 | |
dc.identifier.publicationlastpage | 2096 | |
dc.relation.eventdate | September 9-13, 2012 | en_US |
dc.relation.eventnumber | 13 | |
dc.relation.eventplace | Portland (United States) | en_US |
dc.relation.eventtitle | 13th Annual Conference of the International Speech Communication Association, INTERSPEECH 2012 | en_US |
dc.type.version | info:eu-repo/semantics/publishedVersion | en |
dc.contributor.group | Laboratorio de Tecnología Hombre-Computador (ING EPS-010) | es_ES |
dc.rights.accessRights | openAccess | en |
dc.authorUAM | Tejedor Noguerales, Javier (261273) | |
dc.facultadUAM | Escuela Politécnica Superior | |