dc.contributor.author: Cebrián Ramos, Manuel
dc.contributor.author: Alfonseca, Manuel
dc.contributor.author: Ortega de la Puente, Alfonso
dc.contributor.other: UAM. Departamento de Ingeniería Informática [es_ES]
dc.date.accessioned: 2015-01-14T19:14:18Z
dc.date.available: 2015-01-14T19:14:18Z
dc.date.issued: 2005
dc.identifier.citation: Communications in Information and Systems 5.4 (2005): 367-384 [en_US]
dc.identifier.issn: 1526-7555 (print) [en_US]
dc.identifier.issn: 2163-4548 (online) [en_US]
dc.identifier.uri: http://hdl.handle.net/10486/663140
dc.description.abstract: Using the mathematical background for algorithmic complexity developed by Kolmogorov in the 1960s, Cilibrasi and Vitányi have designed a similarity distance, the normalized compression distance, applicable to the clustering of objects of any kind, such as music, texts or gene sequences. The normalized compression distance is a quasi-universal normalized admissible distance under certain conditions. This paper shows that the compressors used to compute the normalized compression distance are not idempotent in some cases: their behavior is strongly skewed by the size of the objects and the window size, which causes a deviation in the identity property of the distance unless care is taken that the objects to be compressed fit the windows. The relationship between the precision of the distance and the size of the objects has been analyzed for several well-known compressors, and in particular depth for three cases, bzip2, gzip and PPMZ, which are examples of the three main types of compressors: block-sorting, Lempel-Ziv, and statistical. [en_US]
dc.description.sponsorship: This work was partially supported by grant TSI 2005-08255-C07-06 of the Spanish Ministry of Education and Science. [en_US]
dc.format.extent: 18 pages [es_ES]
dc.format.mimetype: application/pdf [en]
dc.language.iso: eng [en]
dc.publisher: International Press of Boston [en_US]
dc.relation.ispartof: Communications in Information and Systems [en_US]
dc.rights: © International Press 2005 [en_US]
dc.title: Common Pitfalls Using the Normalized Compression Distance: What to Watch Out for in a Compressor [en_US]
dc.type: article [en_US]
dc.subject.eciencia: Computer Science [es_ES]
dc.relation.publisherversion: http://dx.doi.org/10.4310/CIS.2005.v5.n4.a1
dc.identifier.doi: 10.4310/CIS.2005.v5.n4.a1
dc.identifier.publicationfirstpage: 367
dc.identifier.publicationissue: 4
dc.identifier.publicationlastpage: 384
dc.identifier.publicationvolume: 5
dc.type.version: info:eu-repo/semantics/publishedVersion [en]
dc.contributor.group: Herramientas Interactivas Avanzadas (ING EPS-003) [es_ES]
dc.rights.accessRights: openAccess [en]
dc.authorUAM: Alfonseca Moreno, Manuel (258923)
dc.facultadUAM: Escuela Politécnica Superior
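
The pitfall described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions. The formula NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)) is the usual Cilibrasi and Vitányi definition and is not spelled out in this record. Python's zlib module stands in for gzip, since both use DEFLATE with a window of at most 32 KiB. The helper names `ncd` and `random_bytes` are hypothetical. The sketch compares an object with itself: while the object fits the window, the distance should stay near 0, and once the object is much larger than the window the compressor can no longer see the repetition, so the distance drifts toward 1.

```python
import random
import zlib


def ncd(x: bytes, y: bytes, compress=zlib.compress) -> float:
    """Normalized compression distance under the given compressor.

    `compress` is any function mapping bytes to compressed bytes;
    zlib.compress is used here as a stand-in for gzip (assumption).
    """
    cx = len(compress(x))
    cy = len(compress(y))
    cxy = len(compress(x + y))  # compressed size of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_bytes(n: int, seed: int = 0) -> bytes:
    """Reproducible pseudo-random (hence nearly incompressible) test data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


if __name__ == "__main__":
    small = random_bytes(1_000)    # far below the 32 KiB DEFLATE window
    large = random_bytes(200_000)  # far above the 32 KiB DEFLATE window

    # Identity property: NCD(x, x) should be close to 0 for an ideal compressor.
    print("NCD(small, small) =", round(ncd(small, small), 3))
    print("NCD(large, large) =", round(ncd(large, large), 3))
```

Passing a different compressor, for example `ncd(x, x, compress=bz2.compress)` after `import bz2`, allows the same check to be repeated for a block-sorting compressor; the object size at which the identity property degrades would then depend on that compressor's block size rather than on a sliding window.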

