dc.contributor.advisor: Camacho, David
dc.contributor.author: Muñoz Hidalgo, Samuel
dc.contributor.other: UAM. Departamento de Ingeniería Informática [es_ES]
dc.date.accessioned: 2014-11-14T15:00:40Z
dc.date.available: 2014-11-14T15:00:40Z
dc.date.issued: 2014
dc.identifier.uri: http://hdl.handle.net/10486/662544
dc.description: Master’s Degree in Research and Innovation in Information and Communications Technologies [en_US]
dc.description.abstract: In computer science, taxonomies are widely used in fields such as Artificial Intelligence, Information Retrieval, Natural Language Processing and Machine Learning. These concept classifications provide knowledge structures that guide algorithms in the task of finding an acceptable to nearly optimal solution to non-deterministic problems. The main problem with taxonomies is the huge amount of effort required to build one. Traditionally, this is done by hand and involves a team of experts to ensure the quality of the result. Although this is evidently the way to get the best possible taxonomy (knowledge is an exclusive quality of humans), the manpower involved means it appears to be neither the fastest nor the cheapest. This thesis presents an extensive review of the state of the art in taxonomy induction techniques and ontology evaluation methods. It argues the need for a fast, automatic, arbitrary-domain taxonomy generation method and justifies the choice of the Wikipedia encyclopedia as the dataset. A framework for working with taxonomies is proposed and implemented. In the experiments chapter, two statements are successfully refuted: that the Wikipedia categorization system forms a directed acyclic graph, and that the longest path between two nodes is equivalent to the taxonomic organization. Finally, the framework is used to explore three arbitrary domains. [en_US]
dc.format.extent: 66 pages [es_ES]
dc.format.mimetype: application/pdf [en]
dc.language.iso: eng [en]
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.other: Biology - Classification (Biología - Clasificación) [es_ES]
dc.subject.other: Ontology (Ontología) [es_ES]
dc.subject.other: Knowledge management (Gestión del conocimiento) [es_ES]
dc.title: Hierarchical text clustering applied to taxonomy evaluation [en_US]
dc.type: masterThesis [en]
dc.subject.eciencia: Computer Science (Informática) [es_ES]
dc.rights.cc: Attribution – NonCommercial – NoDerivatives (Reconocimiento – NoComercial – SinObraDerivada) [es_ES]
dc.rights.accessRights: openAccess [en]
dc.facultadUAM: Escuela Politécnica Superior


Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/