Hierarchical text clustering applied to taxonomy evaluation

Muñoz Hidalgo, Samuel

UAM_Biblioteca

dc.contributor.advisor	Camacho, David
dc.contributor.author	Muñoz Hidalgo, Samuel
dc.contributor.other	UAM. Departamento de Ingeniería Informática	es_ES
dc.date.accessioned	2014-11-14T15:00:40Z
dc.date.available	2014-11-14T15:00:40Z
dc.date.issued	2014
dc.identifier.uri	http://hdl.handle.net/10486/662544
dc.description	Master’s Degree in Research and Innovation Information and Communications Technologies	en_US
dc.description.abstract	In computer science, the use of taxonomies is widely embraced in fields such as Artificial Intelligence, Information Retrieval, Natural Language Processing and Machine Learning. These concept classifications provide knowledge structures that guide algorithms in the task of finding an acceptable-to-nearly-optimal solution to non-deterministic problems. The main problem with taxonomies is the huge amount of effort required to build one. Traditionally, this is done by hand and involves a team of experts to assure the quality of the result. Although this is evidently the way to obtain the best possible taxonomy (knowledge is an exclusive quality of humans), the manpower it requires makes it neither the fastest nor the cheapest approach. This thesis presents an extensive review of the state of the art in taxonomy induction techniques as well as ontology evaluation methods. It argues for the need for a fast, automatic, arbitrary-domain taxonomy generation method and justifies the choice of the Wikipedia encyclopedia as the dataset. A framework for working with taxonomies is proposed and implemented. In the experiments chapter, two statements are successfully refuted: that the Wikipedia categorization system forms a directed acyclic graph, and that the longest path between two nodes is equivalent to the taxonomic organization. Finally, the framework is used to explore three arbitrary domains.	en_US
dc.format.extent	66 pág.	es_ES
dc.format.mimetype	application/pdf	en
dc.language.iso	eng	en
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.other	Biología - Clasificación	es_ES
dc.subject.other	Ontología	es_ES
dc.subject.other	Gestión del conocimiento	es_ES
dc.title	Hierarchical text clustering applied to taxonomy evaluation	en_US
dc.type	masterThesis	en
dc.subject.eciencia	Informática	es_ES
dc.rights.cc	Reconocimiento – NoComercial – SinObraDerivada	es_ES
dc.rights.accessRights	openAccess	en
dc.facultadUAM	Escuela Politécnica Superior

Files in this item

Name: Muñoz_Hidalgo_Samuel_tfm.pdf
Size: 8.941Mb
Format: PDF

This item appears in the following Collection(s)

Trabajos de estudiantes (tesis doctorales, TFMs, TFGs, etc.) [19966]

Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/

Related items

Evaluation of negentropy-based cluster validation techniques in problems with increasing dimensionality

An approach to the visualization of adaptive hypermedia structures and other small-world networks based on hierarchically Clustered Graphs

Alberta Stroke Program Early CT Score applied to CT angiography source images is a strong predictor of futile recanalization in acute ischemic stroke
