WikiSense: Supersense tagging of Wikipedia named entities based WordNet

Joseph Chang, Richard Tzong Han Tsai, Jason S. Chang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

8 Citations (Scopus)

Abstract

In this paper, we introduce a minimally supervised method for learning to classify named-entity titles in a given encyclopedia into broad semantic categories in an existing ontology. Our main idea is to use entries that appear in both the encyclopedia and the ontology, together with a small set of 30 hand-tagged parenthetic explanations, to automatically generate the training data. The proposed method involves automatically recognizing whether a title is a named entity, automatically generating two sets of training data, and automatically training a classification model based on textual and non-textual features. We present WikiSense, an implementation of the proposed method for extending the named-entity coverage of WordNet by sense tagging Wikipedia titles. Experimental results show that WikiSense achieves accuracy of over 95% and applicability of nearly 80% for all NE titles in Wikipedia. WikiSense cleanly produces over 1.2 million NEs tagged with broad categories based on the lexicographers' files of WordNet, effectively extending WordNet to form a very large-scale set of semantic categories, a potentially useful resource for many natural language tasks.

Original language: English
Title of host publication: PACLIC 23 - Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation
Publisher: City University of Hong Kong Press
Pages: 72-81
Number of pages: 10
ISBN (Print): 9789624423198
Publication status: Published - 2009
Event: 23rd Pacific Asia Conference on Language, Information and Computation, PACLIC 23 - Hong Kong, China
Duration: 3 Dec 2009 to 5 Dec 2009

Publication series

Name: PACLIC 23 - Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation
1

Conference

Conference: 23rd Pacific Asia Conference on Language, Information and Computation, PACLIC 23
Country/Territory: China
City: Hong Kong
Period: 3/12/09 to 5/12/09
