Semi-supervised Sequence Labeling for Named Entity Extraction based on Tri-Training: Case Study on Chinese Person Name Extraction

Chien Lung Chou, Chia Hui Chang, Shin Yi Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-reviewed

2 Citations (Scopus)

Abstract

Named entity extraction is a fundamental task for many knowledge engineering applications. Existing studies rely on annotated training data, which is quite expensive when used to obtain large data sets, limiting the effectiveness of recognition. In this research, we propose an automatic labeling procedure to prepare training data from structured resources which contain known named entities. While this automatically labeled training data may contain noise, a self-testing procedure may be used as a follow-up to remove low-confidence annotation and increase the extraction performance with less training data. In addition to the preparation of labeled training data, we also employed semi-supervised learning to utilize large unlabeled training data. By modifying tri-training for sequence labeling and deriving the proper initialization, we can further improve entity extraction. In the task of Chinese personal name extraction with 364,685 sentences (8,672 news articles) and 54,449 (11,856 distinct) person names, an F-measure of 90.4% can be achieved.
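The automatic labeling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple dictionary-matching scheme with character-level B/I/O tags; the function name `auto_label`, the tag set, and the longest-match-first rule are illustrative assumptions, not details taken from the paper.

```python
def auto_label(sentence, known_names):
    """Tag each character of `sentence` as B/I (part of a known name) or O.

    Hypothetical helper: the tagging scheme and matching rule are assumptions.
    """
    tags = ["O"] * len(sentence)
    # Try longer names first so a long match is not split by a shorter one.
    for name in sorted(known_names, key=len, reverse=True):
        if not name:
            continue  # skip empty strings, which would match everywhere
        start = sentence.find(name)
        while start != -1:
            end = start + len(name)
            # Only label spans that are still entirely unlabeled.
            if all(tag == "O" for tag in tags[start:end]):
                tags[start] = "B"
                for i in range(start + 1, end):
                    tags[i] = "I"
            start = sentence.find(name, start + 1)
    return tags


# Example: two known person names in one sentence.
example_tags = auto_label("王小明今天見到李四", {"王小明", "李四"})
```

Labels produced this way can be noisy, for instance when a known name string also occurs as an ordinary word, which is the kind of noise the self-testing procedure in the abstract is meant to filter out.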

Original language: English
Title of host publication: SWAIE 2014 - 3rd Workshop on SemanticWeb and Information Extraction, Proceedings of the Workshop
Editors: Diana Maynard, Marieke van Erp, Brian Davis
Publisher: Association for Computational Linguistics (ACL)
Pages: 33-40
Number of pages: 8
ISBN (electronic): 9781873769485
Publication status: Published - 2014
Event: 3rd Workshop on SemanticWeb and Information Extraction, SWAIE 2014 - Dublin, Ireland
Duration: 24 Aug 2014 → …

Publication series

Name: SWAIE 2014 - 3rd Workshop on SemanticWeb and Information Extraction, Proceedings of the Workshop

Conference

Conference: 3rd Workshop on SemanticWeb and Information Extraction, SWAIE 2014
Country/Territory: Ireland
City: Dublin
Period: 24/08/14 → …
