Mining features for Web NER model construction based on distant learning

Chien Lung Chou, Chia Hui Chang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

In this paper, we study the problem of developing a WIDM NER tool that prepares a training corpus from the Web for custom named entity recognition (NER) models via distant learning. We consider two major issues: efficient automatic labelling, and effective feature mining for training accurate NER models with sequence labelling techniques. While the idea of collecting training sentences from search snippets via known entities (seeds) is not new, efficient automatic labelling becomes an issue when the number of seeds (e.g. 500K) and sentences (e.g. 2M) is large. The second issue concerns the mining of interesting terms or k-grams as features for supervised learning. We conduct experiments on four types of entity recognition, namely Chinese person name, food name, location name, and point of interest (POI), to demonstrate the improvement in efficiency and effectiveness achieved with the proposed Web NER model construction tool.

Original language: English
Title of host publication: Proceedings of the 2017 International Conference on Asian Language Processing, IALP 2017
Editors: Rong Tong, Yue Zhang, Yanfeng Lu, Minghui Dong
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 322-325
Number of pages: 4
ISBN (Electronic): 9781538619803
Publication status: Published - 2 Jul 2017
Event: 21st International Conference on Asian Language Processing, IALP 2017 - Singapore, Singapore
Duration: 5 Dec 2017 to 7 Dec 2017

Publication series

Name: Proceedings of the 2017 International Conference on Asian Language Processing, IALP 2017
2018-January

Conference

Conference: 21st International Conference on Asian Language Processing, IALP 2017
Country/Territory: Singapore
City: Singapore
Period: 5/12/17 to 7/12/17
