Mencius: A Chinese named entity recognizer using hybrid model

Tzong Han Tsai, Shih Hung Wu, Wen Lian Hsu

Research output: Contribution to conference › Conference paper › Peer-reviewed

Abstract

This paper presents Mencius, a maximum entropy (ME) based Chinese named entity recognizer (NER). It addresses Chinese NER problems by combining the advantages of rule-based and machine learning (ML) based NER systems. Rule-based NER systems can explicitly encode human understanding and are easy to tune, while ML-based systems are robust, portable and inexpensive to develop. Our hybrid system incorporates InfoMap [1], a rule-based knowledge representation and template-matching tool, into an ME framework. Named entities are represented in InfoMap as templates, which serve as ME features in Mencius. These features are edited manually, and their weights are estimated by the ME framework from the training data. To avoid errors caused by word segmentation, we model NER as a character-based tagging problem. In our experiments, Mencius outperforms both pure rule-based and pure ME-based NER systems. The F-measures for person names (PER), location names (LOC) and organization names (ORG) are 92.4%, 73.7% and 75.3%, respectively.
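The idea of using template matches as ME features in a character-based tagger can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the tag set (`B-PER`, `I-PER`, `O`), the regular expression standing in for an InfoMap template, the feature names, and the hand-set weights (which an ME framework would instead estimate from training data).

```python
import math
import re

# Hypothetical tag set; the abstract does not say which tags Mencius uses.
TAGS = ["B-PER", "I-PER", "O"]

# Hypothetical stand-in for an InfoMap template: one of a few common
# surnames followed by one or two more Chinese characters.
SURNAME_TEMPLATE = re.compile(r"[王李張陳][\u4e00-\u9fff]{1,2}")


def template_features(text, i):
    """Return the binary features that fire for the character at index i."""
    feats = {"bias"}
    for m in SURNAME_TEMPLATE.finditer(text):
        if m.start() == i:
            feats.add("per_template_start")
        elif m.start() < i < m.end():
            feats.add("per_template_inside")
    return feats


# Illustrative weights only; in an ME framework these would be estimated
# from training data rather than set by hand.
WEIGHTS = {
    ("bias", "O"): 1.0,
    ("per_template_start", "B-PER"): 3.0,
    ("per_template_inside", "I-PER"): 3.0,
}


def tag_distribution(feats):
    """Maximum entropy (softmax) distribution p(tag | active features)."""
    scores = {t: sum(WEIGHTS.get((f, t), 0.0) for f in feats) for t in TAGS}
    z = sum(math.exp(s) for s in scores.values())
    return {t: math.exp(s) / z for t, s in scores.items()}


def tag_characters(text):
    """Tag each character independently with its most probable tag."""
    tags = []
    for i in range(len(text)):
        dist = tag_distribution(template_features(text, i))
        tags.append(max(dist, key=dist.get))
    return tags


if __name__ == "__main__":
    sentence = "王小明去新竹"
    for ch, tag in zip(sentence, tag_characters(sentence)):
        print(ch, tag)
```

Because tags are assigned per character, no word segmentation step is needed before tagging. The sketch picks each tag independently; a full system would typically also decode the tag sequence jointly so that, for example, `I-PER` cannot follow `O`.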

Original language: English
Publication status: Published - 2003
Event: 15th Conference on Computational Linguistics and Speech Processing, ROCLING 2003 - Hsinchu, Taiwan
Duration: 1 Sep 2003 → …

Conference: 15th Conference on Computational Linguistics and Speech Processing, ROCLING 2003
Country/Region: Taiwan
City: Hsinchu
Period: 1 Sep 2003 → …
