Description of the NCU Chinese word segmentation and part-of-speech tagging for SIGHAN Bakeoff 2007

Yu Chieh Wu, Jie Chi Yang, Yue Shi Lee

Research output: Contribution to conference, Paper, peer-reviewed

2 Citations (Scopus)

Abstract

In Chinese, most language processing starts with word segmentation (WS) and part-of-speech (POS) tagging. These two steps tokenize a sequence of characters into words and predict a syntactic label for each segmented word. In this paper, we present two distinct sequential tagging models for these two tasks. The first, the word segmentation model, is largely similar to previous work: it uses conditional random fields (CRF) and a set of predefined dictionaries to recognize word boundaries. The second is a revised support vector machine-based chunking model that assigns POS tags in the tagging task. Our method achieves a moderate rank among all participants in the WS task, while in the POS tagging task it reaches very competitive results.
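The abstract frames word segmentation as a sequential tagging problem. As a minimal sketch of what that framing means, the Python below encodes a segmented sentence as per-character boundary tags and decodes the tags back into words. The two-tag B/I scheme, the function names, and the example sentence are illustrative assumptions rather than details taken from the paper, and the CRF model that would predict the tags is not shown.

```python
from typing import List, Tuple


def words_to_tags(words: List[str]) -> List[Tuple[str, str]]:
    """Encode a segmented sentence as (character, tag) pairs.

    Assumed scheme: "B" marks the first character of a word,
    "I" marks every later character of the same word.
    """
    pairs = []
    for word in words:
        for i, ch in enumerate(word):
            pairs.append((ch, "B" if i == 0 else "I"))
    return pairs


def tags_to_words(pairs: List[Tuple[str, str]]) -> List[str]:
    """Rebuild the word list from (character, tag) pairs."""
    words: List[str] = []
    for ch, tag in pairs:
        if tag == "B" or not words:
            # A "B" tag (or the very first character) opens a new word.
            words.append(ch)
        else:
            # An "I" tag extends the current word.
            words[-1] += ch
    return words


# Hypothetical example sentence, already segmented into words.
example = ["我", "喜歡", "自然", "語言"]
pairs = words_to_tags(example)
restored = tags_to_words(pairs)
```

Under this encoding, a tagger only has to predict one label per character, and the word boundaries follow from wherever a "B" tag appears.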

Original language: English
Pages: 161-166
Number of pages: 6
Publication status: Published - 2008
Event: 6th SIGHAN Workshop on Chinese Language Processing, SIGHAN 2008 - Hyderabad, India
Duration: 11 Jan 2008 to 12 Jan 2008

Conference: 6th SIGHAN Workshop on Chinese Language Processing, SIGHAN 2008
Country/Territory: India
City: Hyderabad
Period: 11/01/08 to 12/01/08
