Applying maximum entropy to robust Chinese shallow parsing

Shih Hung Wu, Cheng Wei Shih, Chia Wei Wu, Tzong Han Tsai, Wen Lian Hsu

Research output: Contribution to conference (Paper, peer-review)

8 Scopus citations

Abstract

Recently, shallow parsing has been applied to various information processing systems, such as information retrieval, information extraction, question answering, and automatic document summarization. A shallow parser is suitable for online applications because it is much more efficient and less demanding than a full parser. In this research, we formulate shallow parsing as a sequential tagging problem and use a supervised machine learning technique, Maximum Entropy (ME), to build a Chinese shallow parser. The major features of the ME-based shallow parser are the part-of-speech (POS) tags and the context words in a sentence. We adopt the shallow parsing results of Sinica Treebank as our standard, and select 30,000 and 10,000 sentences from Sinica Treebank as the training set and test set, respectively. We then test the robustness of the shallow parser with noisy data. The experimental results show that the proposed shallow parser is quite robust for sentences with unknown proper nouns.
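
The sequential-tagging formulation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the B-/I-/O tag encoding, the one-token context window, the padding symbol, the chunk labels, the POS tags, the example sentence, and the function names `chunks_to_bio` and `token_features` are all assumptions made for illustration. The sketch only shows how chunks might be turned into per-token tags and how POS and context-word features might be collected for each token; a Maximum Entropy classifier would then be trained on such feature/tag pairs.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]           # (word, POS tag); the tag set here is illustrative
Chunk = Tuple[str, List[Token]]   # (chunk label or "O", tokens inside the chunk)


def chunks_to_bio(chunks: List[Chunk]) -> List[str]:
    """Flatten labelled chunks into one B-/I-/O tag per token (assumed encoding)."""
    tags: List[str] = []
    for label, tokens in chunks:
        for i, _ in enumerate(tokens):
            if label == "O":
                tags.append("O")
            else:
                tags.append(("B-" if i == 0 else "I-") + label)
    return tags


def token_features(sent: List[Token], i: int, window: int = 1) -> Dict[str, str]:
    """Collect the word and POS of token i and of its neighbours within `window`."""
    feats: Dict[str, str] = {}
    for off in range(-window, window + 1):
        j = i + off
        # Positions outside the sentence get a padding symbol (an assumption).
        word, pos = sent[j] if 0 <= j < len(sent) else ("<PAD>", "<PAD>")
        feats[f"word[{off}]"] = word
        feats[f"pos[{off}]"] = pos
    return feats


# Hypothetical example sentence; labels and POS tags are not taken from the paper.
example_chunks: List[Chunk] = [
    ("NP", [("張三", "Nb")]),
    ("O", [("喜歡", "VK")]),
    ("NP", [("古典", "A"), ("音樂", "Na")]),
]
example_sent: List[Token] = [tok for _, toks in example_chunks for tok in toks]
example_tags = chunks_to_bio(example_chunks)
example_feats = [token_features(example_sent, i) for i in range(len(example_sent))]
```

Each entry of `example_feats` pairs with the tag at the same index in `example_tags`, which is the form of training instance a per-token classifier would consume. Widening `window` adds more context words and POS tags to each instance.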

Original language: English
State: Published - 2005
Event: 17th Conference on Computational Linguistics and Speech Processing, ROCLING 2005 - Tainan, Taiwan
Duration: 15 Sep 2005 to 16 Sep 2005
