Semi-joint labeling for Chinese named entity recognition

Chia Wei Wu, Richard Tzong Han Tsai, Wen Lian Hsu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

Named entity recognition (NER) is an essential component of text mining applications. In Chinese sentences, words do not have delimiters; thus, incorporating word segmentation information into an NER model can improve its performance. Based on the framework of dynamic conditional random fields, we propose a novel labeling format, called semi-joint labeling, which partially integrates word segmentation information and named entity tags for NER. The model strengthens the interaction between segmentation tags and NER beyond what traditional approaches achieve. Moreover, it allows us to consider interactions between multiple chains in a linear-chain model. We use data from the SIGHAN 2006 NER bakeoff to evaluate the proposed model. The experimental results demonstrate that our approach outperforms state-of-the-art systems.

Original language: English
Title of host publication: Information Retrieval Technology - 4th Asia Information Retrieval Symposium, AIRS 2008, Revised Selected Papers
Pages: 107-116
Number of pages: 10
State: Published - 2008
Event: 4th Asia Information Retrieval Symposium, AIRS 2008 - Harbin, China
Duration: 15 Jan 2008 → 18 Jan 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4993 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th Asia Information Retrieval Symposium, AIRS 2008
Country/Territory: China
City: Harbin
Period: 15/01/08 → 18/01/08

Keywords

  • Chinese word segmentation
  • Named entity recognition
