Building a confused character set for Chinese spell checking

Lung Hao Lee, Wun Syuan Wu, Jian Hong Li, Yu Chi Lin, Yuen Hsien Tseng

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

4 Scopus citations

Abstract

In this paper, we describe the construction details of a confused character set for Chinese spell checking. The SIGHAN 2013-2015 bakeoff datasets are adopted to measure the performance of correct character suggestions. Our confusion set significantly outperforms the existing confusion set in candidate selection for automatic spelling checkers.
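As a rough illustration of what candidate selection with a confusion set involves, the following minimal sketch looks up replacement candidates for a character and measures how often the correct character appears among them. The entries in `CONFUSION_SET`, the function names `candidates` and `coverage`, and the test pairs are all hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's actual evaluation procedure on the SIGHAN datasets may differ.

```python
from typing import Dict, List, Set, Tuple

# Toy confusion set (hypothetical entries, NOT the set built in the paper).
# Each key maps a character to characters it is easily confused with.
CONFUSION_SET: Dict[str, Set[str]] = {
    "在": {"再"},        # similar pronunciation (both read "zai" in Mandarin)
    "己": {"已", "巳"},  # similar shape
}


def candidates(char: str) -> Set[str]:
    """Return the correction candidates for a possibly misspelled character."""
    return CONFUSION_SET.get(char, set())


def coverage(pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (wrong, correct) pairs whose correct character
    appears among the candidates of the wrong character."""
    if not pairs:
        return 0.0
    hits = sum(1 for wrong, correct in pairs if correct in candidates(wrong))
    return hits / len(pairs)


# Hypothetical (wrong, correct) character pairs, as might be extracted
# from an annotated spelling-error corpus.
pairs = [("在", "再"), ("己", "已"), ("他", "她")]
print(f"coverage = {coverage(pairs):.2f}")
```

A larger confusion set raises this kind of coverage but also gives the spell checker more candidates to rank, so the two are usually traded off against each other.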

Original language: English
Title of host publication: ICCE 2019 - 27th International Conference on Computers in Education, Proceedings
Editors: Maiga Chang, Hyo-Jeong So, Lung-Hsiang Wong, Fu-Yun Yu, Ju-Ling Shih, Ivica Boticki, Ming-Puu Chen, Ali Dewan, Stian Haklev, Elizabeth Koh, Tomoko Kojiri, Kuo-Chen Li, Daner Sun, Yun Wen
Publisher: Asia-Pacific Society for Computers in Education
Pages: 703-705
Number of pages: 3
ISBN (Electronic): 9789869721431
State: Published - 19 Nov 2019
Event: 27th International Conference on Computers in Education, ICCE 2019 - Kenting, Taiwan
Duration: 2 Dec 2019 to 6 Dec 2019

Publication series

Name: ICCE 2019 - 27th International Conference on Computers in Education, Proceedings
Volume: 1

Conference

Conference: 27th International Conference on Computers in Education, ICCE 2019
Country/Territory: Taiwan
City: Kenting
Period: 2/12/19 to 6/12/19

Keywords

  • Chinese spell checking
  • Confusion set
  • Pronunciation similarity
  • Shape similarity
