Textual analysis for studying Chinese historical documents and literary novels

Chao Lin Liu, Wen Huei Cheng, Guan Tao Jin, Wei Yun Chiu, Hongsu Wang, Richard Tzong Han Tsai, Qing Feng Liu, Yu Chun Wang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review



We analyzed historical and literary documents in Chinese to gain insights into research issues. This paper gives an overview of our studies, which drew on four different sources of text material. We investigated the history of concepts and transliterated words in China with the Database for the Study of Modern China Thought and Literature, which contains historical documents about China between 1830 and 1930. We also attempted to disambiguate names that were shared by multiple government officials who served between 618 and 1912 and were recorded in Chinese local gazetteers (/di4 fang1 zhi4/). To showcase the potential and challenges of computer-assisted analysis of Chinese literature, we explored some interesting yet non-trivial questions about two of the Four Great Classical Novels of China: (1) which monsters attempted to consume the Buddhist monk Xuanzang in the Journey to the West (/xi1 you2 ji4/, JTTW), which was published in the 16th century; (2) which was the most powerful monster in JTTW; and (3) which major character smiled the most in the Dream of the Red Chamber (/hong2 lou2 meng4/), which was published in the 18th century. Similar approaches can be applied to the analysis and study of modern documents, such as the newspaper articles published about the 228 incident that occurred in Taiwan in 1947. Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Original language: English
Title of host publication: Proceedings of the ASE BigData and SocialInformatics 2015, ASE BD and SI 2015
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450337359
State: Published - 7 Oct 2015
Event: ASE BigData and SocialInformatics, ASE BD and SI 2015 - Kaohsiung, Taiwan
Duration: 7 Oct 2015 to 9 Oct 2015

Publication series

Name: ACM International Conference Proceeding Series


Conference: ASE BigData and SocialInformatics, ASE BD and SI 2015


  • 228 incident in Taiwan
  • Computational linguistics
  • Digital humanities
  • Geographical analysis
  • History of concepts
  • Keyword collocation
  • Name disambiguation
  • Named entity recognition
  • Temporal analysis
  • Text mining
  • Textual analysis
  • Transliterated words in Chinese historical documents
