Cross-language article linking with different knowledge bases using bilingual topic model and translation features

Yu Chun Wang, Chun Kai Wu, Richard Tzong Han Tsai

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Creating links among online encyclopedia articles in different languages is crucial to the construction and integration of large multilingual knowledge bases. Most research to date has focused on linking different language versions of Wikipedia, yet other large online encyclopedias exist in a variety of languages. In this work, we present a cross-language article-linking method that uses a bilingual topic model and translation features in an SVM model to link articles in English Wikipedia and Chinese Baidu Baike, the most widely used wiki-like encyclopedia in China. To evaluate our approach, we compile data sets from Baidu Baike articles and their corresponding English Wikipedia articles. The evaluation results show that our approach achieves a mean reciprocal rank (MRR) of up to 0.8158, outperforming the baseline system by 0.1328 (+19.44%). Our method does not depend heavily on linguistic characteristics, and it can easily be extended to generate cross-language article links among online encyclopedias in other languages.
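For readers unfamiliar with the metric, the following is a minimal sketch of how MRR is conventionally computed for a ranking task of this kind, together with the arithmetic behind the reported gain. It is not the authors' evaluation code: the function names and the toy queries are hypothetical, and only the figures 0.8158 and 0.1328 come from the abstract.

```python
from typing import Hashable, Iterable, Sequence, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], gold: Hashable) -> float:
    """Return 1/rank of the gold item in the ranked list, or 0.0 if it is absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == gold:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    queries: Iterable[Tuple[Sequence[Hashable], Hashable]]
) -> float:
    """Average the reciprocal rank over all (ranked candidates, gold answer) pairs."""
    pairs = list(queries)
    if not pairs:
        return 0.0
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in pairs) / len(pairs)


# Hypothetical toy data: each entry is a ranked list of candidate target
# articles for one source article, plus the correct target article.
toy_queries = [
    (["Beijing", "Peking University", "Nanjing"], "Beijing"),  # gold at rank 1
    (["Yellow River", "Yangtze", "Pearl River"], "Yangtze"),   # gold at rank 2
    (["Tea", "Coffee", "Milk"], "Water"),                      # gold not retrieved
]
toy_mrr = mean_reciprocal_rank(toy_queries)  # (1 + 1/2 + 0) / 3

# Figures taken from the abstract: best MRR and absolute gain over the baseline.
reported_mrr = 0.8158
reported_gain = 0.1328
implied_baseline = reported_mrr - reported_gain              # baseline MRR
relative_gain_pct = 100 * reported_gain / implied_baseline   # relative improvement
```

Under these assumptions the implied baseline MRR is 0.6830, and the absolute gain of 0.1328 corresponds to the relative improvement of 19.44% quoted in the abstract.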

Original language: English
Pages (from-to): 228-236
Number of pages: 9
Journal: Knowledge-Based Systems
Volume: 111
State: Published - 1 Nov 2016

Keywords

  • Bilingual topic model
  • Cross-language article linking
  • Link discovery
