Cross-language article linking using cross-encyclopedia entity embedding

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Cross-language article linking (CLAL) is the task of finding corresponding article pairs of different languages across encyclopedias. This task is a difficult disambiguation problem in which one article must be selected among several candidate articles with similar titles and contents. Existing works focus on engineering text-based or link-based features for this task, which is a time-consuming job, and some of these features are only applicable within the same encyclopedia. In this paper, we address these problems by proposing crossencyclopedia entity embedding. Unlike other works, our proposed method does not rely on known cross-language pairs. We apply our method to CLAL between English Wikipedia and Chinese Baidu Baike. Our features improve performance relative to the baseline by 29.62%. Tested 30 times, our system achieved an average improvement of 2.76% over the current best system (26.86% over baseline), a statistically significant result.

Original languageEnglish
Title of host publicationShort Papers
PublisherAssociation for Computational Linguistics (ACL)
Pages334-339
Number of pages6
ISBN (Electronic)9781948087292
StatePublished - 2018
Event2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018 - New Orleans, United States
Duration: 1 Jun 20186 Jun 2018

Publication series

NameNAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Volume2

Conference

Conference2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
Country/TerritoryUnited States
CityNew Orleans
Period1/06/186/06/18

Fingerprint

Dive into the research topics of 'Cross-language article linking using cross-encyclopedia entity embedding'. Together they form a unique fingerprint.

Cite this