Near-synonym substitution using a discriminative vector space model

Liang Chih Yu, Lung Hao Lee, Jui Feng Yeh, Hsiu Min Shih, Yu Ling Lai

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Near-synonyms are fundamental and useful knowledge resources for computer-assisted language learning (CALL) applications. For example, in online language learning systems, learners may need to express a similar meaning using different words. However, choosing a near-synonym that fits a given context is usually difficult, especially for second language (L2) learners, because the differences between near-synonyms are not easily grasped in practical use. Accordingly, it is worth developing algorithms that verify whether a near-synonym matches a given context. Such algorithms could be used in applications that help L2 learners discover the collocational differences between near-synonyms. We propose a discriminative vector space model for the near-synonym substitution task, which we treat as a classification task. The model has two components: a vector space model and discriminative training. The vector space model serves as a baseline classifier that assigns each test example to one of the near-synonyms in a given near-synonym set. A discriminative training technique is then employed to improve the vector space model by distinguishing positive and negative features for each near-synonym. Experimental results show that the resulting model (DT-VSM) achieves higher accuracy than both the pointwise mutual information and n-gram-based methods used in previous studies.
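
The two components described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the feature weighting, the similarity measure, or the training rule, so the raw context-word counts, the cosine similarity, the perceptron-style update, and all names (build_vsm, classify, discriminative_train, step, epochs) are assumptions chosen only to show the general idea of a baseline classifier refined by discriminative training.

```python
import math
from collections import Counter, defaultdict


def build_vsm(train):
    """Baseline VSM: one context-word count vector per near-synonym.

    train is a list of (context_words, near_synonym) pairs (assumed format).
    """
    vsm = defaultdict(Counter)
    for words, label in train:
        vsm[label].update(words)
    return vsm


def cosine(vec, words):
    """Cosine similarity between a near-synonym vector and a context."""
    query = Counter(words)
    dot = sum(vec[w] * c for w, c in query.items())
    norm_v = math.sqrt(sum(v * v for v in vec.values()))
    norm_q = math.sqrt(sum(c * c for c in query.values()))
    return dot / (norm_v * norm_q) if norm_v and norm_q else 0.0


def classify(vsm, words):
    """Pick the near-synonym whose vector is most similar to the context."""
    return max(sorted(vsm), key=lambda label: cosine(vsm[label], words))


def discriminative_train(vsm, train, step=0.5, epochs=5):
    """Assumed perceptron-style refinement of the baseline vectors.

    On each misclassified training example, context words are treated as
    positive features for the correct near-synonym (weights raised) and as
    negative features for the wrongly predicted one (weights lowered).
    """
    for _ in range(epochs):
        for words, gold in train:
            pred = classify(vsm, words)
            if pred != gold:
                for w in words:
                    vsm[gold][w] += step
                    vsm[pred][w] = max(0.0, vsm[pred][w] - step)
    return vsm


# Toy contexts with the target word removed (invented for illustration).
train = [
    (["drank", "a", "cup", "of", "tea"], "strong"),
    (["made", "some", "coffee"], "strong"),
    (["bought", "a", "computer"], "powerful"),
    (["drives", "a", "engine"], "powerful"),
]
model = discriminative_train(build_vsm(train), train)
print(classify(model, ["cup", "of", "coffee"]))
print(classify(model, ["a", "fast", "computer"]))
```

In this toy setting the near-synonym set is {strong, powerful}, and each context is classified into the member of the set it most resembles. A real system would draw contexts from a large corpus and would likely use a weighting scheme other than raw counts.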

Original language: English
Pages (from-to): 74-84
Number of pages: 11
Journal: Knowledge-Based Systems
Volume: 106
State: Published - 15 Aug 2016

Keywords

  • Discriminative training
  • Lexical substitution
  • Natural language processing
  • Near-synonym learning
  • Vector space model
