Using XGBoost and Skip-Gram Model to Predict Online Review Popularity

Lien Thi Kim Nguyen, Hao Hsuan Chung, Kristine Velasquez Tuliao, Tom M.Y. Lin

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Review popularity is similar to awareness and information accessibility components: both have a profound effect on customer purchase decisions. Therefore, this study proposes a new method for predicting online review popularity that combines the extreme gradient boosting tree algorithm (XGBoost), which extracts key features on the basis of ranking scores, with the skip-gram model, which subsequently identifies semantically related words according to key textual terms. Findings revealed that written reviews had higher review popularity than non-textual reviews (reviewer and product factors). Moreover, the proposed method achieved higher prediction accuracy than the traditional ridge regression technique, as measured by Root Mean Squared Logarithmic Error (RMSLE). The main factors affecting review popularity and key reviewers for specific textual terms were also identified. Findings could help vendors identify key influencers for their product promotion and support the design of word-suggestion systems for online reviews.
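The abstract compares models by RMSLE without spelling out the metric. As a minimal sketch of its standard definition, the square root of the mean squared difference between log(1 + predicted) and log(1 + actual), the following may help; the function name `rmsle` and the vote counts are hypothetical illustrations, not data or code from the study.

```python
import math


def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error for non-negative targets."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        if actual < 0 or predicted < 0:
            raise ValueError("RMSLE is defined only for non-negative values")
        # log1p(x) computes log(1 + x), so zero counts are handled safely
        total += (math.log1p(predicted) - math.log1p(actual)) ** 2
    return math.sqrt(total / len(y_true))


# Hypothetical popularity counts (e.g. helpful votes); not from the paper
actual = [0, 3, 12, 150]
model_a = [1, 2, 10, 120]   # invented predictions from one model
model_b = [5, 9, 30, 40]    # invented predictions from another model

# A lower RMSLE indicates predictions closer to the actual counts
print(rmsle(actual, model_a) < rmsle(actual, model_b))
```

Because the metric works on logarithms, it penalizes relative rather than absolute errors, which tends to suit skewed count data such as review votes.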

Original language: English
Journal: SAGE Open
Issue number: 4
State: Published - 2020


  • extreme gradient boosting tree algorithm
  • online word of mouth
  • predictive models
  • review popularity
  • skip-gram model

