Abstract
Predicting the helpfulness of online reviews is an important research problem in electronic commerce and data mining. However, the datasets collected for analysing and predicting review helpfulness often contain missing attribute values, such as reviewer background and rating information. In the related literature, many studies have either used case deletion to remove the records containing missing values or imputed the missing values with the mean/mode method. None of them, however, has considered handling the missing values in online review datasets directly, without imputation, using decision tree-related techniques. In this paper, we therefore investigate the suitability of different types of approaches for solving the incomplete dataset problem of online reviews. Specifically, for missing value imputation, several supervised learning techniques, including MICE, KNN, SVM, and CART, are examined. For the direct handling approach without missing value imputation, CART is also used. The experimental results based on the TripAdvisor dataset for review helpfulness prediction show that handling incomplete online review datasets directly with CART, without imputation, significantly outperforms the other approaches, including case deletion and missing value imputation.
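To make the contrast between the preprocessing strategies named in the abstract concrete, the following is a minimal sketch of case deletion, mean imputation, and a simple KNN-style imputation. It is not the authors' implementation: the toy records, the column meanings, the function names, and the choice of `k=2` are all assumptions made for illustration, and the distance is computed on unscaled features for brevity. MICE-, SVM-, and CART-based imputation, as well as direct handling of missing values by CART, are not shown.

```python
import math
from statistics import mean

# Hypothetical toy records: [reviewer_contributions, rating, review_length].
# None marks a missing attribute value; all numbers are invented for illustration.
reviews = [
    [12.0, 4.0, 150.0],
    [None, 5.0, 300.0],
    [3.0, None, 80.0],
    [7.0, 3.0, 120.0],
    [20.0, 5.0, 400.0],
]


def case_deletion(rows):
    """Keep only the rows that have no missing values."""
    return [row[:] for row in rows if None not in row]


def mean_imputation(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows
    ]


def knn_imputation(rows, k=2):
    """Fill each missing value with the column mean over the k nearest donor rows.

    Distance is a root-mean-square difference over the columns observed in
    both rows (features are left unscaled to keep the sketch short).
    """

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(rows):
        new_row = row[:]
        for j, value in enumerate(row):
            if value is None:
                # Candidate donors: every other row that has column j observed.
                donors = [
                    (distance(row, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None
                ]
                donors.sort(key=lambda d: d[0])
                new_row[j] = mean(val for _, val in donors[:k])
        filled.append(new_row)
    return filled


if __name__ == "__main__":
    print("case deletion keeps", len(case_deletion(reviews)), "rows")
    print("mean imputation:", mean_imputation(reviews))
    print("KNN imputation:", knn_imputation(reviews))
```

Case deletion shrinks the dataset, while the two imputation functions keep every record but replace the gaps with estimates; the direct handling approach studied in the paper avoids both steps by letting the tree learner deal with the missing entries itself.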
Original language | English |
---|---|
Pages (from-to) | 971-987 |
Number of pages | 17 |
Journal | Journal of Experimental and Theoretical Artificial Intelligence |
Volume | 34 |
Issue number | 6 |
DOIs | |
State | Published - 2022 |
Keywords
- Online reviews
- data mining
- incomplete datasets
- missing values
Fingerprint
Dive into the research topics of 'An investigation of solutions for handling incomplete online review datasets with missing values'. Together they form a unique fingerprint.
Projects
- Past, Present, and Future for Missing Value Imputation (3/3)
Tsai, C.-F. (PI)
1/08/18 → 31/07/19
Project: Research