The number of citations received has been used as an indicator of the impact of academic publications. Developing tools to find papers that have the potential to become highly cited has recently attracted increasing scientific attention. The topics that concern scholars may change over time in accordance with research trends, resulting in changes in the citations received. Author-defined keywords, titles, and abstracts provide valuable information about a research article. This study applies latent Dirichlet allocation to extract topics and keywords from articles; five keyword popularity (KP) features are defined as indicators of the emerging trends of articles. Binary classification models, built with a number of supervised learning techniques, are used to predict whether papers are highly cited or less highly cited. We empirically compare the KP features of articles with other commonly used journal-related and author-related features proposed in previous studies. The results show that prediction models with KP features are more effective than those with journal and/or author features, especially in the management information systems discipline.
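The abstract does not define the five KP features, so the following is only a minimal sketch of what a keyword popularity feature could look like, not the authors' method. It assumes popularity means the share of papers from the preceding `window` years that use a keyword. The toy corpus, the function names, and the `kp_mean`/`kp_max` aggregates are all hypothetical, and the keywords are taken as given rather than extracted with latent Dirichlet allocation.

```python
from collections import Counter

# Toy corpus (hypothetical data): publication year and keywords per paper.
papers = [
    {"id": "p1", "year": 2015, "keywords": ["topic model", "citation"]},
    {"id": "p2", "year": 2016, "keywords": ["topic model", "deep learning"]},
    {"id": "p3", "year": 2016, "keywords": ["citation"]},
    {"id": "p4", "year": 2017, "keywords": ["topic model", "blockchain"]},
]


def keyword_popularity(papers, year, window=2):
    """Share of papers from the `window` years before `year` that use each keyword.

    This definition is an assumption made for illustration only.
    """
    recent = [p for p in papers if year - window <= p["year"] < year]
    if not recent:
        return {}
    # Count each keyword at most once per paper.
    counts = Counter(k for p in recent for k in set(p["keywords"]))
    return {k: c / len(recent) for k, c in counts.items()}


def kp_features(paper, papers, window=2):
    """Hypothetical per-paper features: mean and max popularity of its keywords."""
    pop = keyword_popularity(papers, paper["year"], window)
    scores = [pop.get(k, 0.0) for k in paper["keywords"]]
    if not scores:
        return {"kp_mean": 0.0, "kp_max": 0.0}
    return {"kp_mean": sum(scores) / len(scores), "kp_max": max(scores)}


# Features for the most recent toy paper, computed only from earlier papers.
features = kp_features(papers[3], papers)
```

For the last toy paper, "topic model" appears in two of the three papers from the two preceding years and "blockchain" in none, so `kp_max` is 2/3 and `kp_mean` is 1/3. Features of this kind could then be passed, alongside journal-related and author-related features, to any binary classifier that separates highly cited from less highly cited papers.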
|Journal|Journal of Informetrics|
|State|Published - Feb 2020|
Keywords
- binary classification
- highly-cited papers
- keyword popularity
- supervised learning
- topic model
Data for: Identification of highly-cited papers using topic-model-based and bibliometric features: the consideration of keyword popularity
Hu, Y. (Contributor), Tai, C. (Contributor), Liu, K. E. (Contributor) & Cai, C. (Contributor), Mendeley Data, 2020
DOI: 10.17632/bvbvyhdwxw.1, https://data.mendeley.com/datasets/bvbvyhdwxw