Abstract
The number of citations a publication receives has been used as an indicator of its academic impact. Developing tools to find papers that have the potential to become highly cited has recently attracted increasing scientific attention. The topics that interest scholars may change over time in accordance with research trends, resulting in changes in received citations. Author-defined keywords, titles, and abstracts provide valuable information about a research article. This study applies latent Dirichlet allocation to extract topics and keywords from articles; five keyword popularity (KP) features are defined as indicators of the emerging trends an article reflects. Binary classification models built with a number of supervised learning techniques are used to predict whether papers are highly cited or less highly cited. We empirically compare the KP features of articles with commonly used journal-related and author-related features proposed in previous studies. The results show that prediction models with KP features are more effective than those with journal and/or author features, especially in the management information systems discipline.
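The idea of keyword popularity can be illustrated with a minimal sketch. The abstract does not define the five KP features, so the definition below (the share of recently published papers that list a keyword) is a hypothetical stand-in, and the function names, the `window` parameter, and the toy corpus are all assumptions made for illustration only.

```python
def keyword_popularity(papers, keyword, year, window=3):
    """Share of papers from the `window` years before `year` that list `keyword`.

    Hypothetical definition for illustration; NOT one of the paper's KP features.
    """
    recent = [p for p in papers if year - window <= p["year"] < year]
    if not recent:
        return 0.0
    hits = sum(1 for p in recent if keyword in p["keywords"])
    return hits / len(recent)


def paper_kp_score(papers, paper, window=3):
    """Mean popularity of a paper's keywords, measured at its publication year."""
    keywords = paper["keywords"]
    if not keywords:
        return 0.0
    total = sum(
        keyword_popularity(papers, k, paper["year"], window) for k in keywords
    )
    return total / len(keywords)


# Toy corpus with made-up data; the keywords here would come from authors
# or from a topic model in a real pipeline.
papers = [
    {"year": 2015, "keywords": {"topic model", "citation"}},
    {"year": 2016, "keywords": {"topic model", "deep learning"}},
    {"year": 2017, "keywords": {"deep learning"}},
    {"year": 2018, "keywords": {"deep learning", "citation"}},
]

score_2018 = paper_kp_score(papers, papers[3])
```

A score of this kind could serve as one input column to a binary classifier, alongside journal-related and author-related features. Only earlier years are counted, so the feature uses no information from after a paper's publication.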
Original language | English |
---|---|
Article number | 101004 |
Journal | Journal of Informetrics |
Volume | 14 |
Issue number | 1 |
State | Published - Feb 2020 |
Keywords
- binary classification
- highly-cited papers
- keyword popularity
- supervised learning
- topic model
Datasets
- Data for: Identification of highly-cited papers using topic-model-based and bibliometric features: the consideration of keyword popularity
  Hu, Y.-H. (Contributor), Tai, C.-T. (Contributor), Liu, K. E. (Contributor) & Cai, C.-F. (Contributor), Mendeley Data, 13 Jan 2020
  DOI: 10.17632/bvbvyhdwxw.1, https://data.mendeley.com/datasets/bvbvyhdwxw