TY - JOUR
T1 - Differentiating regularization weights - A simple mechanism to alleviate cold start in recommender systems
AU - Chen, Hung Hsuan
AU - Chen, Pu
N1 - Publisher Copyright:
© 2019 Copyright held by the owner/author(s).
PY - 2019/1
Y1 - 2019/1
N2 - Matrix factorization (MF) and its extended methodologies have been studied extensively in the community of recommender systems in the last decade. Essentially, MF attempts to search for low-ranked matrices that can (1) best approximate the known rating scores, and (2) maintain low Frobenius norm for the low-ranked matrices to prevent overfitting. Since the two objectives conflict with each other, the common practice is to assign the relative importance weights as the hyper-parameters to these objectives. The two low-ranked matrices returned by MF are often interpreted as the latent factors of a user and the latent factors of an item that would affect the rating of the user on the item. As a result, it is typical that, in the loss function, we assign a regularization weight λp on the norms of the latent factors for all users, and another regularization weight λq on the norms of the latent factors for all the items. We argue that such a methodology probably over-simplifies the scenario. Alternatively, we probably should assign lower constraints to the latent factors associated with the items or users that reveal more information, and set higher constraints to the others. In this article, we systematically study this topic. We found that such a simple technique can improve the prediction results of the MF-based approaches based on several public datasets. Specifically, we applied the proposed methodology on three baseline models - SVD, SVD++, and the NMF models. We found that this technique improves the prediction accuracy for all these baseline models. Perhaps more importantly, this technique better predicts the ratings on the long-tail items, i.e., the items that were rated/viewed/purchased by few users. This suggests that this approach may partially remedy the cold-start issue.
The proposed method is very general and can be easily applied on various recommendation models, such as Factorization Machines, Field-aware Factorization Machines, Factorizing Personalized Markov Chains, Prod2Vec, Behavior2Vec, and so on. We release the code for reproducibility. We implemented a Python package that integrates the proposed regularization technique with the SVD, SVD++, and the NMF model.
AB - Matrix factorization (MF) and its extended methodologies have been studied extensively in the community of recommender systems in the last decade. Essentially, MF attempts to search for low-ranked matrices that can (1) best approximate the known rating scores, and (2) maintain low Frobenius norm for the low-ranked matrices to prevent overfitting. Since the two objectives conflict with each other, the common practice is to assign the relative importance weights as the hyper-parameters to these objectives. The two low-ranked matrices returned by MF are often interpreted as the latent factors of a user and the latent factors of an item that would affect the rating of the user on the item. As a result, it is typical that, in the loss function, we assign a regularization weight λp on the norms of the latent factors for all users, and another regularization weight λq on the norms of the latent factors for all the items. We argue that such a methodology probably over-simplifies the scenario. Alternatively, we probably should assign lower constraints to the latent factors associated with the items or users that reveal more information, and set higher constraints to the others. In this article, we systematically study this topic. We found that such a simple technique can improve the prediction results of the MF-based approaches based on several public datasets. Specifically, we applied the proposed methodology on three baseline models - SVD, SVD++, and the NMF models. We found that this technique improves the prediction accuracy for all these baseline models. Perhaps more importantly, this technique better predicts the ratings on the long-tail items, i.e., the items that were rated/viewed/purchased by few users. This suggests that this approach may partially remedy the cold-start issue.
The proposed method is very general and can be easily applied on various recommendation models, such as Factorization Machines, Field-aware Factorization Machines, Factorizing Personalized Markov Chains, Prod2Vec, Behavior2Vec, and so on. We release the code for reproducibility. We implemented a Python package that integrates the proposed regularization technique with the SVD, SVD++, and the NMF model.
KW - Recommender systems
KW - SVD
KW - SVD++
KW - cold start
KW - collaborative filtering
KW - long tail
KW - matrix factorization
UR - http://www.scopus.com/inward/record.url?scp=85061175486&partnerID=8YFLogxK
U2 - 10.1145/3285954
DO - 10.1145/3285954
M3 - Journal article
AN - SCOPUS:85061175486
SN - 1556-4681
VL - 13
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
IS - 1
M1 - A8
ER -