Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
9-2025
Abstract
In this paper, we examine the hypothesis that the interactions recorded in many recommendation systems datasets are distributed according to a low-rank distribution, i.e., a mixture of factorizable distributions. Surprisingly, we find that on several popular datasets, a simple non-negative matrix factorization (NMF) method equals or outperforms more modern methods such as LightGCN, which indicates that the sampling distribution over interactions is indeed low-rank. Furthermore, we mathematically prove that low-rank distributions are learnable with a sparse number of observations (where m and n refer to the numbers of users and items, and r to the non-negative rank), both in terms of the total variation norm and in terms of the expected recall at k, arguably providing some of the first generalization bounds for recommender systems in the implicit feedback setting. We also provide a modified version of the NMF algorithm which yields further performance improvements over the standard NMF baseline on the smaller datasets considered. Finally, we propose the theoretically grounded concept of empirical expected recall as an uncertainty estimate for probabilistic models of the recommendation task, and demonstrate its success in a setting where user-wise abstentions are allowed.
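As an illustration of the abstract's definition of a low-rank distribution as a mixture of factorizable distributions, the following is a minimal sketch in Python with NumPy. It is not the authors' code: the function name `low_rank_pmf`, the toy sizes, and the Dirichlet-sampled factors are assumptions chosen only to show that mixing r product distributions over (user, item) pairs gives a valid probability mass function whose matrix rank is at most r.

```python
import numpy as np


def low_rank_pmf(weights, user_factors, item_factors):
    """Mix r factorizable distributions into one joint PMF over (user, item) pairs.

    weights:      shape (r,), mixture weights summing to 1
    user_factors: shape (m, r), each column is a distribution over users
    item_factors: shape (n, r), each column is a distribution over items

    All names here are hypothetical and used for illustration only.
    """
    # P[i, j] = sum_k weights[k] * user_factors[i, k] * item_factors[j, k]
    return (user_factors * weights) @ item_factors.T


# Toy example with assumed sizes: m users, n items, r mixture components.
rng = np.random.default_rng(0)
m, n, r = 6, 5, 2

weights = rng.dirichlet(np.ones(r))                  # shape (r,)
user_factors = rng.dirichlet(np.ones(m), size=r).T   # shape (m, r), columns sum to 1
item_factors = rng.dirichlet(np.ones(n), size=r).T   # shape (n, r), columns sum to 1

pmf = low_rank_pmf(weights, user_factors, item_factors)

print("total mass:", pmf.sum())
print("matrix rank:", np.linalg.matrix_rank(pmf))
```

Each component on its own is a rank-one matrix (an outer product of a user distribution and an item distribution), so the mixture has rank at most r, and the non-negativity of all factors is what connects this model to non-negative matrix factorization.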
Keywords
Recommendation Systems, Probabilistic Modelling, Probability Mass Function (PMF) Estimation, Low-rank Methods, Nonnegative Matrix Factorization
Discipline
Databases and Information Systems
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
RecSys '25: Proceedings of the 19th ACM Conference on Recommender Systems, Prague, Czech Republic, September 22-26
First Page
1261
Last Page
1266
Identifier
10.1145/3705328.3759332
Publisher
ACM
City or Country
New York
Citation
POERNOMO, Jennifer; TAN, Nicole Gabrielle Lee; ALVES, Rodrigo; and LEDENT, Antoine.
Probabilistic modeling, learnability and uncertainty estimation for interaction prediction in movie rating datasets. (2025). RecSys '25: Proceedings of the 19th ACM Conference on Recommender Systems, Prague, Czech Republic, September 22-26. 1261-1266.
Available at: https://ink.library.smu.edu.sg/sis_research/10416
Copyright Owner and License
Authors
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3705328.3759332