Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
10-2025
Abstract
Learning user preferences in recommendation systems is enriched by multimodal features, such as textual and visual content, and amplified by multi-interest modeling with Variational AutoEncoders (VAEs). However, prior efforts are limited by single-modality focus and cumbersome, parameter-heavy architecture designs. To address these limitations, we introduce an innovative solution that blends the semantic richness of multimodal data with the representational power of multi-representation VAEs. Drawing inspiration from Mixture of Experts (MoE), we cast each VAE as an expert tailored to a specific modality, then fuse them via a novel parameter-merging function into a lean, unified model. This approach efficiently captures diverse user preferences behind multimodal data with minimal complexity. Rigorous experiments on real-world benchmarks show our method outshines state-of-the-art baselines while slashing parameter counts. Our work sets a new, streamlined standard for multimodal, multi-interest recommendation systems.
Keywords
multimodal recommendation, multi-representation VAE, parameter merging
Discipline
Artificial Intelligence and Robotics
Research Areas
Data Science and Engineering
Publication
MM '25: Proceedings of the 33rd ACM International Conference on Multimedia, Dublin, Ireland, October 27-31
First Page
6412
Last Page
6420
Identifier
10.1145/3746027.3754597
Publisher
ACM
City or Country
New York
Citation
TRAN, Nhu Thuat and LAUW, Hady Wirawan.
Parameter-efficient variational autoencoder for multimodal multi-interest recommendation. (2025). MM '25: Proceedings of the 33rd ACM International Conference on Multimedia, Dublin, Ireland, October 27-31. 6412-6420.
Available at: https://ink.library.smu.edu.sg/sis_research/10701
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3746027.3754597