Heterogeneous embedding propagation for large-scale e-commerce user alignment

Vincent W. ZHENG
Mo SHA
Yuchen LI, Singapore Management University
Hongxia YANG
Zhenjie ZHANG
Kian-Lee TAN

Abstract

We study the important problem of user alignment in e-commerce: to predict whether two online user identities that access an e-commerce site from different devices belong to one real-world person. As input, we have a set of user activity logs from Taobao and some labeled user identity linkages. User activity logs can be modeled using a heterogeneous interaction graph (HIG), and subsequently the user alignment task can be formulated as a semi-supervised HIG embedding problem. HIG embedding is challenging for two reasons: its heterogeneous nature and the presence of edge features. To address the challenges, we propose a novel Heterogeneous Embedding Propagation (HEP) model. The core idea is to iteratively reconstruct a node’s embedding from its heterogeneous neighbors in a weighted manner, and meanwhile propagate its embedding updates from reconstruction loss and/or classification loss to its neighbors. We conduct extensive experiments on large-scale datasets from Taobao, demonstrating that HEP significantly outperforms state-of-the-art baselines, often by more than 10% in F-scores.
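
To make the core idea concrete, the following is a minimal sketch of one reconstruct-and-propagate step, written under several assumptions that are not taken from the paper: each neighbor type has its own projection matrix (`type_proj`), each edge carries a single scalar weight standing in for a learned function of its edge features, and the reconstruction loss is a plain squared error. The names `reconstruct` and `propagate_step` are hypothetical, and the classification loss is omitted. It is an illustration of the general technique, not the HEP model itself.

```python
import numpy as np

def reconstruct(node, neighbors, emb, type_proj):
    """Weighted average of the node's neighbors, each projected by its type."""
    acc = np.zeros_like(emb[node])
    total_w = 0.0
    for nbr, nbr_type, w in neighbors[node]:
        # Assumption: one projection matrix per neighbor type handles heterogeneity.
        acc += w * (type_proj[nbr_type] @ emb[nbr])
        total_w += w
    return acc / total_w

def propagate_step(node, neighbors, emb, type_proj, lr=0.1):
    """One gradient step on 0.5 * ||h_v - reconstruction||^2.

    Updates the node's own embedding and propagates the update to its
    neighbors. Returns the loss measured before the update.
    """
    diff = emb[node] - reconstruct(node, neighbors, emb, type_proj)
    total_w = sum(w for _, _, w in neighbors[node])
    for nbr, nbr_type, w in neighbors[node]:
        # Gradient w.r.t. a neighbor is -(w / total_w) * W^T diff, so descend by adding.
        emb[nbr] += lr * (w / total_w) * (type_proj[nbr_type].T @ diff)
    emb[node] -= lr * diff
    return 0.5 * float(diff @ diff)

# Toy graph: one user identity linked to an item and a device (hypothetical data).
emb = {
    "user": np.array([1.0, 0.0]),
    "item": np.array([0.0, 2.0]),
    "device": np.array([2.0, 2.0]),
}
neighbors = {"user": [("item", "item", 1.0), ("device", "device", 3.0)]}
type_proj = {"item": np.eye(2), "device": np.eye(2)}

loss_before = propagate_step("user", neighbors, emb, type_proj)
loss_after = propagate_step("user", neighbors, emb, type_proj)
print(loss_before, loss_after)
```

In this toy setup the second call reports a smaller loss than the first, because both the user embedding and its neighbors have moved toward agreement. In a semi-supervised setting, a gradient from a classification loss on labeled identity pairs would be added to `diff` before it is propagated.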