Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
10-2013
Abstract
The problem of tagging is mostly considered from the perspectives of machine learning and data-driven philosophy. A fundamental issue underlying the success of these approaches is visual similarity, ranging from nearest neighbor search to manifold learning, which is used to identify similar instances of an example for tag completion. The need to search millions of visual examples in a high-dimensional feature space, however, makes the task computationally expensive. Moreover, the results can suffer from robustness problems when the underlying data, such as online videos, are rich in semantics and the similarity is difficult to learn from low-level features. This paper explores user search behavior through click-through data, which is widely available and freely accessible to search engines, to learn relationships between videos and apply those relationships as an economical way of annotating online videos. We demonstrate that a simple approach using co-click statistics yields promising results in contrast to feature-based similarity measurement. Furthermore, considering the long-tail effect, in which a few videos dominate most clicks, a new method based on polynomial semantic indexing is proposed to learn a latent space that alleviates the sparsity problem of click-through data. The proposed approaches are then applied to three major tasks in tagging: tag assignment, ranking, and enrichment. On a bipartite graph constructed from click-through data with over 15 million queries and 20 million video URL clicks, we show that annotation can be performed for free with competitive performance and minimal computing resources, representing a new and promising paradigm for video tagging in addition to machine learning and data-driven methodologies.
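As an illustration of the co-click idea mentioned in the abstract, the following is a minimal sketch only. The abstract does not state which co-click statistic the authors use, so the cosine similarity over per-query click counts, the tag-voting rule, the function names (`co_click_similarity`, `suggest_tags`), and the toy click log are all assumptions made for this example, not the paper's method.

```python
from collections import defaultdict
from math import sqrt


def co_click_similarity(click_log):
    """Cosine similarity between videos based on the queries that led to clicks.

    click_log: iterable of (query, video_id, click_count) tuples.
    This statistic is an assumption; the paper may define co-clicks differently.
    """
    # Build one sparse query -> clicks vector per video (one side of the bipartite graph).
    vectors = defaultdict(dict)
    for query, video, clicks in click_log:
        vectors[video][query] = vectors[video].get(query, 0) + clicks

    norms = {v: sqrt(sum(c * c for c in qs.values())) for v, qs in vectors.items()}

    sims = defaultdict(dict)
    videos = list(vectors)
    for i, a in enumerate(videos):
        for b in videos[i + 1:]:
            shared = vectors[a].keys() & vectors[b].keys()
            if not shared:
                continue  # no co-clicked query, so no edge between a and b
            dot = sum(vectors[a][q] * vectors[b][q] for q in shared)
            score = dot / (norms[a] * norms[b])
            sims[a][b] = score
            sims[b][a] = score
    return sims


def suggest_tags(video, sims, tags, top_k=3):
    """Rank candidate tags for `video` by summing the similarity of tagged neighbors."""
    scores = defaultdict(float)
    for other, score in sims.get(video, {}).items():
        for tag in tags.get(other, ()):
            scores[tag] += score
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_k]]


# Hypothetical toy data: (query, video_id, click_count).
toy_log = [
    ("barcelona goals", "v1", 5),
    ("barcelona goals", "v2", 3),
    ("messi highlights", "v2", 4),
    ("messi highlights", "v3", 2),
    ("cooking pasta", "v4", 7),
]
toy_tags = {"v1": ["football", "barcelona"], "v3": ["messi", "football"]}

toy_sims = co_click_similarity(toy_log)
toy_suggestions = suggest_tags("v2", toy_sims, toy_tags)
```

In this toy log, the untagged video `v2` shares queries with `v1` and `v3` and inherits their tags, while `v4` shares no query with any other video and receives no suggestions. That second case is a small-scale picture of the sparsity problem that the abstract's latent-space method is meant to alleviate. The pairwise loop is quadratic in the number of videos and would not scale to the data sizes quoted in the abstract.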
Keywords
Annotation, Click-through data, Tag assignment, Tag enrichment, Tag ranking, Video search, Video tagging
Discipline
Data Storage Systems | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
MM '13: Proceedings of the 21st ACM International Conference on Multimedia: October 21-25, Barcelona, Spain
First Page
977
Last Page
986
ISBN
9781450324045
Identifier
10.1145/2502081.2502085
Publisher
ACM
City or Country
Barcelona, Spain
Citation
TING, Yao; MEI, Tao; NGO, Chong-wah; and LI, Shipeng.
Annotation for free: Video tagging by mining user search behavior. (2013). MM '13: Proceedings of the 21st ACM International Conference on Multimedia: October 21-25, Barcelona, Spain. 977-986.
Available at: https://ink.library.smu.edu.sg/sis_research/6467
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.