Publication Type

Book Chapter

Version

publishedVersion

Publication Date

5-2019

Abstract

Effective indexing of social media data is key to searching for information on the social Web. However, the characteristics of social media data make it a challenging task. The large-scale and streaming nature is the first challenge, which requires the indexing algorithm to be able to efficiently update the indexing structure when receiving data streams. The second challenge is utilizing the rich meta-information of social media data for a better evaluation of the similarity between data objects and for a more semantically meaningful indexing of the data, which may allow the users to search for them using the different types of queries they like. Existing approaches based on either matrix operations or hashing usually cannot perform an online update of the indexing base to encode upcoming data streams, and they have difficulty handling noisy data. This chapter presents a study on using the Online Multimodal Co-indexing Adaptive Resonance Theory (OMC-ART) for an effective and efficient indexing and retrieval of social media data. More specifically, two types of social media data are considered: (1) the weakly supervised image data, which is associated with captions, tags and descriptions given by the users; and (2) the e-commerce product data, which includes product images, titles, descriptions and user comments. These scenarios make this study related to multimodal web image indexing and retrieval. Compared with existing studies, OMC-ART has several distinct characteristics. First, OMC-ART is able to perform online learning of sequential data. Second, instead of a plain indexing structure, OMC-ART builds a two-layer one, in which the first layer co-indexes the images by the key visual and textual features based on the generalized distributions of the clusters they belong to; while in the second layer, the data objects are co-indexed by their own feature distributions. Third, OMC-ART enables flexible multimodal searching by using either visual features, keywords, or a combination of both. Fourth, OMC-ART employs a ranking algorithm that does not need to go through the whole indexing system when only a limited number of images need to be retrieved. Experiments on two publicly accessible image datasets and a real-world e-commerce dataset demonstrate the efficiency and effectiveness of OMC-ART.

Discipline

Databases and Information Systems | Social Media | Theory and Algorithms

Research Areas

Data Science and Engineering

Publication

Adaptive resonance theory in social media data clustering: Roles, methodologies, and applications

First Page

155

Last Page

174

ISBN

9783030029852

Identifier

10.1007/978-3-030-02985-2_7

Publisher

Springer

City or Country

Cham

Embargo Period

12-17-2024

Additional URL

https://doi.org/10.1007/978-3-030-02985-2_7
