Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
3-2019
Abstract
Massive volumes of data continuously generated on social platforms have become an important information source for users. A primary method to obtain fresh and valuable information from social streams is social search. Although there have been extensive studies on social search, existing methods focus only on the relevance of query results and ignore their representativeness. In this paper, we propose a novel Semantic and Influence aware k-Representative (k-SIR) query for social streams based on topic modeling. Specifically, we consider that both user queries and elements are represented as vectors in the topic space. A k-SIR query retrieves a set of k elements with the maximum representativeness over the sliding window at query time w.r.t. the query vector. The representativeness of an element set comprises both semantic and influence scores computed by the topic model. Subsequently, we design two approximation algorithms, namely MULTI-TOPIC THRESHOLDSTREAM (MTTS) and MULTI-TOPIC THRESHOLDDESCEND (MTTD), to process k-SIR queries in real time. Both algorithms leverage the ranked lists maintained on each topic for k-SIR processing with theoretical guarantees. Extensive experiments on real-world datasets demonstrate the effectiveness of k-SIR queries compared with existing methods, as well as the efficiency and scalability of our proposed algorithms for k-SIR processing.
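The query setting described in the abstract can be illustrated with a minimal sketch. It is not the paper's MTTS or MTTD algorithm: it uses a plain greedy selection, and the scoring function `coverage` (query-weighted best coverage per topic) is a hypothetical stand-in for the semantic and influence scores, which the abstract does not define. The element ids, topic vectors, window size, and function names are all assumptions made for illustration.

```python
from collections import deque


def coverage(selected, query):
    """Hypothetical stand-in score (an assumption, not the paper's definition):
    for each topic, take the best weight among the selected elements and
    weight it by the query vector."""
    if not selected:
        return 0.0
    return sum(q * max(vec[i] for _, vec in selected)
               for i, q in enumerate(query))


def greedy_k_representative(window, query, k):
    """Pick up to k elements from the window by largest marginal gain.
    Plain greedy selection, used here only to illustrate the query."""
    chosen = []
    candidates = list(window)
    for _ in range(min(k, len(candidates))):
        base = coverage(chosen, query)
        best = max(candidates,
                   key=lambda e: coverage(chosen + [e], query) - base)
        chosen.append(best)
        candidates.remove(best)
    return chosen


# Sliding window holding the 4 most recent elements (assumed size).
window = deque(maxlen=4)
stream = [
    ("e1", [0.9, 0.1]),
    ("e2", [0.8, 0.2]),
    ("e3", [0.1, 0.9]),
    ("e4", [0.5, 0.5]),
    ("e5", [0.2, 0.7]),
]
for element in stream:
    window.append(element)  # "e1" is evicted once "e5" arrives

result = greedy_k_representative(window, query=[0.7, 0.3], k=2)
result_ids = [eid for eid, _ in result]
print(result_ids)
```

With these toy vectors the first pick is the element closest to the query's dominant topic, and the second pick is the one that adds the most coverage on the remaining topic rather than the next most similar element, which is the difference between relevance and representativeness that the abstract draws.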
Keywords
Approximation algorithms, Database systems, Semantics, Vector spaces, Information sources, Query results, Query vectors, Real-world datasets, Sliding window, Social streams, Theoretical guarantees, Topic modeling, Query processing
Discipline
Databases and Information Systems | Theory and Algorithms
Research Areas
Data Science and Engineering
Publication
Advances in Database Technology: Proceedings of the 22nd International Conference on Extending Database Technology EDBT 2019, March 26-29, Lisbon, Portugal
First Page
181
Last Page
192
ISBN
9783893180813
Identifier
10.5441/002/edbt.2019.17
Publisher
Open Proceedings
City or Country
Konstanz, Germany
Citation
WANG, Yanhao; LI, Yuchen; and TAN, Kianlee.
Semantic and influence aware k-representative queries over social streams. (2019). Advances in Database Technology: Proceedings of the 22nd International Conference on Extending Database Technology EDBT 2019, March 26-29, Lisbon, Portugal. 181-192.
Available at: https://ink.library.smu.edu.sg/sis_research/4371
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.5441/002/edbt.2019.17