Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
10-2008
Abstract
Applications in blog search and mining often face the challenge of handling huge volumes of blog data, in which a single blog may contain hundreds or even thousands of entries. We investigate novel techniques for profiling blogs by selecting a subset of representative entries for each blog. We propose two principles to guide the entry selection task: representativeness and diversity. We then formulate entry selection as a combinatorial optimization problem and propose a greedy yet effective algorithm that finds a good approximate solution by exploiting the theory of submodular functions. We suggest blog classification as a means of judging the proposed entry selection techniques and evaluate them on a real blog dataset, obtaining encouraging results.
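The abstract does not state the objective function, so the following is only a minimal sketch of greedy selection under a submodular objective, not the authors' method. It assumes a facility-location style coverage score over a precomputed entry-to-entry similarity matrix; the names `coverage`, `greedy_select`, and `toy_similarity` are hypothetical. With nonnegative similarities this score is monotone and submodular, which is the setting in which greedy selection is known to give a good approximation.

```python
from typing import List, Sequence


def coverage(similarity: Sequence[Sequence[float]], selected: List[int]) -> float:
    """Assumed objective: each entry is credited with its best match in `selected`."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity: Sequence[Sequence[float]], k: int) -> List[int]:
    """Pick up to k entry indices, each time adding the largest marginal gain."""
    selected: List[int] = []
    remaining = set(range(len(similarity)))
    current = 0.0
    while remaining and len(selected) < k:
        best, best_gain = -1, float("-inf")
        for j in sorted(remaining):
            # Marginal gain of adding entry j to the current selection.
            gain = coverage(similarity, selected + [j]) - current
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
        remaining.remove(best)
        current += best_gain
    return selected


# Toy blog (made-up numbers): entries 0 and 1 are near-duplicates,
# and entries 2 and 3 form a second, looser topic cluster.
toy_similarity = [
    [1.0, 0.9, 0.2, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.2, 0.1, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
chosen = greedy_select(toy_similarity, 2)
```

On the toy matrix, the first pick is entry 0 and the second pick comes from the other cluster rather than from entry 0's near-duplicate. That is how a coverage-style objective can reward representativeness and diversity at once.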
Keywords
Blog profiling, Blog classification, Entry selection
Discipline
Computer Sciences | Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
CIKM '08: Proceedings of the ACM 17th Conference on Information and Knowledge Management: Napa Valley, CA, October 26-30
First Page
1387
Last Page
1388
ISBN
9781595939913
Identifier
10.1145/1458082.1458293
Publisher
ACM
City or Country
New York
Citation
ZHUANG, Jinfeng; HOI, Steven C. H.; SUN, Aixin; and JIN, Rong.
Representative entry selection for profiling blogs. (2008). CIKM '08: Proceedings of the ACM 17th Conference on Information and Knowledge Management: Napa Valley, CA, October 26-30. 1387-1388.
Available at: https://ink.library.smu.edu.sg/sis_research/2382
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/1458082.1458293