Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
7-2008
Abstract
Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks, including document search, visualization, and summarization. In this paper, we study the problem of comments-oriented document summarization and aim to summarize a Web document (e.g., a blog post) by considering not only its content, but also the comments left by its readers. We identify three relations (namely, topic, quotation, and mention) by which comments can be linked to one another, and model the relations in three graphs. The importance of each comment is then scored by either (i) a graph-based method, where the three graphs are merged into a multi-relation graph, or (ii) a tensor-based method, where the three graphs are used to construct a 3rd-order tensor. To generate a comments-oriented summary, we extract sentences from the given Web document using either a feature-biased approach or a uniform-document approach. The former scores sentences with a bias toward keywords derived from comments, while the latter scores sentences uniformly with comments. In our experiments using a set of blog posts with manually labeled sentences, our proposed summarization methods utilizing comments showed significant improvement over those not using comments. The methods using the feature-biased sentence extraction approach were observed to outperform those using the uniform-document approach.
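As a rough illustration of the graph-based scoring idea in the abstract, the following Python sketch merges three comment-relation graphs into one weighted graph and ranks comments on it. The abstract does not say how the graphs are merged or which ranking algorithm is used, so the weighted sum, the PageRank-style iteration, the damping value, the function names `merge_graphs` and `score_comments`, and the toy data are all assumptions made here for illustration, not the authors' method.

```python
def merge_graphs(graphs, weights):
    """Weighted sum of same-sized adjacency matrices (lists of lists).

    Assumption: the multi-relation graph is a linear combination of the
    topic, quotation, and mention graphs.
    """
    n = len(graphs[0])
    merged = [[0.0] * n for _ in range(n)]
    for graph, weight in zip(graphs, weights):
        for i in range(n):
            for j in range(n):
                merged[i][j] += weight * graph[i][j]
    return merged


def score_comments(adj, damping=0.85, iterations=100):
    """PageRank-style scores; adj[i][j] > 0 means comment i links to comment j.

    Assumption: a standard damped random walk stands in for the paper's
    graph-based scoring.
    """
    n = len(adj)
    out_sums = [sum(row) for row in adj]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_sums[i] == 0:
                # A comment with no outgoing links spreads its score evenly.
                share = damping * scores[i] / n
                for j in range(n):
                    new_scores[j] += share
            else:
                for j in range(n):
                    if adj[i][j]:
                        new_scores[j] += damping * scores[i] * adj[i][j] / out_sums[i]
        scores = new_scores
    return scores


# Toy data (hypothetical): three comments, one adjacency matrix per relation.
topic = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
quotation = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
mention = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]

merged = merge_graphs([topic, quotation, mention], [1.0, 1.0, 1.0])
scores = score_comments(merged)
```

In this toy setup, comment 0 is quoted by comment 1 and mentioned by comment 2, so it ends up with the highest score. Keywords drawn from highly scored comments could then feed a feature-biased sentence scorer of the kind the abstract describes.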
Keywords
Blog, Comments, Document summarization, Graph-based scoring, Tensor-based scoring
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Publication
SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
First Page
291
Last Page
298
ISBN
9781605581644
Identifier
10.1145/1390334.1390385
Publisher
ACM
Citation
HU, Meishan; SUN, Aixin; and LIM, Ee Peng.
Comments-oriented document summarization: Understanding documents with readers' feedback. (2008). SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. 291-298.
Available at: https://ink.library.smu.edu.sg/sis_research/330
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
http://doi.org/10.1145/1390334.1390385
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons