Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2016
Abstract
In recent years there has been a growing interest in text quantification, a supervised learning task where the goal is to accurately estimate, in an unlabelled set of items, the prevalence (or "relative frequency") of each class c in a predefined set C. Text quantification has several applications, and is a dominant concern in fields such as market research, the social sciences, political science, and epidemiology. In this paper we tackle, for the first time, the problem of ordinal text quantification, defined as the task of performing text quantification when a total order is defined on the set of classes; estimating the prevalence of "five stars" reviews in a set of reviews of a given product, and monitoring this prevalence across time, is an example application. We present OQT, a novel tree-based ordinal quantification (OQ) algorithm, and discuss experimental results obtained on a dataset of tweets classified according to sentiment strength.
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016)
First Page
937
Last Page
940
Identifier
10.1145/2911451.2914749
Publisher
ACM Press
City or Country
Pisa, Italy
Citation
DA SAN MARTINO, Giovanni; GAO, Wei; and SEBASTIANI, Fabrizio.
Ordinal text quantification. (2016). Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016). 937-940.
Available at: https://ink.library.smu.edu.sg/sis_research/4569
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/2911451.2914749