Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2010
Abstract
Like traditional supervised and semi-supervised algorithms, learning to rank for information retrieval requires document annotations provided by domain experts. It is costly to annotate training data for different search domains and tasks. We propose to exploit training data annotated for a related domain to learn to rank retrieved documents in the target domain, in which no labeled data is available. We present a simple yet effective approach based on an instance-weighting scheme. Our method first estimates the importance of each related-domain document relative to the target domain. Heuristics are then studied to transform the importance of individual documents into pairwise weights of document pairs, which can be directly incorporated into popular ranking algorithms. Due to importance weighting, the ranking model trained on the related domain is highly adaptable to the data of the target domain. Ranking adaptation experiments on the LETOR3.0 dataset [27] demonstrate that, with a fair amount of related-domain training data, our method significantly outperforms the baseline without weighting, and most of the time is not significantly worse than an "ideal" model directly trained on the target domain.
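The weighting step described in the abstract can be illustrated with a minimal sketch. It assumes the per-document importance values have already been estimated by some means, and it uses the arithmetic mean or the product to combine two document weights into a pair weight. These combination rules, the hinge loss, and the names `pair_weights` and `weighted_pairwise_hinge` are illustrative assumptions, not necessarily the heuristics or the ranking algorithms studied in the paper.

```python
from itertools import combinations


def pair_weights(doc_weights, labels, combine="mean"):
    """Map each preference pair (more_relevant, less_relevant) to a weight.

    doc_weights: assumed per-document importance relative to the target domain.
    labels: relevance grades of the documents retrieved for one query.
    combine: "mean" or "product" (hypothetical combination heuristics).
    """
    weights = {}
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue  # equally relevant documents express no preference
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        w_hi, w_lo = doc_weights[hi], doc_weights[lo]
        if combine == "mean":
            weights[(hi, lo)] = (w_hi + w_lo) / 2
        else:
            weights[(hi, lo)] = w_hi * w_lo
    return weights


def weighted_pairwise_hinge(scores, weights):
    """Sum of pair-weighted hinge losses max(0, 1 - (s_hi - s_lo))."""
    return sum(
        w * max(0.0, 1.0 - (scores[hi] - scores[lo]))
        for (hi, lo), w in weights.items()
    )


# Toy query with three related-domain documents (made-up values).
doc_weights = [0.9, 0.1, 0.5]  # importance relative to the target domain
labels = [2, 0, 1]             # relevance grades
scores = [2.0, 0.0, 1.5]       # scores from some ranking function

weights = pair_weights(doc_weights, labels)
loss = weighted_pairwise_hinge(scores, weights)
```

In this sketch, a pair whose documents resemble the target domain contributes more to the training loss, so a pairwise ranker minimizing it leans toward the related-domain examples that look most like target-domain data.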
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2010)
First Page
162
Last Page
169
Identifier
10.1145/1835449.1835478
Publisher
ACM Press
City or Country
Geneva, Switzerland
Citation
GAO, Wei; CAI, Peng; WONG, Kam-Fai; and ZHOU, Aoying.
Learning to rank only using training data from related domain. (2010). Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2010). 162-169.
Available at: https://ink.library.smu.edu.sg/sis_research/4597
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/1835449.1835478