Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

4-2009

Abstract

Ranking for multilingual information retrieval (MLIR) is the task of ranking documents in different languages solely by their relevance to the query, regardless of the query’s language. Existing approaches focus on combining relevance scores from different retrieval settings, but do not learn the ranking function directly. We approach Web MLIR ranking within the learning-to-rank (L2R) framework. Besides adapting popular L2R algorithms to MLIR, we create a joint ranking model that exploits the correlations among documents and induces the joint relevance probability for all the documents. Using this method, the relevant documents of one language can be leveraged to improve the relevance estimation for documents of different languages. A probabilistic graphical model is trained for the joint relevance estimation. In particular, a hidden layer of nodes is introduced to represent the salient topics among the retrieved documents, and the ranks of the relevant documents and topics are determined collaboratively as the model approaches its thermal equilibrium. Furthermore, the model parameters are trained under two settings: (1) optimizing the accuracy of identifying relevant documents; (2) directly optimizing information retrieval evaluation measures, such as mean average precision. Benchmarks show that our model significantly outperforms the existing approaches for MLIR tasks.

Discipline

Databases and Information Systems

Research Areas

Data Science and Engineering

Publication

Proceedings of the 31st European Conference on Information Retrieval (ECIR 2009)

First Page

114

Last Page

125

Identifier

10.1007/978-3-642-00958-7_13

Publisher

LNCS, Springer

City or Country

Toulouse, France

Additional URL

https://doi.org/10.1007/978-3-642-00958-7_13
