Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

3-2010

Abstract

Recognizing the alternative ways people reference an entity is important for many Web applications that query structured data. In such applications, there is often a mismatch between how content creators describe entities and how different users try to retrieve them. In this paper, we consider the problem of determining whether a candidate query approximately matches an entity. We propose an off-line, data-driven, bottom-up approach that mines query logs for instances where Web content creators and Web users apply a variety of strings to refer to the same Web pages. This way, given a set of strings that reference entities, we generate an expanded set of equivalent strings for each entity. The proposed method is verified with experiments on real-life data sets showing that we can dramatically increase the number of queries that can be matched.
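
The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch and not the authors' algorithm. It assumes a click log of (query, clicked URL) pairs and treats a query as an equivalent string for an entity when a large enough share of its clicks land on pages also clicked for the entity's own string. The function name `expand_entity_strings`, the `min_overlap` threshold, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def expand_entity_strings(click_log, entity_strings, min_overlap=0.5):
    """Return {entity string: set of queries that mostly click the same pages}.

    click_log: iterable of (query, url) pairs, one pair per click (assumed format).
    min_overlap: hypothetical threshold on the share of a query's clicks that
    must land on pages also clicked for the entity string.
    """
    # Count clicks per normalized query and URL.
    clicks = defaultdict(Counter)
    for query, url in click_log:
        clicks[query.strip().lower()][url] += 1

    expanded = {}
    for entity in entity_strings:
        key = entity.strip().lower()
        entity_urls = set(clicks.get(key, ()))
        equivalents = set()
        for query, urls in clicks.items():
            if query == key or not entity_urls:
                continue
            total = sum(urls.values())
            shared = sum(n for url, n in urls.items() if url in entity_urls)
            if shared / total >= min_overlap:
                equivalents.add(query)
        expanded[entity] = equivalents
    return expanded


# Made-up sample log for illustration only.
sample_log = [
    ("canon eos 350d", "a.example/350d"),
    ("digital rebel xt", "a.example/350d"),
    ("digital rebel xt", "a.example/350d"),
    ("digital rebel xt", "b.example/other"),
    ("nikon d40", "c.example/d40"),
]
result = expand_entity_strings(sample_log, ["Canon EOS 350D"])
```

In this sample, "digital rebel xt" sends two of its three clicks to a page also clicked for "canon eos 350d", so it passes the assumed 0.5 threshold, while "nikon d40" shares no pages and is excluded. A production system would need more careful normalization and noise handling than this sketch attempts.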

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering

Publication

2010 IEEE 26th International Conference on Data Engineering ICDE: Long Beach, CA, March 1-6: Proceedings

First Page

713

Last Page

716

ISBN

9781424454457

Identifier

10.1109/ICDE.2010.5447817

Publisher

IEEE Computer Society

City or Country

Los Alamitos, CA

Copyright Owner and License

Authors

Additional URL

https://doi.ieeecomputersociety.org/10.1109/ICDE.2010.5447817
