On the relationship between query characteristics and IR functions retrieval bias
Abstract
Quantifying the bias of retrieval functions by means of document retrievability scores has recently emerged as an important evaluation measure for recall-oriented retrieval applications. While numerous studies have evaluated the retrieval bias of retrieval functions, solid validation of its impact on realistic types of queries is still limited, because there are no well-accepted criteria for generating the queries used to estimate retrievability. Commonly, random queries are used to approximate document retrievability, since the query space is prohibitively large and processing all queries takes too much time. In addition, a cumulative retrievability score of documents over all queries is used to analyze the bias of retrieval functions. However, this approach does not distinguish between different query characteristics (QCs) or account for their influence on the quantification of retrieval bias. This article provides an in-depth study of retrievability over different QCs and analyzes how lower or higher retrieval bias correlates with them. The strong correlation between retrieval bias and QCs observed in the experiments indicates that the retrieval bias of retrieval functions could be determined without processing an exhaustive query set. The experiments are validated on the TREC Chemical Retrieval Track collection, which consists of 1.2 million patent documents.
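The cumulative retrievability score and a per-QC bias comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the bias statistic, so the Gini coefficient used here is an assumption (it is a common inequality summary in retrievability studies), and the rank cutoff, the function names, and the toy rankings and query labels are all hypothetical.

```python
from collections import Counter, defaultdict


def retrievability(rankings, cutoff):
    """Count, per document, how many queries rank it within the top `cutoff`.

    `rankings` maps each query to its ranked list of document ids.
    The cumulative count is the document's retrievability score.
    """
    scores = Counter()
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return scores


def gini(values):
    """Gini coefficient of non-negative scores (0 = perfectly even).

    Assumption: Gini is used here as the bias summary; the abstract
    itself does not specify which statistic the article uses.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def bias_by_characteristic(rankings, query_labels, all_docs, cutoff):
    """Compute the bias separately for each query-characteristic group.

    `query_labels` maps each query to a (hypothetical) QC label such as
    "short" or "long". Documents never retrieved count as score 0.
    """
    groups = defaultdict(dict)
    for query, ranked_docs in rankings.items():
        groups[query_labels[query]][query] = ranked_docs
    result = {}
    for label, group_rankings in groups.items():
        scores = retrievability(group_rankings, cutoff)
        result[label] = gini([scores[doc] for doc in all_docs])
    return result


# Toy data, invented purely for illustration.
ALL_DOCS = ["d1", "d2", "d3", "d4"]
RANKINGS = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d1", "d3", "d2"],
    "q3": ["d2", "d1", "d4"],
    "q4": ["d4", "d3", "d1"],
}
QUERY_LABELS = {"q1": "short", "q2": "short", "q3": "long", "q4": "long"}

overall = retrievability(RANKINGS, cutoff=2)
overall_bias = gini([overall[doc] for doc in ALL_DOCS])
per_qc_bias = bias_by_characteristic(RANKINGS, QUERY_LABELS, ALL_DOCS, cutoff=2)
```

Comparing `overall_bias` with the entries of `per_qc_bias` shows the point the abstract makes: a single cumulative score can hide the fact that different groups of queries expose a retrieval function's bias to different degrees.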