The whens and hows of learning to rank for web search
Abstract
Keywords
References
Amati, G. (2003). Probabilistic models for information retrieval based on divergence from randomness. PhD thesis, Department of Computing Science, University of Glasgow.
Amati, G., Ambrosi, E., Bianchi, M., Gaibisso, C., & Gambosi, G. (2008). FUB, IASI-CNR and University of Tor Vergata at TREC 2007 Blog Track. In Proceedings of the 16th text retrieval conference, TREC ’07.
Arampatzis, A., Kamps, J., & Robertson, S. (2009). Where to stop reading a ranked list?: Threshold optimization using truncated score distributions. In Proceedings of the 32nd annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’09 (pp. 524–531). doi:10.1145/1571941.1572031.
Aslam, J. A., Kanoulas, E., Pavlu, V., Savev, S., & Yilmaz, E. (2009). Document selection methodologies for efficient and effective learning-to-rank. In Proceedings of the 32nd annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’09 (pp. 468–475). doi:10.1145/1571941.1572022.
Becchetti, L., Castillo, C., Donato, D., Leonardi, S., & Baeza-Yates, R. (2006). Link-based characterization and detection of Web spam. In Proceedings of the 2nd international workshop on adversarial information retrieval on the web, AIRWeb.
Beitzel, S. M., Jensen, E. C., Chowdhury, A., Grossman, D., Frieder, O., & Goharian, N. (2004). Fusion of effective retrieval strategies in the same information retrieval system. Journal of the American Society for Information Science and Technology, 55(10), 859–868. doi:10.1002/asi.20012.
Broder, A. Z., Carmel, D., Herscovici, M., Soffer, A., & Zien, J. (2003). Efficient query evaluation using a two-level retrieval process. In Proceedings of the 12th ACM international conference on information and knowledge management, CIKM ’03 (pp. 426–434). doi:10.1145/956863.956944.
Buckley, C., & Voorhees, E. M. (2000). Evaluating evaluation measure stability. In Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’00 (pp. 33–40). doi:10.1145/345508.345543.
Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., & Hullender, G. (2005). Learning to rank using gradient descent. In Proceedings of the 22nd international conference on machine learning, ICML ’05 (pp. 89–96). doi:10.1145/1102351.1102363.
Cambazoglu, B. B., Zaragoza, H., Chapelle, O., Chen, J., Liao, C., Zheng, Z., & Degenhardt, J. (2010). Early exit optimizations for additive machine learned ranking systems. In Proceedings of the third ACM international conference on web search and data mining, WSDM ’10 (pp. 411–420). doi:10.1145/1718487.1718538.
Carterette, B., Fang, H., Pavlu, V., & Kanoulas, E. (2010). Million query track 2009 overview. In Proceedings of the 18th text retrieval conference, TREC ’09.
Castillo, C., Donato, D., Becchetti, L., Boldi, P., Leonardi, S., Santini, M., & Vigna, S. (2006). A reference collection for web spam. SIGIR Forum, 40(2), 11–24. doi:10.1145/1189702.1189703.
Chapelle, O., & Chang, Y. (2011). Yahoo! learning to rank challenge overview. Journal of Machine Learning Research Proceedings Track, 14, 1–24.
Chapelle, O., Metlzer, D., Zhang, Y., & Grinspan, P. (2009). Expected reciprocal rank for graded relevance. In Proceedings of the 18th ACM international conference on information and knowledge management, CIKM ’09 (pp. 621–630). doi:10.1145/1645953.1646033.
Chapelle, O., Chang, Y., & Liu, T. Y. (2011). Future directions in learning to rank. Journal of Machine Learning Research Proceedings Track, 14, 91–100.
Clarke, C. L. A., Craswell, N., & Soboroff, I. (2010). Overview of the TREC 2009 web track. In Proceedings of the 18th text retrieval conference, TREC ’09.
Clarke, C. L. A., Craswell, N., & Soboroff, I. (2011). Overview of the TREC 2010 web track. In Proceedings of the 19th text retrieval conference, TREC ’10.
Coolican, H. (1999). Research methods and statistics in psychology. London: Hodder & Stoughton. http://books.google.co.uk/books?id=XmfGQgAACAAJ.
Cormack, G. V., Smucker, M. D., & Clarke, C. L. A. (2011). Efficient and effective spam filtering and re-ranking for large Web datasets. Information Retrieval, 14(5), 441–465. doi:10.1007/s10791-011-9162-z.
Craswell, N., & Hawking, D. (2004). Overview of TREC-2004 web track. In Proceedings of the 13th text retrieval conference, TREC ’04.
Craswell, N., Robertson, S., Zaragoza, H., & Taylor, M. (2005). Relevance weighting for query independent evidence. In Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’05 (pp. 416–423). doi:10.1145/1076034.1076106.
Craswell, N., Jones, R., Dupret, G., & Viegas, E. (Eds.) (2009). Proceedings of the 2009 workshop on web search click data. doi:10.1145/1507509.
Craswell, N., Fetterly, D., Najork, M., Robertson, S., & Yilmaz, E. (2010). Microsoft research at TREC 2009. In Proceedings of the 18th text retrieval conference, TREC ’09.
Croft, W. B. (2008). Learning about ranking and retrieval models. In Keynote, SIGIR 2007 workshop on learning to rank for information retrieval (LR4IR).
Donmez, P., & Carbonell, J. G. (2009). Active sampling for rank learning via optimizing the area under the ROC curve. In Proceedings of the 31st European conference on IR research on advances in information retrieval, ECIR ’09 (pp. 78–89). doi:10.1007/978-3-642-00958-7_10.
Donmez, P., Svore, K. M., & Burges, C. J. (2009). On the local optimality of LambdaRank. In Proceedings of the 32nd annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’09 (pp. 460–467). doi:10.1145/1571941.1572021.
Freund, Y., Iyer, R., Schapire, R. E., & Singer, Y. (2003). An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4, 933–969.
Friedman, J. H. (2000). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29, 1189–1232. doi:10.1214/aos/1013203451.
Ganjisaffar, Y., Caruana, R., & Lopes, C. (2011). Bagging gradient-boosted trees for high precision, low variance ranking models. In Proceedings of the 34th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’11 (pp. 85–94). doi:10.1145/2009916.2009932.
Hawking, D., Upstill, T., & Craswell, N. (2004). Toward better weighting of anchors. In Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’04 (pp. 512–513). doi:10.1145/1008992.1009096.
He, B., Macdonald, C., & Ounis, I. (2008). Retrieval sensitivity under training using different measures. In Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’08 (pp. 67–74). doi:10.1145/1390334.1390348.
Järvelin, K., & Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4), 422–446. doi:10.1145/582415.582418.
Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671–680. doi:10.1126/science.220.4598.671.
Kraaij, W., Westerveld, T., & Hiemstra, D. (2002). The importance of prior probabilities for entry page search. In Proceedings of the 25th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’02 (pp. 27–34). doi:10.1145/564376.564383.
Li, H. (2011). Learning to rank for information retrieval and natural language processing. Synthesis lectures on human language technologies. San Rafael: Morgan & Claypool Publishers. doi:10.2200/S00348ED1V01Y201104HLT012.
Liu, T. Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3), 225–331. doi:10.1561/1500000016.
Long, B., Chapelle, O., Zhang, Y., Chang, Y., Zheng, Z., & Tseng, B. (2010). Active learning for ranking through expected loss optimization. In Proceedings of the 33rd annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’10 (pp. 267–274). doi:10.1145/1835449.1835495.
Macdonald, C., & Ounis, I. (2009). Usefulness of quality click-through data for training. In Proceedings of the 2009 workshop on web search click data, WSCD ’09 (pp. 75–79). doi:10.1145/1507509.1507521.
Metzler, D. (2007). Automatic feature selection in the Markov random field model for information retrieval. In Proceedings of the 16th ACM international conference on information and knowledge management, CIKM ’07 (pp. 253–262). doi:10.1145/1321440.1321478.
Metzler, D., & Croft, W. B. (2005). A Markov random field model for term dependencies. In Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’05 (pp. 472–479). doi:10.1145/1076034.1076115.
Minka, T., & Robertson, S. (2008). Selection bias in the LETOR datasets. In SIGIR 2007 workshop on learning to rank for information retrieval (LR4IR).
Ounis, I., Amati, G., Plachouras, V., He, B., Macdonald, C., & Lioma, C. (2006). Terrier: A high performance and scalable information retrieval platform. In Proceedings of the 2nd workshop on open source information retrieval at SIGIR 2006, OSIR (pp. 18–25).
Page, L., Brin, S., Motwani, R., & Winograd, T. (1998). The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project.
Pedersen, J. (2008). The machine learned ranking story. http://jopedersen.com/Presentations/The_MLR_Story.pdf. Accessed July 30, 2012.
Pedersen, J. (2010). Query understanding at Bing. In Invited talk, SIGIR 2010 industry day.
Peng, J., Macdonald, C., He, B., Plachouras, V., & Ounis, I. (2007). Incorporating term dependency in the DFR framework. In Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’07 (pp. 843–844). doi:10.1145/1277741.1277937.
Piroi, F., & Zenz, V. (2011). Evaluating information retrieval in the intellectual property domain: The CLEF-IP campaign. In Current challenges in patent information retrieval, the information retrieval series (vol. 29, pp. 87–108). Berlin: Springer. doi:10.1007/978-3-642-19231-9_4.
Plachouras, V. (2006). Selective web information retrieval. PhD thesis, Department of Computing Science, University of Glasgow.
Plachouras, V., & Ounis, I. (2004). Usefulness of hyperlink structure for query-biased topic distillation. In Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’04 (pp. 448–455). doi:10.1145/1008992.1009069.
Plachouras, V., Ounis, I., & Amati, G. (2005). The static absorbing model for the web. Journal of Web Engineering, 4(2), 165–186.
Qin, T., Liu, T. Y., Xu, J., & Li, H. (2009). LETOR: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval, 13(4), 347–374.
Robertson, S. (2008). On the optimisation of evaluation metrics. In Keynote, SIGIR 2008 workshop on learning to rank for information retrieval (LR4IR).
Robertson, S., Walker, S., Hancock-Beaulieu, M., Gull, A., & Lau, M. (1992). Okapi at TREC. In Proceedings of the text retrieval conference, TREC-1.
Segalovich, I. (2010). Machine learning in search quality at Yandex. In Invited talk, SIGIR 2010 industry day.
Tomlinson, S., & Hedin, B. (2011). Measuring effectiveness in the TREC legal track. In Current challenges in patent information retrieval, the information retrieval series (vol. 29, pp. 167–180). Berlin: Springer. doi:10.1007/978-3-642-19231-9_8.
Voorhees, E. M., & Harman, D. K. (2005). TREC: Experiment and evaluation in information retrieval. Cambridge: MIT Press.
Weinberger, K., Mohan, A., & Chen, Z. (2010). Tree ensembles and transfer learning. In Proceedings of the Yahoo! learning to rank challenge workshop at WWW 2010.
Wu, Q., Burges, C. J. C., Svore, K. M., & Gao, J. (2008). Ranking, boosting, and model adaptation. Technical Report MSR-TR-2008-109, Microsoft.
Xu, J., & Li, H. (2007). AdaRank: A boosting algorithm for information retrieval. In Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’07 (pp. 391–398). doi:10.1145/1277741.1277809.
Yilmaz, E., & Robertson, S. (2010). On the choice of effectiveness measures for learning to rank. Information Retrieval, 13(3), 271–290. doi:10.1007/s10791-009-9116-x.
Zhai, C., & Lafferty, J. (2001). A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’01 (pp. 334–342). doi:10.1145/383952.384019.
Zhang, M., Kuang, D., Hua, G., Liu, Y., & Ma, S. (2009). Is learning to rank effective for web search? In Proceedings of the SIGIR 2008 workshop on learning to rank for information retrieval (LR4IR).