Diversifying recommendations on sequences of sets

The VLDB Journal - Volume 32 - Pages 283-304 - 2022
Sepideh Nikookar1, Mohammadreza Esfandiari1, Ria Mae Borromeo2, Paras Sakharkar1, Sihem Amer-Yahia3, Senjuti Basu Roy1
1New Jersey Institute of Technology, Newark, USA
2University of the Philippines, Los Baños, Philippines
3CNRS, Univ. Grenoble Alpes, Grenoble, France

Abstract

Diversifying recommendations on a sequence of sets (or sessions) of items captures a variety of applications. Notable examples include recommending online music playlists, where a session is a channel and multiple channels are listened to in sequence, and recommending tasks in crowdsourcing, where a session is a set of tasks and multiple task sessions are completed in sequence. Item diversity can be defined in more than one way, e.g., as genre diversity for music, or as a function of reward in crowdsourcing. A user who engages in multiple sessions may intend to experience diversity within and/or across sessions. Intra-session diversity is set-based, whereas Inter-session diversity is naturally sequence-based. This novel formulation gives rise to four bi-objective problems, each with the goal of minimizing or maximizing Inter and Intra diversity. We prove hardness and develop efficient algorithms with theoretical guarantees. Our experiments with human subjects on two real datasets show that our diversity formulations serve different user needs and yield high user satisfaction. Our large-scale experiments on real and synthetic data empirically demonstrate that our solutions satisfy our theoretical bounds and are highly scalable compared to baselines.
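
The distinction between set-based and sequence-based diversity can be illustrated with a minimal sketch. The abstract does not give the paper's definitions, so the measures below are assumptions made only for illustration: each item is reduced to a single number (e.g., a reward), Intra diversity is taken as the mean absolute pairwise difference within a session, and Inter diversity as the sum of absolute differences between the mean values of consecutive sessions. The function names `intra_diversity` and `inter_diversity` are hypothetical.

```python
from itertools import combinations


def intra_diversity(session):
    """Set-based (assumed measure): mean absolute pairwise difference
    between the item values of one session. Item order is irrelevant."""
    pairs = list(combinations(session, 2))
    if not pairs:
        return 0.0
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def inter_diversity(sessions):
    """Sequence-based (assumed measure): sum of absolute differences
    between the mean values of consecutive sessions. Session order matters."""
    means = [sum(s) / len(s) for s in sessions]
    return sum(abs(x - y) for x, y in zip(means, means[1:]))


# Three sessions of items, each item represented by one illustrative value.
sessions = [[1.0, 5.0, 9.0], [2.0, 2.0, 2.0], [4.0, 8.0]]

intra_scores = [intra_diversity(s) for s in sessions]
inter_original = inter_diversity(sessions)

# Reordering the sessions leaves every Intra score unchanged
# but changes the Inter score.
reordered = [sessions[1], sessions[0], sessions[2]]
inter_reordered = inter_diversity(reordered)

print(intra_scores, inter_original, inter_reordered)
```

Under these assumed measures, swapping the first two sessions changes the Inter score while each session's Intra score stays the same, which is the sense in which one measure is sequence-based and the other set-based. Choosing to minimize or maximize each of the two scores gives the four bi-objective combinations mentioned in the abstract.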
