A meta-learning configuration framework for graph-based similarity search indexes

Information Systems - Tập 112 - Trang 102123 - 2023
Rafael S. Oyamada1, Larissa C. Shimomura2, Sylvio Barbon3, Daniel S. Kaster4
1University of Milan, Milan, Italy
2Eindhoven University of Technology, Eindhoven, Netherlands
3University of Trieste, Trieste, Italy
4University of Londrina, Londrina, Parana, Brazil

Tài liệu tham khảo

Zezula, 2006, vol. 32 Navarro, 2002, Searching in metric spaces by spatial approximation, VLDB J., 11, 28, 10.1007/s007780200060 Jr., 2000, Slim-trees: High performance metric trees minimizing overlap between nodes, vol. 1777, 51 Vieira, 2004, DBM-tree: A dynamic metric access method sensitive to local density data Chen, 2017, Efficient metric indexing for similarity search and similarity joins, IEEE Trans. Knowl. Data Eng., 29, 556, 10.1109/TKDE.2015.2506556 Indyk, 1998, Approximate nearest neighbors: Towards removing the curse of dimensionality, 604 Zhang, 2013, Fast kNN graph construction with locality sensitive hashing, vol. 8189, 660 Wang, 2014 Liu, 2019, I-LSH: I/O efficient c-approximate nearest neighbor search in high-dimensional space, 1670 Jafari, 2021 Esuli, 2012, Use of permutation prefixes for efficient and scalable approximate similarity search, Inf. Process. Manag., 48, 889, 10.1016/j.ipm.2010.11.011 Amato, 2014, MI-File: using inverted files for scalable approximate similarity search, Multimedia Tools Appl., 71, 1333, 10.1007/s11042-012-1271-1 Naidan, 2015, Permutation search methods are efficient, yet faster search is possible, PVLDB, 8, 1618 Novak, 2016, PPP-codes for large-scale similarity searching, Trans. Large Scale Data Knowl. Centered Syst., 24, 61 Figueroa, 2018, New permutation dissimilarity measures for proximity searching, vol. 11223, 122 Jaromczyk, 1992, Relative neighborhood graphs and their relatives, Proc. IEEE, 80, 1502, 10.1109/5.163414 Navarro, 2009, Dynamic spatial approximation trees for massive data, 81 Paredes, 2005, Using the k-nearest neighbor graph for proximity searching in metric spaces, vol. 3772, 127 Malkov, 2014, Approximate nearest neighbor algorithm based on navigable small world graphs, Inf. Syst., 45, 61, 10.1016/j.is.2013.10.006 Iwasaki, 2018 Fu, 2019, Fast approximate nearest neighbor search with the navigating spreading-out graph, Proc. VLDB Endow., 12, 461, 10.14778/3303753.3303754 Shimomura, 2021, A survey on graph-based methods for similarity searches in metric spaces, Inf. Syst., 95, 10.1016/j.is.2020.101507 Falchi, 2007, A content–addressable network for similarity search in metric spaces, 98 Batko, 2006, M-grid: Similarity searching in grid, 17 Hajebi, 2011, Fast approximate nearest-neighbor search with k-nearest neighbor graph, 1312 Aumüller, 2020, ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms, Inf. Syst., 87, 10.1016/j.is.2019.02.006 Oyamada, 2020, Towards proximity graph auto-configuration: An approach based on meta-learning, vol. 12245, 93 Paredes, 2006, Practical construction of k-nearest neighbor graphs in metric spaces, vol. 4007, 85 Arya, 1998, An optimal algorithm for approximate nearest neighbor searching fixed dimensions, J. ACM, 45, 891, 10.1145/293347.293348 Dong, 2011, Efficient k-nearest neighbor graph construction for generic similarity measures, 577 Aoyama, 2009, Fast similarity search in small-world networks, vol. 207, 185 Barton, 2010, Toward self-organizing search systems, 49 Kleinberg, 2000, The small-world phenomenon: an algorithmic perspective, 163 Malkov, 2020, Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs, IEEE Trans. Pattern Anal. Mach. Intell., 42, 824, 10.1109/TPAMI.2018.2889473 Brazdil, 2009 Smith-Miles, 2008, Towards insightful algorithm selection for optimisation using meta-learning concepts, 4118 Priya, 2012, Using genetic algorithms to improve prediction of execution times of ML tasks, vol. 7208, 196 Lemke, 2015, Metalearning: a survey of trends and technologies, Artif. Intell. Rev., 44, 117, 10.1007/s10462-013-9406-y Mantovani, 2015, To tune or not to tune: Recommending when to adjust SVM hyper-parameters via meta-learning, 1 Soares, 2004, A meta-learning method to select the kernel width in support vector regression, Mach. Learn., 54, 195, 10.1023/B:MACH.0000015879.28004.9b Campos, 2019, Machine learning hyperparameter selection for contrast limited adaptive histogram equalization, EURASIP, 2019, 59 Sun, 2013, Pairwise meta-rules for better meta-learning-based algorithm ranking, Mach. Learn., 93, 141, 10.1007/s10994-013-5387-y Vanschoren, 2018 Korn, 2001, On the “dimensionality curse” and the “self-similarity blessing”, IEEE Trans. Knowl. Data Eng., 96, 10.1109/69.908983 Houle, 2017, Local intrinsic dimensionality I: An extreme-value-theoretic foundation for similarity applications, vol. 10609, 64 Aumüller, 2019, Benchmarking nearest neighbor search: Influence of local intrinsic dimensionality and result diversity in real-world datasets, vol. 2436, 14 Aggarwal, 2001, On the surprising behavior of distance metrics in high dimensional spaces, vol. 1973, 420 Costa, 2004, Geodesic entropic graphs for dimension and entropy estimation in manifold learning, IEEE Trans. Signal Process., 52, 2210, 10.1109/TSP.2004.831130 François, 2007, The concentration of fractional distances, IEEE Trans. Knowl. Data Eng., 19, 873, 10.1109/TKDE.2007.1037 Levina, 2004, Maximum likelihood estimation of intrinsic dimension, 777 Lorena, 2019, How complex is your classification problem?: A survey on measuring classification complexity, ACM Comput. Surv., 52, 107:1 Lin, 1994, The TV-tree: An index structure for high-dimensional data, VLDB J., 3, 517, 10.1007/BF01231606 Berchtold, 1996, The X-tree : An index structure for high-dimensional data, 28 Chakrabarti, 2000, Local dimensionality reduction: A new approach to indexing high dimensional spaces, 89 Echihabi, 2019, Return of the lernaean hydra: Experimental evaluation of data series approximate similarity search, Proc. VLDB Endow., 13, 403, 10.14778/3368289.3368303 Zhou, 2019, Möbius transformation for fast inner product search on graph, 8216 Zhao, 2020, SONG: Approximate nearest neighbor search on GPU, 1033 Augustine, 2021, A generalized approach for reducing expensive distance calls for a broad class of proximity problems, 142 Li, 2016 Wang, 2021, A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search, Proc. VLDB Endow., 14, 1964, 10.14778/3476249.3476255 Muja, 2009, Fast approximate nearest neighbors with automatic algorithm configuration, 331 Zhou, 2020, Database meets artificial intelligence: A survey, IEEE Trans. Knowl. Data Eng., 1 Baranchuk, 2019 Baranchuk, 2019, Learning to route in similarity graphs, vol. 97, 475 Li, 2020, Improving approximate nearest neighbor search through learned adaptive early termination, 2539 Antol, 2021, Learned metric index - proposition of learned indexing for unstructured data, Inf. Syst., 100, 10.1016/j.is.2021.101774 Slanináková, 2021, Data-driven learned metric index: An unsupervised approach, vol. 13058, 81 Hünemörder, 2021, Towards a learned index structure for approximate nearest neighbor search query processing, vol. 13058, 95 Aumüller, 2021, The role of local dimensionality measures in benchmarking nearest neighbor search, Inf. Syst., 101, 10.1016/j.is.2021.101807 Barber, 1996, The quickhull algorithm for convex hulls, ACM Trans. Math. Softw., 22, 469, 10.1145/235815.235821 Ester, 1996, A density-based algorithm for discovering clusters in large spatial databases with noise, 226 Bolettieri, 2009 Boytsov, 2013, Engineering efficient and effective non-metric space library, vol. 8199, 280