Parallel rare term vector replacement: Fast and effective dimensionality reduction for text
References
Aggarwal, 2010, The generalized dimensionality reduction problem, 607
Aizerman, 1964, Theoretical foundations of the potential function method in pattern recognition learning, Automat. Remote Control, 25, 821
Bartell, 1992, Latent semantic indexing is an optimal special case of multidimensional scaling, 161
T. Berka, M. Vajteršic, Dimensionality reduction for information retrieval using vector replacement of rare terms, in: Proc. TM, 2011.
Berry, 1993, Massively-parallel implementations of Lanczos algorithms for computing the SVD of large sparse matrices, 437
Berry, 1999, Matrices, vector spaces, and information retrieval, SIAM Rev., 41, 335, 10.1137/S0036144598347035
Berry, 2006, vol. 184, 117
Campoy, 2009, Dimensionality reduction by self organizing maps that preserve distances in output space, 2976
Cancho, 2003, Least effort and the origins of scaling in human language, Proc. Natl. Acad. Sci. USA, 100, 788, 10.1073/pnas.0335980100
Chen, 2009, Lanczos vectors versus singular vectors for effective dimension reduction, IEEE Trans. Knowl. Data Eng., 21, 1091, 10.1109/TKDE.2008.228
Cox, 2001
Cuzzocrea, 2006, Accuracy control in compressed multidimensional data cubes for quality of answer-based OLAP tools, 301
Cuzzocrea, 2006, A hierarchy-driven compression technique for advanced OLAP visualization of multidimensional data cubes, vol. 4081, 106
Deerwester, 1990, Indexing by latent semantic analysis, J. Am. Soc. Inf. Sci., 41, 391, 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
Dhillon, 2001, Concept decompositions for large sparse text data using clustering, Mach. Learn., 42, 143, 10.1023/A:1007612920971
Eckart, 1936, The approximation of one matrix by another of lower rank, Psychometrika, 1, 211, 10.1007/BF02288367
Faloutsos, 1995, FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets, 163
E. Gallopoulos, D. Zeimpekis, CLSI: a flexible approximation scheme from clustered term-document matrices, in: Proc. SDM, 2005, pp. 631–635.
Hofmann, 1999, Probabilistic latent semantic indexing, 50
M.P. Holmes, A.G. Gray, C.L. Isbell, QUIC-SVD: fast SVD using cosine trees, in: Proc. NIPS, 2009, pp. 673–680.
Hussain, 2010, Text categorization using word similarities based on higher order co-occurrences, 1
Hyvarinen, 2001
Janecek, 2010, Utilizing nonnegative matrix factorization for e-mail classification problems
Johnson, 2007
Jolliffe, 2002
Karypis, 2000, Fast supervised dimensionality reduction algorithm with applications to document categorization & retrieval, 12
Kobayashi, 2002, Matrix computations for information retrieval and major and outlier cluster detection, J. Comput. Appl. Math., 149, 119, 10.1016/S0377-0427(02)00524-1
Lan, 2005, A comprehensive comparative study on term weighting schemes for text categorization with support vector machines, 1032
Langville, 2008, Nonnegative matrix factorization for document classification, 339
Lewis, 2004, RCV1: a new benchmark collection for text categorization research, J. Mach. Learn. Res., 5, 361
Mao, 2007, The phrase-based vector space model for automatic retrieval of free-text medical documents, Data Knowl. Eng., 61, 76, 10.1016/j.datak.2006.02.008
MPI forum, MPI: a message-passing interface standard, Tech. Rep., Knoxville, TN, USA, 1994.
Paatero, 1994, Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values, Environmetrics, 5, 111, 10.1002/env.3170050203
Powers, 1998, Applications and explanations of Zipf’s law, 151
Roweis, 2000, Nonlinear dimensionality reduction by locally linear embedding, Science, 290, 2323, 10.1126/science.290.5500.2323
Sakellaridi, 2008, Graph-based multilevel dimensionality reduction with applications to eigenfaces and latent semantic indexing, 194
Schölkopf, 1998, Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput., 10, 1299, 10.1162/089976698300017467
Tenenbaum, 2000, A global geometric framework for nonlinear dimensionality reduction, Science, 290, 2319, 10.1126/science.290.5500.2319
Tsatsaronis, 2009, A generalized vector space model for text retrieval based on semantic relatedness, 70
Wong, 1985, Generalized vector spaces model in information retrieval, 18