Selection of relevant features and examples in machine learning
References
Aha, 1990, A study of instance-based algorithms for supervised learning tasks: mathematical, empirical and psychological evaluations
Aha, 1996, A comparative evaluation of sequential feature selection algorithms
Almuallim, 1991, Learning with many irrelevant features, 547
Angluin, 1987, Learning regular sets from queries and counterexamples, Inform. and Comput., 75, 87, 10.1016/0890-5401(87)90052-6
Angluin, 1993, Learning read-once formulas with queries, J. ACM, 40, 185, 10.1145/138027.138061
Armstrong, 1993, WebWatcher: a learning apprentice for the World Wide Web
Baluja, 1997, Dynamic relevance: vision-based focus of attention using artificial neural networks (Technical Note), Artificial Intelligence, 97, 381, 10.1016/S0004-3702(97)00065-9
Blum, 1992, Learning Boolean functions in an infinite attribute space, Machine Learning, 9, 373, 10.1007/BF00994112
Blum, 1995, Empirical support for Winnow and weighted-majority based algorithms: results on a calendar scheduling domain, 64
Blum, 1994, Weakly learning DNF and characterizing statistical query learning using Fourier analysis, 253
Blum, 1995, Learning in the presence of finitely or infinitely many irrelevant attributes, J. Comput. System Sci., 50, 32, 10.1006/jcss.1995.1004
Blum, 1993, Learning an intersection of k halfspaces over a uniform distribution, 312
Blumer, 1987, Occam's razor, Inform. Process. Lett., 24, 377, 10.1016/0020-0190(87)90114-1
Blumer, 1989, Learnability and the Vapnik-Chervonenkis dimension, J. ACM, 36, 929, 10.1145/76359.76371
Breiman, 1984
Bshouty, 1993, Exact learning via the monotone theory, 302
Cardie, 1993, Using decision trees to improve case-based learning, 25
Caruana, 1994, Greedy attribute selection, 28
Caruana, 1994, How useful is relevance?, 25
Catlett, 1992, Peepholing: choosing attributes efficiently for megainduction, 49
Cesa-Bianchi, 1993, How to use expert advice, 382
Clark, 1989, The CN2 induction algorithm, Machine Learning, 3, 261, 10.1007/BF00116835
Cohn, 1996, Active learning with statistical models, J. Artif. Intell. Research, 4, 129, 10.1613/jair.295
Comon, 1994, Independent component analysis: a new concept, Signal Process., 36, 287, 10.1016/0165-1684(94)90029-9
Cover, 1967, Nearest neighbor pattern classification, IEEE Trans. Inform. Theory, 13, 21, 10.1109/TIT.1967.1053964
Daelemans, 1994, The acquisition of stress: a data-oriented approach, Comput. Linguistics, 20, 421
Devijver, 1982
Dhagat, 1994, PAC learning with irrelevant attributes, 64
Doak, 1992, An evaluation of feature-selection methods and their application to computer security
Drucker, 1992, Improving performance in neural networks using a boosting algorithm, Vol. 4
Drucker, 1994, Boosting and other machine learning algorithms, 53
Dyer, 1989, A random polynomial time algorithm for approximating the volume of convex bodies, 375
Freund, 1990, Boosting a weak learning algorithm by majority, 202
Freund, 1992, An improved boosting algorithm and its implications on learning complexity, 391
Garey, 1979
Gil, 1993, Efficient domain-independent experimentation, 128
Greiner, 1997, Knowing what doesn't matter: exploiting the omission of irrelevant data, Artificial Intelligence, 97, 345, 10.1016/S0004-3702(97)00048-9
Gross, 1991, Concept acquisition through attribute evolution and experiment selection
Haussler, 1986, Quantifying the inductive bias in concept learning, 485
Holte, 1993, Very simple classification rules perform well on most commonly used domains, Machine Learning, 11, 63, 10.1023/A:1022631118932
Jackson, 1994, An efficient membership-query algorithm for learning DNF with respect to the uniform distribution
John, 1994, Irrelevant features and the subset selection problem, 121
John, 1996, Static vs. dynamic sampling for data mining, 367
Jolliffe, 1986
Johnson, 1974, Approximation algorithms for combinatorial problems, J. Comput. System Sci., 9, 256, 10.1016/S0022-0000(74)80044-9
Kearns, 1994
Kira, 1992, A practical approach to feature selection, 249
Kivinen, 1995, Additive versus exponentiated gradient updates for linear prediction, 209
Kivinen, 1997, The Perceptron algorithm versus Winnow: linear versus logarithmic mistake bounds when few input variables are relevant (Technical Note), Artificial Intelligence, 97, 325, 10.1016/S0004-3702(97)00039-8
Kohavi, 1995, The power of decision tables
Kohavi, 1997, Wrappers for feature subset selection, Artificial Intelligence, 97, 273, 10.1016/S0004-3702(97)00043-X
Kohavi, 1997, The utility of feature weighting in nearest-neighbor algorithms
Knobe, 1977, A method for inferring context-free grammars, Inform. and Control, 31, 129, 10.1016/S0019-9958(76)80003-4
Koller, 1996, Toward optimal feature selection
Kononenko, 1994, Estimating attributes: analysis and extensions of RELIEF
Kubat, 1993, Discovering patterns in EEG signals: comparative study of a few methods, 367
Kulkarni, 1990, Experimentation in machine discovery
Langley, 1993, Average-case analysis of a nearest neighbor algorithm, 889
Langley, 1994, Oblivious decision trees and abstract cases, 113
Langley, 1994, Induction of selective Bayesian classifiers, 399
Langley, 1997, Scaling to domains with many irrelevant features, Vol. 4
Lewis, 1992, Representation and learning in information retrieval
Lewis, 1992, Feature selection and feature extraction for text categorization, 212
Lewis, 1994, Heterogeneous uncertainty sampling, 148
Lin, 1992, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine Learning, 8, 293, 10.1007/BF00992699
Littlestone, 1988, Learning quickly when irrelevant attributes abound: a new linear threshold algorithm, Machine Learning, 2, 285, 10.1007/BF00116827
Littlestone, 1991, On-line learning of linear functions, 465
Littlestone, 1997, An apobayesian relative of Winnow, Vol. 9
Littlestone, 1994, The weighted majority algorithm, Inform. and Comput., 108, 212, 10.1006/inco.1994.1009
Lovász, 1992, On the randomized complexity of volume and diameter, 482
Lund, 1993, On the hardness of approximating minimization problems, 286
Matheus, 1989, Constructive induction on decision trees, 645
Michalski, 1980, Pattern recognition as rule-guided inductive inference, IEEE Trans. Pattern Anal. Machine Intell., 2, 349, 10.1109/TPAMI.1980.4767034
Minsky, 1969
Mitchell, 1982, Generalization as search, Artificial Intelligence, 18, 203, 10.1016/0004-3702(82)90040-6
Moore, 1994, Efficient algorithms for minimizing cross validation error, 190
Norton, 1989, Generating better decision trees, 800
Pazzani, 1992, A framework for the average case analysis of conjunctive learning algorithms, Machine Learning, 9, 349, 10.1007/BF00994111
Pagallo, 1990, Boolean feature discovery in empirical learning, Machine Learning, 5, 71, 10.1023/A:1022611825350
Quinlan, 1983, Learning efficient classification procedures and their application to chess end games
Quinlan, 1993
Rajamoney, 1990, A computational approach to theory revision
Rivest, 1993, Inference of finite automata using homing sequences, Inform. and Comput., 103, 299, 10.1006/inco.1993.1021
Rumelhart, 1986, Learning internal representations by error propagation, Vol. 1
Sammut, 1986, Learning concepts by asking questions, Vol. 2
Schapire, 1990, The strength of weak learnability, Machine Learning, 5, 197, 10.1007/BF00116037
Schlimmer, 1993, Efficiently inducing determinations: a complete and efficient search algorithm that uses optimal pruning, 284
Scott, 1991, Representation generation in an exploratory learning system
Seung, 1992, Query by committee, 287
Shen, 1989, Rule creation and rule learning through environmental exploration, 675
Sinclair, 1989, Approximate counting, uniform generation and rapidly mixing Markov chains, Inform. and Comput., 82, 93, 10.1016/0890-5401(89)90067-9
Singh, 1995, A comparison of induction algorithms for selective and non-selective Bayesian classifiers, 497
Singh, 1996, Efficient learning of selective Bayesian network classifiers
Skalak, 1994, Prototype and feature selection by sampling and random mutation hill-climbing algorithms, 293
Stanfill, 1987, Memory-based reasoning applied to English pronunciation, 577
Ting, 1994, Discretization of continuous-valued attributes and instance-based learning
Townsend-Weber, 1994, Instance-based prediction of continuous values, 30
Verbeurgt, 1990, Learning DNF under the uniform distribution in polynomial time, 314
Vere, 1975, Induction of concepts in the predicate calculus, 281
Vovk, 1990, Aggregating strategies, 371
Widrow, 1960, Adaptive switching circuits, 96
Winston, 1975, Learning structural descriptions from examples