Learning from data: a tutorial with emphasis on modern pattern recognition methods
Abstract
The purpose of this tutorial is twofold. First, it reviews the classical statistical learning scenario, highlighting its fundamental taxonomies and key aspects. Second, it introduces some modern ensemble methods developed within the machine learning field. The tutorial starts by placing supervised learning in the broader context of data analysis and by reviewing the classical pattern recognition methods: those based on class-conditional density estimation and Bayes' theorem, and those based on discriminant functions. The fundamental topic of complexity control is treated in some detail. Ensemble techniques have drawn considerable attention in recent years: a set of learning machines increases classification accuracy with respect to a single machine. Here, we introduce boosting, in which classifiers adaptively concentrate on the harder examples located near the classification boundary, and output coding, in which a set of independent two-class machines solves a multiclass problem. The first successful applications of these methods to data produced by the Pico-2 electronic nose (EN), developed at the University of Brescia, Brescia, Italy, are also briefly shown.
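To make the boosting idea concrete, the following is a minimal sketch of AdaBoost-style reweighting, one common boosting scheme; the abstract does not state which variant the tutorial uses, so this is an illustrative assumption. The 1-D inputs, the labels in {-1, +1}, the decision-stump base learner, and the names `train_stump`, `stump_predict`, `adaboost`, and `predict` are all hypothetical choices made for this example.

```python
import math

def stump_predict(x, threshold, polarity):
    # A decision stump: predicts `polarity` at or above the threshold, else the opposite.
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    # Exhaustively pick the stump with the lowest weighted error (assumed base learner).
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n          # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, so the next stump focuses on them.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    # Weighted vote of all stumps.
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate (positive, negative, positive).
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

On this toy set a single stump misclassifies at least two points, while the weighted vote of three stumps labels every training point correctly, which illustrates how reweighting pushes later classifiers toward the examples earlier ones got wrong.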
Keywords
#Tutorial #Pattern recognition #Sensor arrays #Data analysis #Electronic noses #Principal component analysis #Chemical sensors #Machine learning #Neural networks #Sensor phenomena and characterization