Learning from Imbalanced Data
Abstract
Keywords
References
drummond, 2003, C4.5, Class Imbalance, and Cost Sensitivity: Why Under-Sampling Beats Over-Sampling, Proc Int'l Conf Machine Learning Workshop Learning from Imbalanced Data Sets II
mease, 2007, Boosted Classification Trees and Class Probability/Quantile Estimation, J Machine Learning Research, 8, 409
chawla, 2003, C4.5 and Imbalanced Data Sets: Investigating the Effect of Sampling Method, Probabilistic Estimate, and Decision Tree Structure, Proc Int'l Conf Machine Learning Workshop Learning from Imbalanced Data Sets II
caruana, 2000, Learning from Imbalanced Data: Rank Metrics and Extra Tasks, Proc Am Assoc for Artificial Intelligence (AAAI) Conf, 51
laurikkala, 2001, Improving Identification of Difficult Small Classes by Balancing Class Distribution, Proc Conf AI in Medicine in Europe Artificial Intelligence Medicine, 63, 10.1007/3-540-48229-6_9
weiss, 2001, The Effect of Class Distribution on Classifier Learning: An Empirical Study
mitchell, 1997, Machine Learning
japkowicz, 2003, Class Imbalances: Are We Focusing on the Right Issue?, Proc Int'l Conf Machine Learning Workshop Learning from Imbalanced Data Sets II
prati, 2004, Class Imbalances versus Class Overlapping: An Analysis of a Learning System Behavior, Proc Mexican Conf Artif Intell, 312
weiss, 2003, Learning When Training Data Are Costly: The Effect of Class Distribution on Tree Induction, J Artificial Intelligence Research, 19, 315, 10.1613/jair.1199
japkowicz, 2002, The Class Imbalance Problem: A Systematic Study, Intelligent Data Analysis, 6, 429, 10.3233/IDA-2002-6504
bordes, 2005, Fast Kernel Classifiers with Online and Active Learning, J Machine Learning Research, 6, 1579
holte, 1989, Concept Learning and the Problem of Small Disjuncts, Proc Int’l Conf Artificial Intelligence, 813
provost, 2000, Machine Learning from Imbalanced Data Sets 101, Proc Learning from Imbalanced Data Sets Papers from the Am Assoc for Artificial Intelligence Workshop
maloof, 2003, Learning When Data Sets Are Imbalanced and When Costs Are Unequal and Unknown, Proc Int'l Conf Machine Learning Workshop Learning from Imbalanced Data Sets II
fan, 1999, AdaCost: Misclassification Cost-Sensitive Boosting, Proc Int’l Conf Machine Learning, 97
freund, 1996, Experiments with a New Boosting Algorithm, Proc Int’l Conf Machine Learning, 148
liu, 2006, The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study, Proc Int’l Conf Data Mining, 970
liu, 2006, Exploratory Under-Sampling for Class Imbalance Learning, Proc Int’l Conf Data Mining, 965
he, 2007, A Ranked Subspace Learning Method for Gene Expression Data Classification, Proc Int’l Conf Artificial Intelligence, 358
pearson, 2003, Imbalanced Clustering for Microarray Time-Series, Proc Int'l Conf Machine Learning Workshop Learning from Imbalanced Data Sets II
elkan, 2001, The Foundations of Cost-Sensitive Learning, Proc Int Joint Artif Intell Conf, 973
sun, 2006, Boosting for Learning Multiple Classes with Imbalanced Class Distribution, Proc Int’l Conf Data Mining, 592
chen, 2006, Efficient Classification of Multi-Label and Imbalanced Data Using Min-Max Modular Classifiers, Proc World Congress on Computational Intelligence—Int’l Joint Conf Neural Networks, 1770
tomek, 1976, Two Modifications of CNN, IEEE Trans System Man Cybernetics, 6, 769, 10.1109/TSMC.1976.4309452
he, 2008, ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning, Proc Int’l Joint Conf Neural Networks, 1322
chawla, 2003, SMOTEBoost: Improving Prediction of the Minority Class in Boosting, Proc Seventh European Conf Principles and Practice of Knowledge Discovery in Databases, 107
kubat, 1997, Addressing the Curse of Imbalanced Training Sets: One-Sided Selection, Proc Int’l Conf Machine Learning, 179
zhang, 2003, KNN Approach to Unbalanced Data Distributions: A Case Study Involving Information Extraction, Proc Int’l Conf Machine Learning (ICML ’2003) Workshop Learning from Imbalanced Data Sets
han, 2005, Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning, Proc Int’l Conf Intelligent Computing, 878
wang, 2004, Imbalanced Data Set Learning with Synthetic Samples, Proc IRIS Machine Learning Workshop
singla, 2005, Discriminative Training of Markov Logic Networks, Proc Int’l Conf Artificial Intelligence, 868
davis, 2005, View Learning for Statistical Relational Learning: With an Application to Mammography, Proc Int Joint Artif Intell Conf, 677
platt, 1999, Fast Training of Support Vector Machines Using Sequential Minimal Optimization, Advances in Kernel Methods Support Vector Learning, 185
fumera, 2002, Support Vector Machines with Embedded Reject Option, Proc Int'l Workshop Pattern Recognition with Support Vector Machines, 68, 10.1007/3-540-45665-1_6
holte, 2006, Cost Curves: An Improved Method for Visualizing Classifier Performance, Machine Learning, 65, 95, 10.1007/s10994-006-8199-5
wu, 2003, Class-Boundary Alignment for Imbalanced Data Set Learning, Proc Int’l Conf Data Mining (ICDM ’03) Workshop Learning from Imbalanced Data Sets II
holte, 2000, Explicitly Representing Expected Cost: An Alternative to ROC Representation, Proc Int'l Conf Knowledge Discovery and Data Mining, 198
akbani, 2004, Applying Support Vector Machines to Imbalanced Data Sets, Lecture Notes in Computer Science, 3201, 39, 10.1007/978-3-540-30115-8_7
2009, NIST Scientific and Technical Databases
he, 2008, IMORL: Incremental Multiple-Object Recognition and Localization, IEEE Trans Neural Networks, 19, 1727, 10.1109/TNN.2008.2001774
kang, 2006, EUS SVMs: Ensemble of Under-Sampled SVMs for Data Imbalance Problems, Lecture Notes in Computer Science, 4232, 837, 10.1007/11893028_93
liu, 2006, Boosting Prediction Accuracy on Imbalanced Data Sets with SVM Ensembles, Lecture Notes in Artificial Intelligence, 3918, 107
2009, UC Irvine Machine Learning Repository
zhu, 2007, Semi-Supervised Learning Literature Survey
mitchell, 1999, The Role of Unlabeled Data in Supervised Learning, Proc Int Colloquium on Cognitive Science
ting, 2000, A Comparative Study of Cost-Sensitive Boosting Algorithms, Proc Int’l Conf Machine Learning, 983
breiman, 1984, Classification and Regression Trees
maloof, 1997, Learning to Detect Rooftops in Aerial Images, Proc Image Understanding Workshop, 835
drummond, 2000, Exploiting the Cost (In)Sensitivity of Decision Tree Splitting Criteria, Proc Int’l Conf Machine Learning, 239
haykin, 1999, Neural Networks A Comprehensive Foundation
kukar, 1998, Cost-Sensitive Learning with Neural Networks, Proc European Conf Artificial Intelligence, 445
bennett, 1998, Semi-Supervised Support Vector Machines, Proc Conf Neural Information Processing Systems, 368
domingos, 1996, Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier, Proc Int’l Conf Machine Learning, 105
blum, 2001, Learning from Labeled and Unlabeled Data Using Graph Mincuts, Proc Int’l Conf Machine Learning, 19
kohavi, 1996, Bias Plus Variance Decomposition for Zero-One Loss Functions, Proc Int’l Conf Machine Learning
zhou, 2004, Semi-Supervised Learning on Directed Graphs, Proc Conf Neural Information Processing Systems, 1633
chawla, 2003, Workshop Learning from Imbalanced Data Sets II, Proc Int’l Conf Machine Learning
fujino, 2005, A Hybrid Generative/Discriminative Approach to Semi-Supervised Classifier Design, Proc Int’l Conf Artificial Intelligence, 764
japkowicz, 2000, Learning from Imbalanced Data Sets, Proc Am Assoc for Artificial Intelligence (AAAI) Workshop
miller, 1996, A Mixture of Experts Classifier with Learning Based on Both Labeled and Unlabelled Data, Proc Ann Conf Neural Information Processing Systems, 571
li, 2006, Hybrid Kernel Machine Ensemble for Imbalanced Data Sets, Proc Int’l Conf Pattern Recognition, 1108
zhuang, 2006, Parameter Optimization of Kernel-Based One-Class Classifier on Imbalance Text Learning, Lecture Notes in Artificial Intelligence, 4099, 434
manevitz, 2001, One-Class SVMs for Document Classification, J Machine Learning Research, 2, 139
liu, 2005, Total Margin Based Adaptive Fuzzy Support Vector Machines for Multiview Face Recognition, Proc Int Conf Systems Man and Cybernetics, 1704
doucette, 2008, GP Classification under Imbalanced Data Sets: Active Sub-Sampling AUC Approximation, Lecture Notes in Computer Science, 4971, 266, 10.1007/978-3-540-78671-9_23
zhu, 2007, Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem, Proc Joint Conf Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 783
japkowicz, 2000, Learning from Imbalanced Data Sets: A Comparison of Various Strategies, Proc Am Assoc for Artificial Intelligence (AAAI) Workshop Learning from Imbalanced Data Sets, 10
japkowicz, 1995, A Novelty Detection Approach to Classification, Proc Joint Conf Artificial Intelligence, 518
ertekin, 2007, Learning on the Border: Active Learning in Imbalanced Data Classification, Proc ACM Conf Information and Knowledge Management, 127
abe, 2003, Invited Talk: Sampling Approaches to Learning from Imbalanced Data Sets: Active Learning, Cost Sensitive Learning and Beyond, Proc Int'l Conf Machine Learning Workshop Learning from Imbalanced Data Sets II
zhou, 2006, On Multi-Class Cost-Sensitive Learning, Proc Int’l Conf Artificial Intelligence, 567
liu, 2006, Training Cost-Sensitive Neural Networks with Methods Addressing the Class Imbalance Problem, IEEE Trans Knowledge and Data Eng, 18, 63, 10.1109/TKDE.2006.17
tan, 2003, Multi-Class Protein Fold Classification Using a New Ensemble Machine Learning Approach, Genome Informatics, 14, 206
chawla, 2002, SMOTE: Synthetic Minority Over-Sampling Technique, J Artificial Intelligence Research, 16, 321, 10.1613/jair.953
provost, 1998, The Case against Accuracy Estimation for Comparing Induction Algorithms, Proc Int’l Conf Machine Learning, 445
tang, 2006, Granular SVM with Repetitive Undersampling for Highly Imbalanced Protein Homology Prediction, Proc Int’l Conf Granular Computing, 457
provost, 1997, Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions, Proc Int'l Conf Knowledge Discovery and Data Mining, 43
tang, 2005, Granular SVM-RFE Feature Selection Algorithm for Reliable Cancer-Related Gene Subsets Extraction on Microarray Gene Expression Data, Proc 2nd IEEE Bioinformatics Bioeng Symp, 290
clifton, 2004, Minority Report in Fraud Detection: Classification of Skewed Data, ACM SIGKDD Explorations Newsletter, 6, 50, 10.1145/1007730.1007738
tang, 2005, Granular Support Vector Machines Using Linear Decision Hyperplanes for Fast Medical Binary Classification, Proc Int’l Conf Fuzzy Systems, 138, 10.1109/FUZZY.2005.1452382
fawcett, 2003, ROC Graphs: Notes and Practical Considerations for Data Mining Researchers
chan, 1998, Toward Scalable Learning with Non-Uniform Class and Cost Distributions, Proc Int'l Conf Knowledge Discovery and Data Mining, 164
taguchi, 2001, The Mahalanobis-Taguchi System
provost, 2000, Well-Trained Pets: Improving Probability Estimation Trees
wu, 2004, Aligning Boundary in Kernel Space for Learning Imbalanced Data Set, Proc Int’l Conf Data Mining, 265
wu, 2003, Adaptive Feature-Space Conformal Transformation for Imbalanced-Data Learning, Proc Int’l Conf Machine Learning, 816