Learning from class-imbalanced data: Review of methods and applications
References
Abbasi, 2009, A comparison of fraud cues and classification methods for fake escrow website detection, Information Technology and Management, 10, 83, 10.1007/s10799-009-0059-0
Abeysinghe, 2016, A Classifier Hub for Imbalanced Financial Data
Al-Ghraibah, 2015, A Study of Feature Selection of Magnetogram Complexity Features in an Imbalanced Solar Flare Prediction Data-set
Alfaro, 2008, Bankruptcy forecasting: An empirical comparison of AdaBoost and neural networks, Decision Support Systems, 45, 110, 10.1016/j.dss.2007.12.002
Ali, 2016, Can-CSC-GBE: Developing Cost-sensitive Classifier with Gentleboost Ensemble for breast cancer classification using protein amino acids and imbalanced data, Computers in biology and medicine, 73, 38, 10.1016/j.compbiomed.2016.04.002
Alibeigi, 2012, DBFS: An effective Density Based Feature Selection scheme for small sample size and high dimensional imbalanced data sets, Data & Knowledge Engineering, 81, 67, 10.1016/j.datak.2012.08.001
Alshomrani, 2015, A proposal for evolutionary fuzzy systems using feature weighting: Dealing with overlapping in imbalanced datasets, Knowledge-Based Systems, 73, 1, 10.1016/j.knosys.2014.09.002
Alsulaiman, 2012, Identity verification based on haptic handwritten signatures: Genetic programming with unbalanced data
Anand, 2010, An approach for classification of highly imbalanced data using weighting and undersampling, Amino acids, 39, 1385, 10.1007/s00726-010-0595-2
Anderson, 2012, Governing events and life: 'Emergency' in UK Civil Contingencies, Political Geography, 31, 24, 10.1016/j.polgeo.2011.09.002
Ando, 2015, Classifying imbalanced data in distance-based feature space, Knowledge and Information Systems, 1
Ashkezari, 2013, Application of fuzzy support vector machine for determining the health index of the insulation system of in-service power transformers, Dielectrics and Electrical Insulation, IEEE Transactions on, 20, 965, 10.1109/TDEI.2013.6518966
Azaria, 2014, Behavioral Analysis of Insider Threat: A Survey and Bootstrapped Prediction in Imbalanced Data, Computational Social Systems, IEEE Transactions on, 1, 135, 10.1109/TCSS.2014.2377811
Bae, 2015, Polyp Detection via Imbalanced Learning and Discriminative Feature Learning, Medical Imaging, IEEE Transactions on, 34, 2379, 10.1109/TMI.2015.2434398
Bagherpour, 2016, FIR as Classifier in the Presence of Imbalanced Data
Bahnsen, 2013, Cost sensitive credit card fraud detection using Bayes minimum risk
Bao, 2016, ACID: association correction for imbalanced data in GWAS, IEEE/ACM Transactions on Computational Biology and Bioinformatics
Bao, 2016, Boosted Near-miss Under-sampling on SVM ensembles for concept detection in large-scale imbalanced datasets, Neurocomputing, 172, 198, 10.1016/j.neucom.2014.05.096
Beyan, 2015, Classifying imbalanced data sets using similarity based hierarchical decomposition, Pattern Recognition, 48, 1653, 10.1016/j.patcog.2014.10.032
Blagus, 2013, SMOTE for high-dimensional class-imbalanced data, BMC bioinformatics, 14, 1
Błaszczyński, 2016, Diversity Analysis on Imbalanced Data Using Neighbourhood and Roughly Balanced Bagging Ensembles
Bogina, 2016, Learning Item Temporal Dynamics for Predicting Buying Sessions
Wang, 2016, Online Bagging and Boosting for Imbalanced Data Streams, IEEE Transactions on Knowledge and Data Engineering, 28, 3353, 10.1109/TKDE.2016.2609424
Branco, 2016, A Survey of Predictive Modeling on Imbalanced Domains, ACM Computing Surveys (CSUR), 49, 10.1145/2907070
Braytee, 2016, A Cost-Sensitive Learning Strategy for Feature Extraction from Imbalanced Data
Brekke, 2008, Classifiers and confidence estimation for oil spill detection in ENVISAT ASAR images, Geoscience and Remote Sensing Letters, IEEE, 5, 65, 10.1109/LGRS.2007.907174
Bria, 2012, A ranking-based cascade approach for unbalanced data
Brown, 2012, An experimental comparison of classification algorithms for imbalanced credit scoring data sets, Expert Systems with Applications, 39, 3446, 10.1016/j.eswa.2011.09.033
Cao, 2013, Integrated oversampling for imbalanced time series classification, Knowledge and Data Engineering, IEEE Transactions on, 25, 2809, 10.1109/TKDE.2013.37
Cao, 2014, A parsimonious mixture of Gaussian trees model for oversampling in imbalanced and multimodal time-series classification, Neural Networks and Learning Systems, IEEE Transactions on, 25, 2226, 10.1109/TNNLS.2014.2308321
Cao, 2002, Projective ART for clustering data sets in high dimensional spaces, Neural Networks, 15, 105, 10.1016/S0893-6080(01)00108-3
Casañola-Martin, 2016, Exploring different strategies for imbalanced ADME data problem: case study on Caco-2 permeability modeling, Molecular diversity, 20, 93, 10.1007/s11030-015-9649-4
Castro, 2013, Novel cost-sensitive approach to improve the multilayer perceptron performance on imbalanced data, Neural Networks and Learning Systems, IEEE Transactions on, 24, 888, 10.1109/TNNLS.2013.2246188
Cateni, 2014, A method for resampling imbalanced datasets in binary classification tasks for real-world problems, Neurocomputing, 135, 32, 10.1016/j.neucom.2013.05.059
Cerf, 2013, Parameter-free classification in multi-class imbalanced data sets, Data & Knowledge Engineering, 87, 109, 10.1016/j.datak.2013.06.001
Chang, 2012, A cost-effective method for early fraud detection in online auctions
Chawla, 2002, SMOTE: synthetic minority over-sampling technique, Journal of artificial intelligence research, 16, 321, 10.1613/jair.953
Chen, 2006, Efficient classification of multi-label and imbalanced data using min-max modular classifiers
Chen, 2010, RAMOBoost: ranked minority oversampling in boosting, Neural Networks, IEEE Transactions on, 21, 1624, 10.1109/TNN.2010.2066988
Chen, 2008, Fast: a roc-based feature selection metric for small samples and imbalanced data classification problems
Chen, 2016, An empirical study of a hybrid imbalanced-class DT-RST classification procedure to elucidate therapeutic effects in uremia patients, Medical & biological engineering & computing, 54, 983, 10.1007/s11517-016-1482-0
Chen, 2012, A hierarchical multiple kernel support vector machine for customer churn prediction using longitudinal behavioral data, European Journal of Operational Research, 223, 461, 10.1016/j.ejor.2012.06.040
Cheng, 2016, Cost-Sensitive Large margin Distribution Machine for classification of imbalanced data, Pattern Recognition Letters, 80, 107, 10.1016/j.patrec.2016.06.009
Cheng, 2015, Affective detection based on an imbalanced fuzzy support vector machine, Biomedical Signal Processing and Control, 18, 118, 10.1016/j.bspc.2014.12.006
Cheng, 2009, A data-driven approach to manage the length of stay for appendectomy patients, Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on, 39, 1339, 10.1109/TSMCA.2009.2025510
Chetchotsak, 2015, Integrating new data balancing technique with committee networks for imbalanced data: GRSOM approach, Cognitive neurodynamics, 9, 627, 10.1007/s11571-015-9350-4
D'Este, 2014, Ensemble aggregation methods for relocating models of rare events, Engineering Applications of Artificial Intelligence, 34, 58, 10.1016/j.engappai.2014.05.007
D'Addabbo, 2015, Parallel selective sampling method for imbalanced and large data classification, Pattern Recognition Letters, 62, 61, 10.1016/j.patrec.2015.05.008
da Silva, 2011, PCA and Gaussian noise in MLP neural network training improve generalization in problems with small and unbalanced data sets
Dai, 2015, Imbalanced Protein Data Classification Using Ensemble FTM-SVM, NanoBioscience, IEEE Transactions on, 14, 350, 10.1109/TNB.2015.2431292
Dal Pozzolo, 2015, Credit card fraud detection and concept-drift adaptation with delayed supervised information
Das, 2015, RACOG and wRACOG: Two Probabilistic Oversampling Techniques, Knowledge and Data Engineering, IEEE Transactions on, 27, 222, 10.1109/TKDE.2014.2324567
Datta, 2015, Near-Bayesian Support Vector Machines for imbalanced data classification with equal or unequal misclassification costs, Neural Networks, 70, 39, 10.1016/j.neunet.2015.06.005
de Souza, 2016, Recent advances for handling imbalancement and uncertainty in labelling in medicinal chemistry data analysis
del Río, 2014, On the use of MapReduce for imbalanced big data using random forest, Information Sciences, 285, 112, 10.1016/j.ins.2014.03.043
Denil, 2010, Overlap versus Imbalance
Díez-Pastor, 2015, Random balance: ensembles of variable priors classifiers for imbalanced data, Knowledge-Based Systems, 85, 96, 10.1016/j.knosys.2015.04.022
Díez-Pastor, 2015, Diversity techniques improve the performance of the best imbalance learning ensembles, Information Sciences, 325, 98, 10.1016/j.ins.2015.07.025
Ditzler, 2013, Incremental learning of concept drift from streaming imbalanced data, Knowledge and Data Engineering, IEEE Transactions on, 25, 2283, 10.1109/TKDE.2012.136
Dong, 2016, Semi-supervised classification method through oversampling and common hidden space, Information Sciences, 349, 216, 10.1016/j.ins.2016.02.042
Drown, 2009, Evolutionary sampling and software quality modeling of high-assurance systems, Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on, 39, 1097, 10.1109/TSMCA.2009.2020804
Duan, 2016, A new support vector data description method for machinery fault diagnosis with unbalanced datasets, Expert Systems with Applications, 64, 239, 10.1016/j.eswa.2016.07.039
Duan, 2016, Support vector data description for machinery multi-fault classification with unbalanced datasets
Dubey, 2014, Analysis of sampling techniques for imbalanced data: An n=648 ADNI study, NeuroImage, 87, 220, 10.1016/j.neuroimage.2013.10.005
Engen, 2008, Enhancing network based intrusion detection for imbalanced data, International Journal of Knowledge-Based and Intelligent Engineering Systems, 12, 357
Escudeiro, 2012, D-Confidence: an active learning strategy to reduce label disclosure complexity in the presence of imbalanced class distributions, Journal of the Brazilian Computer Society, 18, 311, 10.1007/s13173-012-0069-3
Fabris, 2009, Novel approaches for detecting frauds in energy consumption
Fahimnia, 2015, Quantitative models for managing supply chain risks: A review, European Journal of Operational Research, 247, 1, 10.1016/j.ejor.2015.04.034
Fan, 2016, Probability Model Selection and Parameter Evolutionary Estimation for Clustering Imbalanced Data without Sampling, Neurocomputing, 10.1016/j.neucom.2015.10.140
Farvaresh, 2011, A data mining framework for detecting subscription fraud in telecommunication, Engineering Applications of Artificial Intelligence, 24, 182, 10.1016/j.engappai.2010.05.009
Fernández, 2010, Multi-class imbalanced data-sets with linguistic fuzzy rule based classification systems based on pairwise learning, 89
Fernández, 2010, On the 2-tuples based genetic tuning performance for fuzzy rule based classification systems in imbalanced data-sets, Information Sciences, 180, 1268, 10.1016/j.ins.2009.12.014
Fernández, 2013, Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches, Knowledge-Based Systems, 42, 97, 10.1016/j.knosys.2013.01.018
Ferri, 2011, A coherent interpretation of AUC as a measure of aggregated classification performance
Folino, 2016, An Incremental Ensemble Evolved by using Genetic Programming to Efficiently Detect Drifts in Cyber Security Datasets
Frasca, 2013, A neural network algorithm for semi-supervised node label learning from unbalanced data, Neural Networks, 43, 84, 10.1016/j.neunet.2013.01.021
Freund, 1996, Experiments with a new boosting algorithm
Freund, 1997, A decision-theoretic generalization of on-line learning and an application to boosting, Journal of computer and system sciences, 55, 119, 10.1006/jcss.1997.1504
Friedman, 2001, Greedy function approximation: a gradient boosting machine, Annals of statistics, 29, 1189
Fu, 2013, Certainty-based active learning for sampling imbalanced datasets, Neurocomputing, 119, 350, 10.1016/j.neucom.2013.03.023
Galar, 2012, A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches, Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 42, 463, 10.1109/TSMCC.2011.2161285
Galar, 2013, EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling, Pattern Recognition, 46, 3460, 10.1016/j.patcog.2013.05.006
Gao, 2014, Construction of neurofuzzy models for imbalanced data classification, Fuzzy Systems, IEEE Transactions on, 22, 1472, 10.1109/TFUZZ.2013.2296091
Gao, 2016, Adaptive weighted imbalance learning with application to abnormal activity recognition, Neurocomputing, 173, 1927, 10.1016/j.neucom.2015.09.064
García, 2012, Surrounding neighborhood-based SMOTE for learning from imbalanced data sets, Progress in Artificial Intelligence, 1, 347, 10.1007/s13748-012-0027-5
Garcia-Pedrajas, 2015, A Proposal for Local k Values for k-Nearest Neighbor Rule, IEEE transactions on neural networks and learning systems
García-Pedrajas, 2013, Boosting for class-imbalanced datasets using genetically evolved supervised non-linear projections, Progress in Artificial Intelligence, 2, 29, 10.1007/s13748-012-0028-4
Ghazikhani, 2013, Ensemble of online neural networks for non-stationary and imbalanced data streams, Neurocomputing, 122, 535, 10.1016/j.neucom.2013.05.003
Ghazikhani, 2013, Online cost-sensitive neural network classifiers for non-stationary and imbalanced data streams, Neural Computing and Applications, 23, 1283, 10.1007/s00521-012-1071-6
Ghazikhani, 2014, Online neural network model for non-stationary and imbalanced data stream classification, International Journal of Machine Learning and Cybernetics, 5, 51, 10.1007/s13042-013-0180-6
Gong, 2012, A Kolmogorov–Smirnov statistic based segmentation approach to learning from imbalanced datasets: With application in property refinance prediction, Expert Systems with Applications, 39, 6192, 10.1016/j.eswa.2011.12.011
Govindan, 2016, ELECTRE: A comprehensive literature review on methodologies and applications, European Journal of Operational Research, 250, 1, 10.1016/j.ejor.2015.07.019
Gu, 2009, Evaluation measures of the classification performance of imbalanced data sets
Guo, 2016, BPSO-Adaboost-KNN ensemble learning algorithm for multi-class imbalanced data classification, Engineering Applications of Artificial Intelligence, 49, 176, 10.1016/j.engappai.2015.09.011
Guyon, 2003, An introduction to variable and feature selection, The Journal of Machine Learning Research, 3, 1157
Ha, 2016, A New Under-Sampling Method Using Genetic Algorithm for Imbalanced Data Classification
Hajian, 2011, Discrimination prevention in data mining for intrusion and crime detection
Hand, 2009, Measuring classifier performance: a coherent alternative to the area under the ROC curve, Machine learning, 77, 103, 10.1007/s10994-009-5119-5
Hand, 2001, A simple generalisation of the area under the ROC curve for multiple class classification problems, Machine learning, 45, 171, 10.1023/A:1010920819831
Hao, 2014, An efficient algorithm coupled with synthetic minority over-sampling technique to classify imbalanced PubChem BioAssay data, Analytica chimica acta, 806, 117, 10.1016/j.aca.2013.10.050
Hartmann, 2004, Dimension reduction vs. variable selection, Applied Parallel Computing, 931
Hassan, 2016, Modeling insurance fraud detection using imbalanced data classification, 117
He, 2009, Learning from imbalanced data, Knowledge and Data Engineering, IEEE Transactions on, 21, 1263, 10.1109/TKDE.2008.239
Herndon, 2016, A Study of Domain Adaptation Classifiers Derived From Logistic Regression for the Task of Splice Site Prediction, IEEE transactions on nanobioscience, 15, 75, 10.1109/TNB.2016.2522400
Hilas, 2008, An application of supervised and unsupervised learning approaches to telecommunications fraud detection, Knowledge-Based Systems, 21, 721, 10.1016/j.knosys.2008.03.026
Hoens, 2012, Learning from streaming data with concept drift and imbalance: an overview, Progress in Artificial Intelligence, 1, 89, 10.1007/s13748-011-0008-0
Hong, 2007, A kernel-based two-class classifier for imbalanced data sets, Neural Networks, IEEE Transactions on, 18, 28, 10.1109/TNN.2006.882812
Hu, 2009, MSMOTE: improving classification performance when training data is imbalanced
Huang, 2006, Extreme learning machine: theory and applications, Neurocomputing, 70, 489, 10.1016/j.neucom.2005.12.126
Huang, 2006, Imbalanced learning with a biased minimax probability machine, Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 36, 913, 10.1109/TSMCB.2006.870610
Huang, 2016, Cost-sensitive sparse linear regression for crowd counting with imbalanced training data
Jacques, 2015, Conception of a dominance-based multi-objective local search in the context of classification rule mining in large and imbalanced data sets, Applied Soft Computing, 34, 705, 10.1016/j.asoc.2015.06.002
Jeni, 2013, Facing Imbalanced Data–Recommendations for the Use of Performance Metrics
Jian, 2016, A new sampling method for classifying imbalanced data based on support vector machine ensemble, Neurocomputing, 193, 115, 10.1016/j.neucom.2016.02.006
Jin, 2014, Weighted local and global regressive mapping: A new manifold learning method for machine fault classification, Engineering Applications of Artificial Intelligence, 30, 118, 10.1016/j.engappai.2014.01.014
Jo, 2004, Class imbalances versus small disjuncts, ACM SIGKDD Explorations Newsletter, 6, 40, 10.1145/1007730.1007737
Kim, 2012, Classification cost: An empirical comparison among traditional classifier, Cost-Sensitive Classifier, and MetaCost, Expert Systems with Applications, 39, 4013, 10.1016/j.eswa.2011.09.071
Kim, 2016, Ordinal Classification of Imbalanced Data with Application in Emergency and Disaster Information Services, IEEE Intelligent Systems, 31, 50, 10.1109/MIS.2016.27
King, 2001, Logistic regression in rare events data, Political analysis, 9, 137, 10.1093/oxfordjournals.pan.a004868
Kirlidog, 2012, A fraud detection approach with data mining in health insurance, Procedia-Social and Behavioral Sciences, 62, 989, 10.1016/j.sbspro.2012.09.168
Krawczyk, 2016, Evolutionary undersampling boosting for imbalanced classification of breast cancer malignancy, Applied Soft Computing, 38, 714, 10.1016/j.asoc.2015.08.060
Krawczyk, 2013, An improved ensemble approach for imbalanced classification problems
Krawczyk, 2014, Cost-sensitive decision tree ensembles for effective imbalanced classification, Applied Soft Computing, 14, 554, 10.1016/j.asoc.2013.08.014
Krivko, 2010, A hybrid model for plastic card fraud detection systems, Expert Systems with Applications, 37, 6070, 10.1016/j.eswa.2010.02.119
Kumar, 2014, Undersampled K-means approach for handling imbalanced distributed data, Progress in Artificial Intelligence, 3, 29, 10.1007/s13748-014-0045-6
Kwak, 2015, An Incremental Clustering-Based Fault Detection Algorithm for Class-Imbalanced Process Data, Semiconductor Manufacturing, IEEE Transactions on, 28, 318, 10.1109/TSM.2015.2445380
Lan, 2009, A joint investigation of misclassification treatments and imbalanced datasets on neural network performance, Neural Computing and Applications, 18, 689, 10.1007/s00521-009-0239-1
Lane, 2012, On developing robust models for favourability analysis: Model choice, feature sets and imbalanced data, Decision Support Systems, 53, 712, 10.1016/j.dss.2012.05.028
Lerner, 2007, On the classification of a small imbalanced cytogenetic image database, Computational Biology and Bioinformatics, IEEE/ACM Transactions on, 4, 204, 10.1109/TCBB.2007.070207
Lessmann, 2009, A reference model for customer-centric data mining with support vector machines, European Journal of Operational Research, 199, 520, 10.1016/j.ejor.2008.12.017
Li, 2015, Financial fraud detection by using Grammar-based multi-objective genetic programming with ensemble learning
Li, 2015, Improving the classification performance of biological imbalanced datasets by swarm optimization algorithms, The Journal of Supercomputing, 1
Li, 2016, Adaptive Swarm Balancing Algorithms for rare-event prediction in imbalanced healthcare data, Computerized Medical Imaging and Graphics, 10.1016/j.compmedimag.2016.05.001
Li, 2014, Boosting weighted ELM for imbalanced learning, Neurocomputing, 128, 15, 10.1016/j.neucom.2013.05.051
Li, 2009, Protein-protein interaction extraction from biomedical literatures based on modified SVM-KNN
Li, 2013, Constructing support vector machine ensemble with segmentation for imbalanced datasets, Neural Computing and Applications, 22, 249, 10.1007/s00521-012-1041-z
Li, 2016, An Imbalanced Learning based MDR-TB Early Warning System, Journal of medical systems, 40, 1, 10.1007/s10916-016-0517-2
Li, 2013, Classification of tongue coating using Gabor and Tamura features on unbalanced data set
Li, 2016, Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data, Knowledge-Based Systems, 94, 88, 10.1016/j.knosys.2016.09.014
Liang, 2012, The K-Means-Type Algorithms Versus Imbalanced Data Distributions, Fuzzy Systems, IEEE Transactions on, 20, 728, 10.1109/TFUZZ.2011.2182354
Liao, 2008, Classification of weld flaws with imbalanced class data, Expert Systems with Applications, 35, 1041, 10.1016/j.eswa.2007.08.044
Lima, 2015, A Fraud Detection Model Based on Feature Selection and Undersampling Applied to Web Payment Systems
Lin, 2013, Dynamic sampling approach to training neural networks for multiclass imbalance classification, Neural Networks and Learning Systems, IEEE Transactions on, 24, 647, 10.1109/TNNLS.2012.2228231
Lin, 2013, Multiple extreme learning machines for a two-class imbalance corporate life cycle prediction, Knowledge-Based Systems, 39, 214, 10.1016/j.knosys.2012.11.003
Liu, 2014, Risk scoring for prediction of acute cardiac complications from imbalanced clinical data, Biomedical and Health Informatics, IEEE Journal of, 18, 1894, 10.1109/JBHI.2014.2303481
Liu, 2009, Exploratory undersampling for class-imbalance learning, Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 39, 539, 10.1109/TSMCB.2008.2007853
López, 2015, Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data, Fuzzy Sets and Systems, 258, 5, 10.1016/j.fss.2014.01.015
López, 2013, An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics, Information Sciences, 250, 113, 10.1016/j.ins.2013.07.007
López, 2012, Analysis of preprocessing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics, Expert Systems with Applications, 39, 6585, 10.1016/j.eswa.2011.12.043
Loyola-González, 2016, Study of the impact of resampling methods for contrast pattern based classifiers in imbalanced databases, Neurocomputing, 175, 935, 10.1016/j.neucom.2015.04.120
Lu, 2016, A Classification Method of Imbalanced Data Base on PSO Algorithm
Lu, 2008, Ground-level ozone prediction by support vector machine approach with a cost-sensitive classification scheme, Science of the Total Environment, 395, 109, 10.1016/j.scitotenv.2008.01.035
Lusa, 2010, Class prediction for high-dimensional class-imbalanced data, BMC bioinformatics, 11, 523, 10.1186/1471-2105-11-523
Lusa, 2016, Gradient boosting for high-dimensional prediction of rare events, Computational Statistics & Data Analysis
Maalouf, 2014, Weighted logistic regression for large-scale imbalanced and rare events data, Knowledge-Based Systems, 59, 142, 10.1016/j.knosys.2014.01.012
Maalouf, 2011, Robust weighted kernel logistic regression in imbalanced and rare events data, Computational Statistics & Data Analysis, 55, 168, 10.1016/j.csda.2010.06.014
Maldonado, 2014, Imbalanced data classification using second-order cone programming support vector machines, Pattern Recognition, 47, 2070, 10.1016/j.patcog.2013.11.021
Maldonado, 2014, Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines, Information Sciences, 286, 228, 10.1016/j.ins.2014.07.015
Mandadi, 2013, Unusual event detection using sparse spatio-temporal features and bag of words model
Mao, 2017, Online sequential prediction of bearings imbalanced fault diagnosis by extreme learning machine, Mechanical Systems and Signal Processing, 83, 450, 10.1016/j.ymssp.2016.06.024
Mao, 2016, Two-Stage Hybrid Extreme Learning Machine for Sequential Imbalanced Data, Volume 1, 423
Maratea, 2014, Adjusted F-measure and kernel scaling for imbalanced data learning, Information Sciences, 257, 331, 10.1016/j.ins.2013.04.016
Mardani, 2013, A new method for occupational fraud detection in process aware information systems
Márquez-Vera, 2013, Predicting student failure at school using genetic programming and different data mining approaches with high dimensional and imbalanced data, Applied intelligence, 38, 315, 10.1007/s10489-012-0374-8
Maurya, 2015, Online anomaly detection via class-imbalance learning
Maurya, 2016, Online sparse class imbalance learning on big data, Neurocomputing, 10.1016/j.neucom.2016.07.040
Menardi, 2014, Training and assessing classification rules with imbalanced data, Data Mining and Knowledge Discovery, 28, 92, 10.1007/s10618-012-0295-5
Mikolov, 2013, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781
Mirza, 2015, Voting based weighted online sequential extreme learning machine for imbalance multi-class classification
Mirza, 2015, Ensemble of subset online sequential extreme learning machine for class imbalance and concept drift, Neurocomputing, 149, 316, 10.1016/j.neucom.2014.03.075
Mirza, 2013, Weighted online sequential extreme learning machine for class imbalance learning, Neural processing letters, 38, 465, 10.1007/s11063-013-9286-9
Moepya, 2014, Applying Cost-Sensitive Classification for Financial Fraud Detection under High Class-Imbalance
Moreo, 2016, Distributional Random Oversampling for Imbalanced Text Classification
Motoda, 2002, Feature selection, extraction and construction, Vol 5, 67
Nagi, 2008, Detection of abnormalities and electricity theft using genetic support vector machines
Napierała, 2015, Types of minority class examples and their influence on learning classifiers from imbalanced data, Journal of Intelligent Information Systems, 1
Napierała, 2015, Addressing imbalanced data with argument based rule learning, Expert Systems with Applications, 42, 9468, 10.1016/j.eswa.2015.07.076
Natwichai, 2005, Hiding classification rules for data sharing with privacy preservation, 468
Nekooeimehr, 2016, Adaptive semi-unsupervised weighted oversampling (A-SUWO) for imbalanced datasets, Expert Systems with Applications, 46, 405, 10.1016/j.eswa.2015.10.031
Ng, 2016, Dual autoencoders features for imbalance classification problem, Pattern Recognition, 60, 875, 10.1016/j.patcog.2016.06.013
Niehaus, 2014, MVPA to enhance the study of rare cognitive events: An investigation of experimental PTSD
Oh, 2011, Ensemble learning with active example selection for imbalanced biomedical data classification, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 8, 316, 10.1109/TCBB.2010.96
Oh, 2011, Error back-propagation algorithm for classification of imbalanced data, Neurocomputing, 74, 1058, 10.1016/j.neucom.2010.11.024
Olszewski, 2012, A probabilistic approach to fraud detection in telecommunications, Knowledge-Based Systems, 26, 246, 10.1016/j.knosys.2011.08.018
Pai, 2011, A support vector machine-based model for detecting top management fraud, Knowledge-Based Systems, 24, 314, 10.1016/j.knosys.2010.10.003
Pan, 2011, Soft margin keyframe comparison: Enhancing precision of fraud detection in retail surveillance
Panigrahi, 2009, Credit card fraud detection: A fusion approach using Dempster–Shafer theory and Bayesian learning, Information Fusion, 10, 354, 10.1016/j.inffus.2008.04.001
Park, 2014, Ensembles of α-Trees for Imbalanced Classification Problems, Knowledge and Data Engineering, IEEE Transactions on, 26, 131, 10.1109/TKDE.2012.255
Peng, 2014, Ensemble-based hybrid probabilistic sampling for imbalanced data learning in lung nodule CAD, Computerized Medical Imaging and Graphics, 38, 137, 10.1016/j.compmedimag.2013.12.003
Pérez-Godoy, 2010, Analysis of an evolutionary RBFN design algorithm, CO2RBFN, for imbalanced data sets, Pattern Recognition Letters, 31, 2375, 10.1016/j.patrec.2010.07.010
Phoungphol, 2012, Robust multiclass classification for learning from imbalanced biomedical data, Tsinghua Science and technology, 17, 619, 10.1109/TST.2012.6374363
Prusa, 2016, Enhancing Ensemble Learners with Data Sampling on High-Dimensional Imbalanced Tweet Sentiment Data
Raj, 2016, Towards effective classification of imbalanced data with convolutional neural networks
Ramentol, 2015, IFROWANN: imbalanced fuzzy-rough ordered weighted average nearest neighbor classification, Fuzzy Systems, IEEE Transactions on, 23, 1622, 10.1109/TFUZZ.2014.2371472
Raposo, 2016, Lopinavir Resistance Classification with Imbalanced Data Using Probabilistic Neural Networks, Journal of medical systems, 40, 1, 10.1007/s10916-015-0428-7
Razavian, 2014, CNN features off-the-shelf: an astounding baseline for recognition
Ren, 2016, Ensemble based adaptive over-sampling method for imbalanced data learning in computer aided detection of microaneurysm, Computerized Medical Imaging and Graphics
Ren, 2016, Influential factors of red-light running at signalized intersection and prediction using a rare events logistic regression model, Accident Analysis & Prevention, 95, 266, 10.1016/j.aap.2016.07.017
Richardson, 2013, Infection status outcome, machine learning method and virus type interact to affect the optimised prediction of hepatitis virus immunoassay results from routine pathology laboratory assays in unbalanced data, BMC bioinformatics, 14, 1, 10.1093/bib/bbs007
Rodriguez, 2014, Preliminary comparison of techniques for dealing with imbalance in software defect prediction
Saeys, 2007, A review of feature selection techniques in bioinformatics, Bioinformatics, 23, 2507, 10.1093/bioinformatics/btm344
Sáez, 2015, SMOTE–IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering, Information Sciences, 291, 184, 10.1016/j.ins.2014.08.051
Sahin, 2013, A cost-sensitive decision tree approach for fraud detection, Expert Systems with Applications, 40, 5916, 10.1016/j.eswa.2013.05.021
Sanz, 2015, A compact evolutionary interval-valued fuzzy rule-based classification system for the modeling and prediction of real-world financial applications with imbalanced data, Fuzzy Systems, IEEE Transactions on, 23, 973, 10.1109/TFUZZ.2014.2336263
Schapire, 1999, Improved boosting algorithms using confidence-rated predictions, Machine learning, 37, 297, 10.1023/A:1007614523901
Seiffert, 2010, RUSBoost: A hybrid approach to alleviating class imbalance, Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on, 40, 185, 10.1109/TSMCA.2009.2029559
Shao, 2014, An efficient weighted Lagrangian twin support vector machine for imbalanced data classification, Pattern Recognition, 47, 3158, 10.1016/j.patcog.2014.03.008
Song, 2016, A bi-directional sampling based on K-means method for imbalance text classification
Song, 2014, nDNA-prot: identification of DNA-binding proteins based on unbalanced classification, BMC bioinformatics, 15, 1, 10.1186/1471-2105-15-298
Su, 2007, An evaluation of the robustness of MTS for imbalanced data, IEEE Transactions on Knowledge and Data Engineering, 19, 1321, 10.1109/TKDE.2007.190623
Subudhi, 2015, Quarter-Sphere Support Vector Machine for Fraud Detection in Mobile Telecommunication Networks, Procedia Computer Science, 48, 353, 10.1016/j.procs.2015.04.193
Sultana, 2012, Enhancing the performance of decision tree: A research study of dealing with unbalanced data
Sun, 2010, Algorithms for rare event analysis in nano-CMOS circuits using statistical blockade
Sun, 2006, Boosting for learning multiple classes with imbalanced class distribution
Sun, 2007, Cost-sensitive boosting for classification of imbalanced data, Pattern Recognition, 40, 3358, 10.1016/j.patcog.2007.04.009
Sun, 2009, Classification of imbalanced data: A review, International Journal of Pattern Recognition and Artificial Intelligence, 23, 687, 10.1142/S0218001409007326
Sun, 2015, A novel ensemble method for classifying imbalanced data, Pattern Recognition, 48, 1623, 10.1016/j.patcog.2014.11.014
Tahir, 2009, A multiple expert approach to the class imbalance problem using inverse random under sampling, 82
Tajik, 2015, Gas turbine shaft unbalance fault detection by using vibration data and neural networks
Tan, 2015, Online defect prediction for imbalanced data, Volume 2
Tan, 2015, Evolutionary fuzzy ARTMAP neural networks for classification of semiconductor defects, Neural Networks and Learning Systems, IEEE Transactions on, 26, 933, 10.1109/TNNLS.2014.2329097
Taneja, 2015, Prediction of click frauds in mobile advertising
Tian, 2011, Imbalanced classification using support vector machine ensemble, Neural Computing and Applications, 20, 203, 10.1007/s00521-010-0349-9
Tomek, 1976, A generalization of the k-NN rule, Systems, Man and Cybernetics, IEEE Transactions on, 121, 10.1109/TSMC.1976.5409182
Topouzelis, 2008, Oil spill detection by SAR images: dark formation detection, feature extraction and classification algorithms, Sensors, 8, 6642, 10.3390/s8106642
Trafalis, 2014, Machine-learning classifiers for imbalanced tornado data, Computational Management Science, 11, 403, 10.1007/s10287-013-0174-6
Tsai, 2009, Forecasting of ozone episode days by cost-sensitive neural network methods, Science of the Total Environment, 407, 2124, 10.1016/j.scitotenv.2008.12.007
Vajda, 2010, Strategies for training robust neural network based digit recognizers on unbalanced data sets
Vani, 2014, Multiclass unbalanced protein data classification using sequence features
Verbeke, 2012, New insights into churn prediction in the telecommunication sector: A profit driven data mining approach, European Journal of Operational Research, 218, 211, 10.1016/j.ejor.2011.09.031
Vigneron, 2015, A multi-scale seriation algorithm for clustering sparse imbalanced data: application to spike sorting, Pattern Analysis and Applications, 1
Vluymans, 2015, Fuzzy rough classifiers for class imbalanced multi-instance data, Pattern Recognition
Vo, 2007, Classification of unbalanced medical data with weighted regularized least squares
Voigt, 2014, Threshold optimization for classification in imbalanced data in a problem of gamma-ray astronomy, Advances in Data Analysis and Classification, 8, 195, 10.1007/s11634-014-0167-5
Vong, 2015, Imbalanced Learning for Air Pollution by Meta-Cognitive Online Sequential Extreme Learning Machine, Cognitive Computation, 7, 381, 10.1007/s12559-014-9301-0
Vorobeva, 2016, Examining the performance of classification algorithms for imbalanced data sets in web author identification
Wan, 2014, Learning to improve medical decision making from imbalanced data without a priori cost, BMC Medical Informatics and Decision Making, 14, 1, 10.1186/s12911-014-0111-9
Wang, 2010, Boosting support vector machines for imbalanced data sets, Knowledge and Information Systems, 25, 1, 10.1007/s10115-009-0198-y
Wang, 2014, Cost-sensitive online classification, IEEE Transactions on Knowledge and Data Engineering, 26, 2425, 10.1109/TKDE.2013.157
Wang, 2010, Negative correlation learning for classification ensembles
Wang, 2013, A learning framework for online class imbalance learning
Wang, 2014, A multi-objective ensemble method for online class imbalance learning
Wang, 2015, Resampling-based ensemble methods for online class imbalance learning, Knowledge and Data Engineering, IEEE Transactions on, 27, 1356, 10.1109/TKDE.2014.2345380
Wang, 2009, Diversity analysis on imbalanced data sets by using ensemble models
Wang, 2012, Multiclass imbalance problems: Analysis and potential solutions, Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 42, 1119, 10.1109/TSMCB.2012.2187280
Wang, 2013, Using class imbalance learning for software defect prediction, Reliability, IEEE Transactions on, 62, 434, 10.1109/TR.2013.2259203
Wang, 2016, Probabilistic framework of visual anomaly detection for unbalanced data, Neurocomputing
Wang, 2015, Detecting Rare Actions and Events from Surveillance Big Data with Bag of Dynamic Trajectories
Wang, 2016, Distributed Weighted Extreme Learning Machine for Big Imbalanced Data Learning, Volume 1, 319
Wasikowski, 2010, Combating the small sample class imbalance problem using feature selection, Knowledge and Data Engineering, IEEE Transactions on, 22, 1388, 10.1109/TKDE.2009.187
Wei, 2013, Discovering medical quality of total hip arthroplasty by rough set classifier with imbalanced class, Quality & Quantity, 47, 1761, 10.1007/s11135-011-9624-9
Wei, 2013, Effective detection of sophisticated online banking fraud on extremely imbalanced data, World Wide Web, 16, 449, 10.1007/s11280-012-0178-0
Weiss, 2004, Mining with rarity: a unifying framework, ACM SIGKDD Explorations Newsletter, 6, 7, 10.1145/1007730.1007734
Weiss, 2000, Learning to predict extremely rare events
Wen, 2015, Abnormal event detection via adaptive cascade dictionary learning
Wilk, 2016, Application of Preprocessing Methods to Imbalanced Clinical Data: An Experimental Study, 503
Wu, 2016, Mixed-kernel based weighted extreme learning machine for inertial sensor based human activity recognition with imbalanced dataset, Neurocomputing, 190, 35, 10.1016/j.neucom.2015.11.095
Wu, 2016, E-commerce customer churn prediction based on improved SMOTE and AdaBoost
Xiao, 2016, Imbalanced Extreme Learning Machine for Classification with Imbalanced Data Distributions, Volume 2, 503
Xin, 2011, A new classification method for LIDAR data based on unbalanced support vector machine
Xiong, 2014, Collaborative web service QoS prediction on unbalanced data distribution
Xu, 2016, Detecting rare events using Kullback–Leibler divergence: A weakly supervised approach, Expert Systems with Applications, 54, 13, 10.1016/j.eswa.2016.01.035
Xu, 2014, Real-time video event detection in crowded scenes using MPEG derived features: A multiple instance learning approach, Pattern Recognition Letters, 44, 113, 10.1016/j.patrec.2013.11.019
Xu, 2007, Power distribution fault cause identification with imbalanced data using the data mining-based fuzzy classification e-algorithm, Power Systems, IEEE Transactions on, 22, 164, 10.1109/TPWRS.2006.888990
Xu, 2007, Power distribution outage cause identification with imbalanced data using artificial immune recognition system (AIRS) algorithm, Power Systems, IEEE Transactions on, 22, 198, 10.1109/TPWRS.2006.889040
Xu, 2015, A maximum margin and minimum volume hyper-spheres machine with pinball loss for imbalanced data classification, Knowledge-Based Systems
Qing, 2015, The prediction method of material consumption for electric power production based on PCBoost and SVM, 1256
Yang, 2016, Iterative ensemble feature selection for multiclass classification of imbalanced microarray data, Journal of Biological Research-Thessaloniki, 23, 13, 10.1186/s40709-016-0045-8
Yang, 2009, A particle swarm based hybrid system for imbalanced medical data sampling, BMC Genomics, 10, 1, 10.1186/1471-2164-10-S1-I1
Yang, 2016, Automated Identification of High Impact Bug Reports Leveraging Imbalanced Learning Strategies
Yeh, 2016, A Learning Approach with Under-and Over-Sampling for Imbalanced Data Sets
Yi, 2010, The Cascade Decision-Tree Improvement Algorithm Based on Unbalanced Data Set
Yu, 2015, Support vector machine-based optimized decision threshold adjustment strategy for classifying imbalanced data, Knowledge-Based Systems, 76, 67, 10.1016/j.knosys.2014.12.007
Yu, 2012, Mining and integrating reliable decision rules for imbalanced cancer gene expression data sets, Tsinghua Science and technology, 17, 666, 10.1109/TST.2012.6374368
Yu, 2016, ODOC-ELM: Optimal decision outputs compensation-based extreme learning machine for classifying imbalanced data, Knowledge-Based Systems, 92, 55, 10.1016/j.knosys.2015.10.012
Yun, 2016, Automatic Determination of Neighborhood Size in SMOTE
Zakaryazad, 2016, A profit-driven Artificial Neural Network (ANN) with applications to fraud detection and direct marketing, Neurocomputing, 175, 121, 10.1016/j.neucom.2015.10.042
Zhai, 2015, The classification of imbalanced large data sets based on MapReduce and ensemble of ELM classifiers, International Journal of Machine Learning and Cybernetics, 1
Zhang, 2008, Toward a comprehensive model in internet auction fraud detection
Zhang, 2016, An imbalanced data classification algorithm of improved autoencoder neural network
Zhang, 2015, An ensemble method for unbalanced sentiment classification
Zhang, 2009, Fraud Detection in Tax Declaration Using Ensemble ISGNN
Zhang, 2016, Cost-sensitive spectral clustering for photo-thermal infrared imaging data
Zhang, 2015, Intelligent fault diagnosis of roller bearings with multivariable ensemble-based incremental support vector machine, Knowledge-Based Systems, 89, 56, 10.1016/j.knosys.2015.06.017
Zhang, 2015, Boosting mobile Apps under imbalanced sensing data, Mobile Computing, IEEE Transactions on, 14, 1151, 10.1109/TMC.2014.2345053
Zhang, 2015, 3-D Laser-Based Multiclass and Multiview Object Detection in Cluttered Indoor Scenes
Zhang, 2014, Imbalanced data classification based on scaling kernel-based support vector machine, Neural Computing and Applications, 25, 927, 10.1007/s00521-014-1584-2
Zhang, 2012, Using ensemble methods to deal with imbalanced data in predicting protein–protein interactions, Computational Biology and Chemistry, 36, 36, 10.1016/j.compbiolchem.2011.12.003
Zhang, 2016, Empowering one-vs-one decomposition with ensemble learning for multi-class imbalanced data, Knowledge-Based Systems, 10.1016/j.knosys.2016.05.048
Zhao, 2008, Protein classification with imbalanced data, Proteins: Structure, function, and bioinformatics, 70, 1125, 10.1002/prot.21870
Zhao, 2011, Learning SVM with weighted maximum margin criterion for classification of imbalanced data, Mathematical and Computer Modelling, 54, 1093, 10.1016/j.mcm.2010.11.040
Zhong, 2013, Classifying peer-to-peer applications using imbalanced concept-adapting very fast decision tree on IP data stream, Peer-to-Peer Networking and Applications, 6, 233, 10.1007/s12083-012-0147-5
Zhou, 2013, Performance of corporate bankruptcy prediction models on imbalanced dataset: The effect of sampling methods, Knowledge-Based Systems, 41, 16, 10.1016/j.knosys.2012.12.007
Zhou, 2006, Training cost-sensitive neural networks with methods addressing the class imbalance problem, Knowledge and Data Engineering, IEEE Transactions on, 18, 63, 10.1109/TKDE.2006.17
Zhu, 2009, Introduction to semi-supervised learning, Synthesis lectures on artificial intelligence and machine learning, 3, 1, 10.2200/S00196ED1V01Y200906AIM006
Zięba, 2015, Boosted SVM with active learning strategy for imbalanced data, Soft Computing, 19, 3357, 10.1007/s00500-014-1407-5
Zięba, 2014, Boosted SVM for extracting rules from imbalanced data in application to prediction of the post-operative life expectancy in the lung cancer patients, Applied Soft Computing, 14, 99, 10.1016/j.asoc.2013.07.016