Attribute augmented and weighted naive Bayes

Springer Science and Business Media LLC - Volume 65 - Pages 1-14 - 2022
Huan Zhang1, Liangxiao Jiang1,2, Chaoqun Li3
1School of Computer Science, China University of Geosciences, Wuhan, China
2Key Laboratory of Artificial Intelligence, Ministry of Education, Shanghai, China
3School of Mathematics and Physics, China University of Geosciences, Wuhan, China

Abstract

Numerous enhancements have been proposed to relax the attribute conditional independence assumption of naive Bayes (NB). However, almost all of them work only within the original attribute space. Given the complexity of real-world applications, we argue that the discriminative information provided by the original attribute space might be insufficient for classification. In this study, we therefore aim to discover latent attributes beyond the original attribute space and propose a novel two-stage model called attribute augmented and weighted naive Bayes (A2WNB). In the first stage, we build multiple random one-dependence estimators (RODEs). We then use each RODE to classify every training instance in turn and take the predicted class labels as that instance's latent attributes. Finally, we construct the augmented attributes by concatenating the latent attributes with the original attributes. In the second stage, to alleviate attribute redundancy, we optimize the weights of the augmented attributes by maximizing the conditional log-likelihood (CLL) of the built model. Extensive experimental results show that A2WNB significantly outperforms NB and the other state-of-the-art competitors it is compared against.
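
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: attributes are categorical; a superparent one-dependence estimator with a randomly chosen parent attribute stands in for a RODE, whose exact construction the abstract does not give; weights enter the model as exponents on the conditional probabilities, P(c|x) ∝ P(c) ∏ P(a_i|c)^{w_i}; and plain gradient ascent with step halving stands in for whatever optimizer the paper uses. All function names, the toy data, and the hyperparameters (alpha, lr, epochs, n_models) are hypothetical.

```python
import math
import random
from collections import Counter

def fit_spode(X, y, parent, classes, alpha=1.0):
    """Stand-in for a RODE (assumption): every attribute depends on the class
    and on the single attribute `parent`. Returns a function x -> class."""
    m = len(X[0])
    n_vals = [len({row[i] for row in X}) for i in range(m)]
    joint = Counter((c, row[parent]) for row, c in zip(X, y))
    cond = Counter((i, row[i], c, row[parent])
                   for row, c in zip(X, y) for i in range(m))

    def predict(x):
        def score(c):
            n_cp = joint[(c, x[parent])]  # Counter gives 0 for unseen pairs
            s = math.log((n_cp + alpha)
                         / (len(X) + alpha * len(classes) * n_vals[parent]))
            for i in range(m):
                if i != parent:
                    s += math.log((cond[(i, x[i], c, x[parent])] + alpha)
                                  / (n_cp + alpha * n_vals[i]))
            return s
        return max(classes, key=score)
    return predict

def augment(X, y, classes, n_models, rng):
    """Stage 1: predicted labels of each estimator become latent attributes,
    which are appended to the original attributes."""
    m = len(X[0])
    models = [fit_spode(X, y, rng.randrange(m), classes) for _ in range(n_models)]
    return [list(row) + [f(row) for f in models] for row in X], models

def fit_weighted_nb(X, y, classes, alpha=1.0, lr=0.05, epochs=200):
    """Stage 2: learn one weight per attribute by maximizing the CLL."""
    n, m = len(X), len(X[0])
    n_vals = [len({row[i] for row in X}) for i in range(m)]
    prior = Counter(y)
    cnt = Counter((i, row[i], c) for row, c in zip(X, y) for i in range(m))
    log_prior = {c: math.log((prior[c] + alpha) / (n + alpha * len(classes)))
                 for c in classes}

    def log_cond(i, v, c):  # Laplace-smoothed log P(a_i = v | c)
        return math.log((cnt[(i, v, c)] + alpha) / (prior[c] + alpha * n_vals[i]))

    def log_posterior(x, w):
        s = {c: log_prior[c] + sum(w[i] * log_cond(i, x[i], c) for i in range(m))
             for c in classes}
        top = max(s.values())
        log_z = top + math.log(sum(math.exp(v - top) for v in s.values()))
        return {c: s[c] - log_z for c in classes}

    def cll(w):
        return sum(log_posterior(x, w)[c] for x, c in zip(X, y))

    w = [1.0] * m          # all weights 1.0 is ordinary naive Bayes
    best = cll(w)
    for _ in range(epochs):
        grad = [0.0] * m
        for x, c in zip(X, y):
            p = {k: math.exp(v) for k, v in log_posterior(x, w).items()}
            for i in range(m):
                grad[i] += log_cond(i, x[i], c) - sum(
                    p[k] * log_cond(i, x[i], k) for k in classes)
        cand = [wi + lr * g / n for wi, g in zip(w, grad)]
        val = cll(cand)
        if val > best:     # accept only steps that improve the CLL
            w, best = cand, val
        else:
            lr *= 0.5
    return w, log_posterior, cll

# Toy data (hypothetical): the class is the XOR of the first two attributes,
# the third attribute is noise.
rng = random.Random(0)
X = [[a, b, rng.randrange(2)] for a in (0, 1) for b in (0, 1) for _ in range(10)]
y = [row[0] ^ row[1] for row in X]
classes = [0, 1]
X_aug, models = augment(X, y, classes, n_models=3, rng=rng)
w, log_posterior, cll = fit_weighted_nb(X_aug, y, classes)
print("augmented width:", len(X_aug[0]), "weights:", [round(v, 2) for v in w])
```

In this sketch the latent attributes are predicted on the same instances the estimators were trained on, as the abstract describes. Because a step is accepted only when it raises the CLL, the learned weights never score worse on the training CLL than the unweighted starting point.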
