BAMB

ACM Transactions on Intelligent Systems and Technology - Volume 10, Issue 5 - Pages 1-25 - 2019
Hao Wang1, Kui Yu1, Hao Wang1, Lin Liu2, Wei Ding3, Xindong Wu4
1Hefei University of Technology, Hefei, Anhui, China
2University of South Australia, Adelaide, SA, Australia
3University of Massachusetts Boston, Boston, MA, USA
4Mininglamp Technology, Beijing, China

Abstract

Markov blanket (MB) discovery for feature selection has attracted much attention in recent years, since the MB of the class attribute is the optimal feature subset for feature selection. However, almost all existing MB discovery algorithms focus on either improving computational efficiency or boosting learning accuracy, rather than both. In this article, we propose a novel MB discovery algorithm that balances efficiency and accuracy, called BAlanced Markov Blanket (BAMB) discovery. To achieve this goal, given a class attribute of interest, BAMB finds candidate PC (parents and children) and spouses and removes false positives from the candidate MB set in one go. Specifically, once a feature is successfully added to the current PC set, BAMB finds the spouses with regard to this feature, then uses the updated PC set and spouse set to remove false positives from the current MB set. This keeps the PC and spouse sets of the target as small as possible and thus achieves a trade-off between computational efficiency and learning accuracy. In the experiments, we first compare BAMB with 8 state-of-the-art MB discovery algorithms on 7 benchmark Bayesian networks; we then use 10 real-world datasets to compare BAMB with 12 feature selection algorithms, including 8 state-of-the-art MB discovery algorithms and 4 other well-established feature selection methods. On prediction accuracy, BAMB outperforms the 12 feature selection algorithms compared. On computational efficiency, BAMB is close to the IAMB algorithm and much faster than the remaining seven MB discovery algorithms.
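
The interleaving described in the abstract (add a feature to PC, look for spouses through it, then prune the candidate MB) can be illustrated with a minimal Python sketch. This is not the authors' pseudocode: the function names `find_sepset` and `interleaved_mb`, the subset-size cap `max_k`, the toy DAG, and the use of a d-separation oracle in place of a statistical conditional independence test are all assumptions made for illustration. The sketch also omits the association-based ordering and symmetry checks that practical MB algorithms rely on.

```python
from itertools import combinations

def d_separated(parents, x, y, z):
    """Oracle CI test (assumption): True if x, y are d-separated given z in a known DAG."""
    # Restrict to x, y, z and all of their ancestors.
    keep, stack = set(), [x, y, *z]
    while stack:
        n = stack.pop()
        if n not in keep:
            keep.add(n)
            stack.extend(parents[n])
    # Moralize: connect each node to its parents and marry co-parents.
    adj = {n: set() for n in keep}
    for n in keep:
        for p in parents[n]:
            adj[n].add(p)
            adj[p].add(n)
        for a, b in combinations(parents[n], 2):
            adj[a].add(b)
            adj[b].add(a)
    # d-separated iff z blocks every path from x to y in the moral graph.
    seen, stack = set(z), [x]
    while stack:
        n = stack.pop()
        if n == y:
            return False
        if n not in seen:
            seen.add(n)
            stack.extend(adj[n] - seen)
    return True

def find_sepset(indep, x, t, pool, max_k=3):
    """Return a subset S of pool (size <= max_k) with x independent of t given S, or None."""
    pool = sorted(set(pool) - {x, t})
    for k in range(min(max_k, len(pool)) + 1):
        for s in combinations(pool, k):
            if indep(x, t, set(s)):
                return set(s)
    return None

def interleaved_mb(indep, target, features):
    """Grow PC one feature at a time; after each addition, search for spouses
    through the new member, then prune false positives from the candidate MB."""
    pc, spouses = [], {}  # spouses[x]: spouses found through PC member x

    def current_mb():
        return set(pc).union(*spouses.values())

    for x in features:
        if find_sepset(indep, x, target, pc) is not None:
            continue  # some subset of the current PC separates x from the target
        pc.append(x)
        # Spouse step: y is separated from the target by S (x not in S),
        # but becomes dependent once x is added to the conditioning set.
        spouses[x] = set()
        for y in features:
            if y in pc:
                continue
            s = find_sepset(indep, y, target, pc)
            if s is not None and x not in s and not indep(y, target, s | {x}):
                spouses[x].add(y)
        # Pruning step: drop PC members that the rest of the candidate MB
        # now separates from the target, together with their spouses.
        for p in list(pc):
            if find_sepset(indep, p, target, current_mb() - {p}) is not None:
                pc.remove(p)
                del spouses[p]
    return set(pc), current_mb()

# Toy DAG (assumption): D -> A -> T -> C <- B, C -> E; the target is T.
parents = {"D": [], "A": ["D"], "B": [], "T": ["A"], "C": ["T", "B"], "E": ["C"]}
indep = lambda x, y, z: d_separated(parents, x, y, z)
pc, mb = interleaved_mb(indep, "T", ["E", "D", "C", "B", "A"])
```

In this toy run the features are deliberately visited in an unfavorable order, so the non-neighbors E and D first enter the candidate PC set and are pruned later, once C and A have been added. With real data, `indep` would be replaced by a statistical test, and the result would depend on sample size and significance level.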
