Outlier ensembles

Association for Computing Machinery (ACM) - Volume 14, Issue 2 - Pages 49-58 - 2013
Charu C. Aggarwal1
1IBM -- T. J. Watson Research Center, Yorktown Heights, NY

Abstract

Ensemble analysis is a widely used meta-algorithm for many data mining problems such as classification and clustering. Numerous ensemble-based algorithms have been proposed in the literature for these problems. Compared with clustering and classification, ensemble analysis has received only limited study in the outlier detection literature. In some cases, ensemble analysis techniques have been used implicitly by many outlier analysis algorithms, but the approach is often buried deep within the algorithm and not formally recognized as a general-purpose meta-algorithm. This is in spite of the fact that ensemble analysis is rather important in the context of outlier analysis. This paper discusses the various methods used in the literature for outlier ensembles and the general principles by which such analysis can be made more effective. A discussion is also provided of how outlier ensembles relate to the ensemble techniques commonly used for other data mining problems.
