Learning from streaming data with concept drift and imbalance: an overview

Progress in Artificial Intelligence, Vol. 1, No. 1, pp. 89–101, 2012
T. Ryan Hoens1, Robi Polikar2, Nitesh V. Chawla1
1University of Notre Dame
2Rowan University

Abstract

Keywords


References

Alippi, C., Boracchi, G., Roveri, M.: Just in time classifiers: managing the slow drift case. In: IJCNN, pp. 114–120. IEEE, New York (2009). doi: 10.1109/IJCNN.2009.5178799

Alippi, C., Roveri, M.: Just-in-time adaptive classifiers in non-stationary conditions. In: IJCNN, pp. 1014–1019. IEEE, New York (2007)

Alippi C., Roveri M.: Just-in-time adaptive classifiers, part II: designing the classifier. TNN 19(12), 2053–2064 (2008)

Andres-Andres, A., Gomez-Sanchez, E., Bote-Lorenzo, M.: Incremental rule pruning for fuzzy ARTMAP neural network. In: ICANN, pp. 655–660 (2005)

Becker, H., Arias, M.: Real-time ranking with concept drift using expert advice. In: KDD, pp. 86–94. ACM, New York (2007)

Bifet, A., Gavaldà, R.: Learning from time-changing data with adaptive windowing. In: SDM, pp. 443–448 (2007)

Bifet, A., Gavaldà, R.: Adaptive learning from evolving data streams. In: IDA, pp. 249–260 (2009)

Bifet, A., Holmes, G., Pfahringer, B., Kirkby, R., Gavaldà, R.: New ensemble methods for evolving data streams. In: KDD, pp. 139–148. ACM, New York (2009)

Black M., Hickey R.: Learning classification rules for telecom customer call data under concept drift. Soft Comput. Fusion Found. Methodol. Appl. 8(2), 102–108 (2003)

Breiman L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996). doi: 10.1023/A:1018054314350

Breiman L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). doi: 10.1023/A:1010933404324

Buntine W.: Learning classification trees. Stat. Comput. 2(2), 63–73 (1992)

Carpenter G., Grossberg S., Markuzon N., Reynolds J., Rosen D.: Fuzzy ARTMAP: a neural network architecture for incremental supervised learning of analog multidimensional maps. TNN 3(5), 698–713 (1992)

Carpenter G., Grossberg S., Reynolds J.: ARTMAP: supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Netw. 4(5), 565–588 (1991)

Carpenter G., Tan A.: Rule extraction: from neural architecture to symbolic representation. Connect. Sci. 7(1), 3–27 (1995)

Chawla N., Japkowicz N., Kotcz A.: Editorial: special issue on learning from imbalanced data sets. ACM SIGKDD Explor. Newsl. 6(1), 1–6 (2004)

Chawla, N., Lazarevic, A., Hall, L., Bowyer, K.: SMOTEBoost: improving prediction of the minority class in boosting. In: PKDD, pp. 107–119 (2003)

Chawla, N.V.: Data mining for imbalanced datasets: an overview. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 875–886. Springer, Berlin (2010)

Chawla N.V., Cieslak D.A., Hall L.O., Joshi A.: Automatically countering imbalance and its empirical relationship to cost. DMKD 17(2), 225–252 (2008)

Chen, S., He, H.: SERA: selectively recursive approach towards nonstationary imbalanced stream data mining. In: IJCNN, pp. 522–529. IEEE, New York (2009)

Chu, F., Zaniolo, C.: Fast and light boosting for adaptive mining of data streams. In: PAKDD, pp. 282–292 (2004)

Dietterich, T.: Ensemble methods in machine learning. In: MCS, pp. 1–15 (2000)

Ditzler, G., Polikar, R.: An incremental learning framework for concept drift and class imbalance. In: IJCNN. IEEE, New York (2010)

Ditzler, G., Polikar, R., Chawla, N.V.: An incremental learning algorithm for nonstationary environments and class imbalance. In: ICPR. IEEE, New York (2010)

Domingos, P., Hulten, G.: Mining high-speed data streams. In: KDD, pp. 71–80. ACM, New York (2000)

Elwell, R., Polikar, R.: Incremental learning in nonstationary environments with controlled forgetting. In: IJCNN, pp. 771–778. IEEE, New York (2009)

Elwell, R., Polikar, R.: Incremental learning of variable rate concept drift. In: MCS, pp. 142–151 (2009)

Elwell R., Polikar R.: Incremental learning of concept drift in nonstationary environments. TNN 22(10), 1517–1531 (2011)

Fan, W.: Systematic data selection to mine concept-drifting data streams. In: KDD, pp. 128–137. ACM, New York (2004)

Freund, Y., Schapire, R.: Experiments with a new boosting algorithm. In: ICML (1996). doi: 10.1007/3-540-59119-2_166

Friedman J., Hastie T., Tibshirani R.: Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Stat. 28(2), 337–407 (2000)

Fu L.: Incremental knowledge acquisition in supervised learning networks. SMC Part A 26(6), 801–809 (1996)

Fukunaga K., Hostetler L.: Optimization of k nearest neighbor density estimates. Inf. Theory 19(3), 320–326 (1973)

Gama, J., Medas, P., Castillo, G., Rodrigues, P.: Learning with drift detection. In: AAI, pp. 66–112 (2004)

Gao J., Ding B., Fan W., Han J., Yu P.: Classifying data streams with skewed class distributions and concept drifts. Internet Comput. 12(6), 37–49 (2008)

Gao, J., Fan, W., Han, J., Yu, P.: A general framework for mining concept-drifting data streams with skewed distributions. In: SDM, pp. 3–14 (2007)

Giraud-Carrier C.: A note on the utility of incremental learning. AI Commun. 13(4), 215–223 (2000)

Grossberg S.: Nonlinear neural networks: principles, mechanisms, and architectures. Neural Netw. 1(1), 17–61 (1988)

Guo, H., Viktor, H.L.: Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach. SIGKDD Explor. Newsl. 6, 30–39 (2004). doi: 10.1145/1007730.1007736

Ho T.: The random subspace method for constructing decision forests. PAMI 20(8), 832–844 (1998)

Hoeffding W.: Probability inequalities for sums of bounded random variables. JASA 58(301), 13–30 (1963)

Hoeglinger, S., Pears, R.: Use of Hoeffding trees in concept based data stream mining. In: ICIAFS, pp. 57–62 (2007). doi: 10.1109/ICIAFS.2007.4544780

Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: KDD, pp. 97–106. ACM, New York (2001)

Joachims, T.: Estimating the generalization performance of an SVM efficiently. In: ICML, p. 431. Morgan Kaufmann, Menlo Park (2000)

Karnick, M., Ahiskali, M., Muhlbaier, M., Polikar, R.: Learning concept drift in nonstationary environments using an ensemble of classifiers based approach. In: IJCNN, pp. 3455–3462. IEEE, New York (2008)

Karnick, M., Muhlbaier, M., Polikar, R.: Incremental learning in non-stationary environments with concept drift using a multiple classifier based approach. In: ICPR, pp. 1–4. IEEE, New York (2009)

Kelly, M., Hand, D., Adams, N.: The impact of changing populations on classifier performance. In: KDD, pp. 367–371. ACM, New York (1999)

Klinkenberg, R., Joachims, T.: Detecting concept drift with support vector machines. In: ICML (2000)

Kohavi, R., Kunz, C.: Option decision trees with majority votes. In: ICML, pp. 161–169. Morgan Kaufmann, Menlo Park (1997)

Kolter, J., Maloof, M.: Dynamic weighted majority: a new ensemble method for tracking concept drift. In: ICDM, pp. 123–130. IEEE, New York (2003)

Kolter, J., Maloof, M.: Using additive expert ensembles to cope with concept drift. In: ICML, pp. 449–456. ACM, New York (2005)

Kolter J., Maloof M.: Dynamic weighted majority: an ensemble method for drifting concepts. JMLR 8, 2755–2790 (2007)

Kubat M.: Floating approximation in time-varying knowledge bases. PRL 10(4), 223–227 (1989)

Kuncheva L.I., Whitaker C.J.: Measures of diversity in classifier ensembles. Mach. Learn. 51, 181–207 (2003)

Lange S., Grieser G.: On the power of incremental learning. TCS 288(2), 277–307 (2002)

Lange, S., Zilles, S.: Formal models of incremental learning and their analysis. In: IJCNN, vol. 4, pp. 2691–2696. IEEE, New York (2003)

Last M.: Online classification of nonstationary data streams. IDA 6(2), 129–147 (2002)

Lazarescu M., Venkatesh S., Bui H.: Using multiple windows to track concept drift. IDA 8(1), 29–59 (2004)

Lichtenwalter, R., Chawla, N.V.: Adaptive methods for classification in arbitrarily imbalanced and drifting data streams. In: New Frontiers in Applied Data Mining. Lecture Notes in Computer Science, vol. 5669, pp. 53–75. Springer, Berlin (2010)

Maron, O., Moore, A.W.: Hoeffding races: accelerating model selection search for classification and function approximation. In: NIPS, pp. 59–66 (1993)

Masnadi-Shirazi H., Vasconcelos N.: Cost-sensitive boosting. PAMI 33(2), 294–309 (2011). doi: 10.1109/TPAMI.2010.71

Mitchell T., Caruana R., Freitag D., McDermott J., Zabowski D.: Experience with a learning personal assistant. Commun. ACM 37(7), 80–91 (1994)

Moreno-Torres, J., Herrera, F.: A preliminary study on overlapping and data fracture in imbalanced domains by means of genetic programming-based feature extraction. In: ISDA, pp. 501–506 (2010). doi: 10.1109/ISDA.2010.5687214

Moreno-Torres, J., Raeder, T., Alaiz-Rodríguez, R., Chawla, N.V., Herrera, F.: A unifying view on dataset shift in classification. Pattern Recognit. 45, 521–530 (2011)

Muhlbaier, M., Polikar, R.: An ensemble approach for incremental learning in nonstationary environments. In: MCS, pp. 490–500 (2007)

Muhlbaier, M., Polikar, R.: Multiple classifiers based incremental learning algorithm for learning in nonstationary environments. In: ICMLC, vol. 6, pp. 3618–3623. IEEE, New York (2007)

Muhlbaier M., Topalis A., Polikar R.: Learn++.NC: combining ensemble of classifiers with dynamically weighted consult-and-vote for efficient incremental learning of new classes. TNN 20(1), 152–168 (2009). doi: 10.1109/TNN.2008.2008326

Nishida, K., Yamauchi, K., Omori, T.: ACE: adaptive classifiers-ensemble system for concept-drifting environments. In: MCS, pp. 176–185 (2005)

Pfahringer, B., Holmes, G., Kirkby, R.: New options for Hoeffding trees. In: AAI, pp. 90–99 (2007)

Polikar R.: Ensemble based systems in decision making. Circuits Syst. Mag. 6(3), 21–45 (2006)

Polikar R.: Bootstrap-inspired techniques in computational intelligence. Signal Process. Mag. 24(4), 59–72 (2007)

Polikar R., Upda L., Upda S.S., Honavar V.: Learn++: an incremental learning algorithm for supervised neural networks. SMC Part C 31(4), 497–508 (2001)

Quinlan, J.: C4.5: Programs for Machine Learning. Morgan Kaufmann, Menlo Park (1993)

Schapire R., Singer Y.: Improved boosting algorithms using confidence-rated predictions. Mach. Learn. 37(3), 297–336 (1999)

Scholz M., Klinkenberg R.: Boosting classifiers for drifting concepts. IDA 11(1), 3–28 (2007)

Stanley, K.: Learning concept drift with a committee of decision trees. Technical Report AI-03-302, Computer Science Department, University of Texas-Austin (2003)

Street, W., Kim, Y.: A streaming ensemble algorithm (SEA) for large-scale classification. In: KDD, pp. 377–382. ACM, New York (2001)

Ting, K.: A comparative study of cost-sensitive boosting algorithms. In: ICML (2000)

Tsymbal, A.: The problem of concept drift: definitions and related work. Technical Report TCD-CS-2004-15, Department of Computer Science, Trinity College (2004). https://www.cs.tcd.ie/publications/techreports/reports

Tsymbal, A., Pechenizkiy, M., Cunningham, P., Puuronen, S.: Handling local concept drift with dynamic integration of classifiers: domain of antibiotic resistance in nosocomial infections. In: CBMS, pp. 679–684 (2006). doi: 10.1109/CBMS.2006.94

Tsymbal A., Pechenizkiy M., Cunningham P., Puuronen S.: Dynamic integration of classifiers for handling concept drift. Inf. Fusion 9(1), 56–68 (2008)

Wang, H., Fan, W., Yu, P., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: KDD, pp. 226–235. ACM, New York (2003)

Wang, H., Yin, J., Pei, J., Yu, P., Yu, J.: Suppressing model overfitting in mining concept-drifting data streams. In: KDD, pp. 736–741. ACM, New York (2006)

Widmer, G., Kubat, M.: Learning flexible concepts from streams of examples: FLORA2. In: ECAI, p. 467. Wiley, New York (1992)

Widmer, G., Kubat, M.: Effective learning in dynamic environments by explicit context tracking. In: ECML, pp. 227–243. Springer, Berlin (1993)

Widmer G., Kubat M.: Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23(1), 69–101 (1996)