Open-environment machine learning

National Science Review - Volume 9, Issue 8 - 2022
Zhi‐Hua Zhou
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China

Abstract

Conventional machine learning studies generally assume close-environment scenarios, in which the important factors of the learning process hold invariant. With the great success of machine learning, however, more and more practical tasks, particularly those involving open-environment scenarios in which important factors are subject to change, called open-environment machine learning in this article, are being presented to the community. Evidently, turning from close environments to open environments poses a grand challenge for machine learning. The challenge is compounded in many big data tasks, where data usually accumulate over time, like streams, so it is hard to first collect all data and then train the machine learning model, as in conventional studies. This article briefly introduces some advances in this line of research, focusing on techniques for handling emerging new classes, decremental/incremental features, changing data distributions and varied learning objectives, and discusses some theoretical issues.
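One of the open-environment issues mentioned above, emerging new classes in a data stream, can be illustrated with a minimal sketch. The code below is not an algorithm from the article: it is a hypothetical nearest-centroid classifier with a reject option, where an instance far from every known class centroid (by an assumed distance threshold) is flagged as a candidate of an emerging new class.

```python
import math

class StreamNewClassDetector:
    """Illustrative sketch only: a nearest-centroid classifier that
    flags instances far from every known class as a potential new class."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed reject-option distance threshold
        self.sums = {}    # label -> component-wise sum of seen instances
        self.counts = {}  # label -> number of seen instances

    def fit_known(self, X, y):
        # accumulate per-class sums so centroids can be updated incrementally
        for xi, yi in zip(X, y):
            s = self.sums.setdefault(yi, [0.0] * len(xi))
            self.sums[yi] = [a + b for a, b in zip(s, xi)]
            self.counts[yi] = self.counts.get(yi, 0) + 1

    def predict(self, x):
        best_label, best_dist = None, math.inf
        for label, s in self.sums.items():
            centroid = [v / self.counts[label] for v in s]
            d = math.dist(x, centroid)
            if d < best_dist:
                best_label, best_dist = label, d
        # reject option: too far from all known classes -> emerging-class candidate
        return "new" if best_dist > self.threshold else best_label

det = StreamNewClassDetector(threshold=1.0)
det.fit_known([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]],
              ["a", "a", "b", "b"])
print(det.predict([0.05, 0.0]))    # near class "a" -> a
print(det.predict([10.0, -10.0]))  # far from all known classes -> new
```

Real approaches replace the centroid-plus-threshold heuristic with stronger detectors (e.g. isolation-based anomaly scores) and must also decide how to incorporate the newly discovered class into the model afterwards.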
