An overview of multi-task learning

National Science Review - Volume 5, Issue 1 - Pages 30-43 - 2018
Yu Zhang¹, Qiang Yang¹
¹Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China

Abstract

As a promising area in machine learning, multi-task learning (MTL) aims to improve the performance of multiple related learning tasks by leveraging useful information shared among them. In this paper, we give an overview of MTL, beginning with its definition. We then introduce several settings of MTL: multi-task supervised learning, multi-task unsupervised learning, multi-task semi-supervised learning, multi-task active learning, multi-task reinforcement learning, multi-task online learning and multi-task multi-view learning. For each setting, we present representative MTL models. We also introduce parallel and distributed MTL models, which speed up the learning process. MTL has been used to improve performance in many application areas, including computer vision, bioinformatics, health informatics, speech, natural language processing, web applications and ubiquitous computing, and we review some representative works. Finally, we present recent theoretical analyses of MTL.
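
To make "leveraging useful information among them" concrete, here is a minimal sketch of one simple way to couple related tasks: each task fits its own slope, and a penalty pulls every slope toward the average slope across tasks. This is an illustration only and is not taken from the paper; the function name `fit_mean_regularized`, the parameter `lam`, and the toy data are all hypothetical.

```python
def fit_mean_regularized(tasks, lam=1.0, iters=50):
    """Fit one slope per task for y ~ w_t * x, coupling tasks via a shared centre.

    tasks: list of (xs, ys) pairs, one pair per task (hypothetical toy format).
    lam:   strength of the pull toward the average slope; 0 means no sharing.
    Minimises sum_t [ sum_i (y_ti - w_t * x_ti)^2 + lam * (w_t - c)^2 ]
    by alternating between c (the mean of the slopes) and each w_t.
    """
    # Per-task sufficient statistics for a least-squares slope through the origin.
    stats = [(sum(x * y for x, y in zip(xs, ys)), sum(x * x for x in xs))
             for xs, ys in tasks]
    # Start from independent fits, i.e. no information shared between tasks.
    weights = [sxy / sxx for sxy, sxx in stats]
    for _ in range(iters):
        centre = sum(weights) / len(weights)
        # Closed-form update of each slope with the centre held fixed.
        weights = [(sxy + lam * centre) / (sxx + lam) for sxy, sxx in stats]
    return weights, sum(weights) / len(weights)


# Three related tasks whose true slopes are near 2; the second has one noisy sample.
toy_tasks = [
    ([1.0, 2.0], [2.1, 3.9]),
    ([1.0], [3.0]),
    ([1.0, 2.0, 3.0], [1.8, 4.2, 6.0]),
]
independent, _ = fit_mean_regularized(toy_tasks, lam=0.0)
coupled, centre = fit_mean_regularized(toy_tasks, lam=1.0)
print("independent slopes:", independent)
print("coupled slopes:    ", coupled, "centre:", centre)
```

In this toy setting the task with a single noisy sample is pulled furthest toward the other tasks, which is the intuition behind sharing information across related tasks. Setting `lam` to 0 recovers fully independent fits.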
