Diversified branch fusion for self-knowledge distillation

Information Fusion - Tập 90 - Trang 12-22 - 2023
Zuxiang Long1, Fuyan Ma1, Bin Sun1, Mingkui Tan2, Shutao Li1
1College of Electrical and Information Engineering, Hunan University, Changsha 430072, China
2School of Software Engineering, South China University of Technology, Guangzhou 510641, China

Tài liệu tham khảo

Feng, 2021, Resolution-aware knowledge distillation for efficient inference, IEEE Trans. Image Process., 30, 6985, 10.1109/TIP.2021.3101158 Zhang, 2021, Student network learning via evolutionary knowledge distillation, IEEE Trans. Circuits Syst. Video Technol., 32, 2251, 10.1109/TCSVT.2021.3090902 Hinton, 2015 B. Heo, M. Lee, S. Yun, J.Y. Choi, Knowledge transfer via distillation of activation boundaries formed by hidden neurons, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 2019, pp. 3779–3787. D. Chen, J.-P. Mei, Y. Zhang, C. Wang, Z. Wang, Y. Feng, C. Chen, Cross-layer distillation with semantic calibration, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, 2021, pp. 7028–7036. Y. Zhang, T. Xiang, T.M. Hospedales, H. Lu, Deep mutual learning, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4320–4328. A. Yao, D. Sun, Knowledge transfer via dense cross-layer mutual-distillation, in: Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 294–311. Q. Guo, X. Wang, Y. Wu, Z. Yu, D. Liang, X. Hu, P. Luo, Online knowledge distillation via collaborative learning, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 11020–11029. Zhang, 2021, Self-distillation: Towards efficient and compact neural networks, IEEE Trans. Pattern Anal. Mach. Intell., 10.1109/TPAMI.2021.3067100 L. Zhang, J. Song, A. Gao, J. Chen, C. Bao, K. Ma, Be your own teacher: Improve the performance of convolutional neural networks via self distillation, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 3713–3722. M. Phuong, C.H. Lampert, Distillation-based training for multi-exit architectures, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019, pp. 1355–1364. X. Wang, Y. Li, Harmonized dense knowledge distillation training for multi-exit architectures, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, 2021, pp. 10218–10226. D. Chen, J.-P. Mei, C. Wang, Y. Feng, C. Chen, Online knowledge distillation with diverse peers, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 2020, pp. 3430–3437. Kuncheva, 2003, Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy, Mach. Learn., 51, 181, 10.1023/A:1022859003006 S. Feng, H. Chen, X. Ren, Z. Ding, K. Li, X. Sun, Collaborative group learning, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, 2021, pp. 7431–7438. Romero, 2014 Y. Guan, P. Zhao, B. Wang, Y. Zhang, C. Yao, K. Bian, J. Tang, Differentiable feature aggregation search for knowledge distillation, in: Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 469–484. B. Heo, M. Lee, S. Yun, J.Y. Choi, Knowledge transfer via distillation of activation boundaries formed by hidden neurons, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 2019, pp. 3779–3787. He, 2021, Incremental learning for exudate and hemorrhage segmentation on fundus images, Inf. Fusion, 73, 157, 10.1016/j.inffus.2021.02.017 Lesort, 2020, Continual learning for robotics: Definition, framework, learning strategies, opportunities and challenges, Inf. Fusion, 58, 52, 10.1016/j.inffus.2019.12.004 I. Chung, S. Park, J. Kim, N. Kwak, Feature-map-level online adversarial knowledge distillation, in: Proceedings of the International Conference on Machine Learning (ICML), 2020, pp. 2006–2015. J. Kim, M. Hyun, I. Chung, N. Kwak, Feature fusion for online mutual knowledge distillation, in: Proceedings of the International Conference on Pattern Recognition (ICPR), 2021, pp. 4619–4625. Y. Hou, Z. Ma, C. Liu, C.C. Loy, Learning lightweight lane detection cnns by self attention distillation, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1013–1021. S. Yun, J. Park, K. Lee, J. Shin, Regularizing class-wise predictions via self-knowledge distillation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 13876–13885. M. Ji, S. Shin, S. Hwang, G. Park, I.-C. Moon, Refine myself by teaching myself: Feature refinement via self-knowledge distillation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 10664–10673. Z. Li, Y. Huang, D. Chen, T. Luo, N. Cai, Z. Pan, Online knowledge distillation via multi-branch diversity enhancement, in: Proceedings of the Asian Conference on Computer Vision (ACCV), 2020. J. Hu, L. Shen, G. Sun, Squeeze-and-excitation networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7132–7141. S. Woo, J. Park, J.-Y. Lee, I.S. Kweon, Cbam: Convolutional block attention module, in: Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–19. M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, Mobilenetv2: Inverted residuals and linear bottlenecks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510–4520. Cubuk, 2019, AutoAugment: Learning augmentation strategies from data, 113 W. Park, D. Kim, Y. Lu, M. Cho, Relational knowledge distillation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 3967–3976. F. Tung, G. Mori, Similarity-preserving knowledge distillation, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019, pp. 1365–1374. B. Heo, J. Kim, S. Yun, H. Park, N. Kwak, J.Y. Choi, A comprehensive overhaul of feature distillation, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019, pp. 1921–1930. Song, 2018, Collaborative learning for deep neural networks, Adv. Neural Inf. Process. Syst., 31 Zhu, 2018, Knowledge distillation by on-the-fly native ensemble, Adv. Neural Inf. Process. Syst., 31 Li, 2019, Reliable crowdsourcing and deep locality-preserving learning for unconstrained facial expression recognition, IEEE Trans. Image Process., 28, 356, 10.1109/TIP.2018.2868382 Zhao, 2018, Feature selection mechanism in CNNs for facial expression recognition, 317 Li, 2019, Occlusion aware facial expression recognition using CNN with attention mechanism, IEEE Trans. Image Process., 28, 2439, 10.1109/TIP.2018.2886767 Wang, 2020, Region attention networks for pose and occlusion robust facial expression recognition, IEEE Trans. Image Process., 29, 4057, 10.1109/TIP.2019.2956143 K. Wang, X. Peng, J. Yang, S. Lu, Y. Qiao, Suppressing uncertainties for large-scale facial expression recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 6897–6906. Fan, 2020, Facial expression recognition with deeply-supervised attention network, IEEE Trans. Affect. Comput. Li, 2020, Facial expression recognition in the wild using multi-level features and attention mechanisms, IEEE Trans. Affect. Comput. Li, 2021, Adaptively learning facial expression representation via C-F labels and distillation, IEEE Trans. Image Process., 30, 2016, 10.1109/TIP.2021.3049955 Ma, 2021, Facial expression recognition with visual transformers and attentional selective fusion, IEEE Trans. Affect. Comput., 10.1109/TAFFC.2021.3122146 L. Liu, Q. Huang, S. Lin, H. Xie, B. Wang, X. Chang, X. Liang, Exploring inter-channel correlation for diversity-preserved knowledge distillation, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 8271–8280. Z. Shen, Z. He, X. Xue, Meal: Multi-model ensemble via adversarial learning, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 2019, pp. 4886–4893. R.R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-cam: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626.