
On the post-hoc explainability of deep echo state networks for time series forecasting, image and video classification

Neural Computing and Applications - Volume 34 - Pages 10257-10277 - 2021

Alejandro Barredo Arrieta¹, Sergio Gil-Lopez¹, Ibai Laña¹, Miren Nekane Bilbao², Javier Del Ser¹

¹TECNALIA, Basque Research and Technology Alliance (BRTA), Derio, Spain

²University of the Basque Country (UPV/EHU), Bilbao, Spain

Abstract

Since their inception, learning techniques under the reservoir computing paradigm have shown a strong capability to model recurrent systems without the computational overhead required by other approaches, especially deep neural networks. Among them, different flavors of echo state networks have attracted much attention over time, mainly because of the simplicity and computational efficiency of their learning algorithm. However, these advantages do not compensate for the fact that echo state networks remain black-box models whose decisions cannot easily be explained to a general audience. The issue is even more involved for multi-layered (also referred to as deep) echo state networks, whose more complex hierarchical structure further hinders the explanation of their internals to users without expertise in machine learning or even computer science. This lack of explainability can jeopardize the widespread adoption of these models in domains where the accountability and understandability of machine learning models are a must (e.g., medical diagnosis, social politics). This work addresses the issue by conducting an explainability study of echo state networks applied to learning tasks with time series, image and video data. Among these tasks, we stress the last one (video classification), which, to the best of our knowledge, has never been tackled before with echo state networks in the related literature. Specifically, the study proposes three different techniques capable of eliciting understandable information about the knowledge captured by these recurrent models: potential memory, temporal patterns and pixel absence effect. Potential memory addresses questions about how the reservoir size affects the capability of the model to store temporal information, whereas temporal patterns unveil the recurrent relationships that the model captures over time. Finally, pixel absence effect attempts to evaluate the effect of the absence of a given pixel when the echo state network model is used for image and video classification. The benefits of the proposed suite of techniques are showcased over three different application domains: time series modeling, image classification and, for the first time in the related literature, video classification. The results reveal that the proposed techniques not only allow for an informed understanding of the way these models work, but also serve as diagnostic tools capable of detecting issues inherited from the data (e.g., the presence of hidden bias).
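
To make the idea behind the pixel absence effect more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a single-layer leaky echo state network that reads an image column by column, a ridge-regression readout, and synthetic 8x8 images. Every name (make_reservoir, final_state, fit_readout, pixel_absence_map, toy_image), every hyperparameter, and the choice of 0 as the value of an "absent" pixel is an illustrative assumption that does not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_inputs, n_units, spectral_radius=0.9, input_scale=0.5):
    """Draw fixed random input and recurrent weights (never trained)."""
    w_in = rng.uniform(-input_scale, input_scale, size=(n_units, n_inputs))
    w = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def final_state(image, w_in, w, leak=0.5):
    """Feed the image column by column; return the last reservoir state."""
    x = np.zeros(w.shape[0])
    for column in image.T:
        x = (1.0 - leak) * x + leak * np.tanh(w_in @ column + w @ x)
    return x

def fit_readout(states, targets, ridge=1e-3):
    """Ridge-regression readout, the only trained part of the model."""
    a = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(a, states.T @ targets)

def pixel_absence_map(image, label, w_in, w, w_out, fill=0.0):
    """Drop in the score of `label` when each pixel is blanked in turn."""
    base = final_state(image, w_in, w) @ w_out
    effect = np.zeros(image.shape)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            occluded = image.copy()
            occluded[r, c] = fill  # assumption: "absent" means set to 0
            score = final_state(occluded, w_in, w) @ w_out
            effect[r, c] = base[label] - score[label]
    return effect

def toy_image(label, size=8):
    """Synthetic image: faint noise plus one bright row that marks the class."""
    img = rng.uniform(0.0, 0.1, size=(size, size))
    img[1 if label == 0 else size - 2, :] = 1.0
    return img

# Train the readout on toy data, then explain one prediction.
labels = np.array([0, 1] * 20)
images = [toy_image(l) for l in labels]
w_in, w = make_reservoir(n_inputs=8, n_units=30)
states = np.stack([final_state(im, w_in, w) for im in images])
w_out = fit_readout(states, np.eye(2)[labels])
effect = pixel_absence_map(images[0], labels[0], w_in, w, w_out)
```

The resulting effect array has the same shape as the input image. Under these assumptions, a positive entry means that blanking the pixel lowered the score of the chosen class, and a negative entry means that it raised the score. For video, the same loop could in principle be run over the pixels of each frame, although the cost grows with the number of pixels, because each one requires a full forward pass through the reservoir.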

Keywords

#echo state networks #explainability #time series #video classification #machine learning

