Method for detection of unsafe actions in power field based on edge computing architecture
Abstract
Due to the high risks inherent in the electric power industry, the safety of power systems can be improved by using surveillance systems to predict and warn against operators' nonstandard and unsafe actions in real time. Aiming at the real-time and accuracy requirements of intelligent video surveillance, this paper proposes a method based on an edge computing architecture for the timely detection of unsafe actions in power-field operations. In this method, the unsafe-action judgment service is deployed to the edge cloud, which improves real-time performance. To identify the action being executed, the proposed end-to-end action recognition model uses a Temporal Convolutional Network (TCN) to extract local temporal features and a Gated Recurrent Unit (GRU) layer to extract global temporal features, which increases the accuracy of action-fragment recognition. The action recognition result is then combined with the equipment detected by a YOLOv3 model, and a classification rule determines whether the current action is safe. Experiments show that the proposed method achieves better real-time performance, and the proposed action recognition model, verified on the MSR Action dataset, improves the recognition accuracy of action segments. The judgment results for unsafe actions further demonstrate the effectiveness of the proposed method.
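The pipeline described above (local temporal features from a TCN, global temporal features from a GRU, then a rule that combines the recognized action with detected equipment) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the layer sizes, random weights, action names, equipment classes, and safety rules are all assumptions made for the example, and the real system would use trained weights and YOLOv3 detections.

```python
import numpy as np

rng = np.random.default_rng(0)

def temporal_conv(x, w, b):
    """Causal 1-D temporal convolution (local temporal features).
    x: (T, C_in), w: (k, C_in, C_out), b: (C_out,)."""
    k, c_in, c_out = w.shape
    xp = np.vstack([np.zeros((k - 1, c_in)), x])  # left-pad so output is causal
    out = np.empty((x.shape[0], c_out))
    for t in range(x.shape[0]):
        out[t] = np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)                   # ReLU

def gru(x, Wz, Uz, Wr, Ur, Wh, Uh):
    """Minimal GRU; consumes the TCN features frame by frame and
    returns the final hidden state (global temporal feature)."""
    h = np.zeros(Uz.shape[0])
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    for xt in x:
        z = sig(Wz @ xt + Uz @ h)                 # update gate
        r = sig(Wr @ xt + Ur @ h)                 # reset gate
        h = (1 - z) * h + z * np.tanh(Wh @ xt + Uh @ (r * h))
    return h

# --- toy forward pass: 20 frames of 8-dim per-frame features ---
T, C_IN, C_OUT, H, N_CLASSES = 20, 8, 16, 32, 4
frames = rng.standard_normal((T, C_IN))
local = temporal_conv(frames, rng.standard_normal((3, C_IN, C_OUT)) * 0.1,
                      np.zeros(C_OUT))
global_feat = gru(local, *(rng.standard_normal((H, d)) * 0.1
                           for d in (C_OUT, H, C_OUT, H, C_OUT, H)))
logits = rng.standard_normal((N_CLASSES, H)) @ global_feat
action = ["walking", "climbing", "reaching", "bending"][int(np.argmax(logits))]

# Rule-based safety judgment: combine the recognized action with the
# equipment classes reported by the object detector (YOLOv3 in the paper).
# The (action, equipment) rules here are purely hypothetical examples.
UNSAFE_RULES = {("climbing", "transformer"), ("reaching", "live_busbar")}

def judge(action, detected_equipment):
    return "unsafe" if any((action, e) in UNSAFE_RULES
                           for e in detected_equipment) else "safe"
```

With trained weights, `judge(action, detections)` would run at the edge cloud on each recognized action fragment; with the random weights above it only demonstrates the data flow and the rule lookup.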