YOLOOD: An arbitrary-oriented flexible flat cable detection method in robotic assembly

Springer Science and Business Media LLC - Volume 79 - Pages 14869-14893 - 2023
Yuxuan Bai1, Mingshuai Dong1, Shimin Wei1, Jian Li1, Xiuli Yu1
1School of Modern Post (School of Automation), Beijing University of Posts and Telecommunications, Beijing, China

Abstract

Flexible flat cable (FFC) detection is a prerequisite for robotic 3C assembly, and it is challenging because FFCs are typically non-axis-aligned, with random orientations against cluttered surroundings. To date, however, conventional object detection methods in robotics mainly regress a horizontal bounding box, whose size and aspect ratio do not reflect the actual shape of the target and which makes it difficult to separate FFCs in dense scenes. In this paper, rotated object detection is introduced into FFC detection, and an arbitrary-oriented FFC detection method based on YOLO, named YOLOOD, is proposed. First, an oriented bounding box is used to capture the physical size and angle of the object, which better separates FFCs from a dense background. Second, the circular smooth label angle-classification algorithm is applied to obtain the angle of each FFC. Finally, a head point regression branch is introduced to distinguish the head of an FFC from its tail, extending the detectable angle range to $[0^{\circ}, 360^{\circ})$. The proposed YOLOOD achieves a mean average precision of 90.82% at a detection speed of 112 FPS on the FFC dataset. A real-world FFC grasping experiment further demonstrates the effectiveness and feasibility of YOLOOD in practical assembly scenarios. In conclusion, this paper innovatively applies rotated object detection to robotic object detection, and YOLOOD solves the problem of detecting and identifying non-axis-aligned FFCs, which is of particular significance for robotic 3C assembly.
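To make the angle handling concrete, the sketch below illustrates the two angle-related ideas the abstract names: a circular smooth label that turns angle regression into classification with smoothly decaying, wrap-around bin weights, and a head-point test that resolves the remaining 180° ambiguity so the full $[0^{\circ}, 360^{\circ})$ range is covered. This is a minimal sketch, not the paper's implementation: the bin count, Gaussian window, and window radius are illustrative assumptions, and the head/tail decision rule is a plausible stand-in for the paper's head point regression branch.

```python
import numpy as np

def circular_smooth_label(angle_deg, num_bins=180, radius=6):
    """Encode an angle as a circularly smoothed classification target.

    Instead of regressing the angle directly, the angle is discretised
    into bins and the bins near the true angle receive smoothly decaying
    weights; indices wrap around, so bin 179 and bin 0 stay adjacent.
    A Gaussian window is assumed here; other window functions work too.
    """
    label = np.zeros(num_bins, dtype=np.float32)
    center = int(round(angle_deg)) % num_bins
    for offset in range(-radius, radius + 1):
        idx = (center + offset) % num_bins  # circular wrap-around
        label[idx] = np.exp(-offset ** 2 / (2 * (radius / 3) ** 2))
    return label

def resolve_full_angle(box_angle_deg, head_is_flipped):
    """Extend a [0, 180) box angle to [0, 360) using the head point.

    Hypothetical decision rule: if the regressed head point lies on the
    opposite side of the box centre from the nominal orientation, the
    cable actually points the other way, so add 180 degrees.
    """
    angle = box_angle_deg % 360.0
    return (angle + 180.0) % 360.0 if head_is_flipped else angle

# Example: a 30-degree box whose head point indicates the flipped side
# resolves to 210 degrees, and its classification target peaks at bin 30.
target = circular_smooth_label(30.0)
assert target.argmax() == 30
assert resolve_full_angle(30.0, head_is_flipped=True) == 210.0
```

The appeal of this formulation, as the cited circular smooth label work argues, is that classification with a circular window avoids the large regression loss that an axis-aligned parameterisation incurs at the angular boundary, while the head-point branch supplies the one extra bit needed to tell head from tail.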

Keywords

#flexible flat cable #object detection #robot #YOLO #3C assembly
