Reinforcement learning-based collision-free path planner for redundant robot in narrow duct

Journal of Intelligent Manufacturing - Volume 32 - Pages 471-482 - 2020
Xiaotong Hua1, Guolei Wang1, Jing Xu1, Ken Chen1
1State Key Laboratory of Tribology, Department of Mechanical Engineering, Tsinghua University, Beijing, China

Abstract

Compared with obstacle avoidance in open environments, collision-free path planning for the duct-entering task is challenged by the narrow and complex space inside the duct. For obstacle avoidance, a redundant robot is usually applied to this task, and its motion can be decoupled into end-effector motion and self-motion. Current methods for the duct-entering task are not robust because it is difficult to properly define the self-motion. This difficulty mainly comes from two aspects: the definition of distances from the robot to obstacles and the fusion of multiple data. In this work, we adopt the ideas underlying human success in handling such tasks, namely variable optimization strategies and learning, to build a robust path planner. The proposed planner applies reinforcement learning to learn a proper self-motion and thereby achieves robust planning. To obtain robust behavior, the state-action space of the planner is designed with three dedicated strategies. First, the optimization function, the kernel of the self-motion, is treated as part of the action. Instead of acting on every joint motion directly, this strategy embeds reinforcement learning in the self-motion, reducing the search domain to the null space of the redundant robot. Second, the robot end orientation is included in the action. In the duct-entering task, the robot end link leads the exploring movement, much like a snake's head, and the orientation of the end link when passing through a position can be reused by the following links; this strategy therefore accelerates exploration by reducing the null space to the feasible manifold of the redundant robot. Third, a path guide point is also added to the action. This strategy divides one long-distance task into several short-distance tasks, reducing the task difficulty. With these designs, the planner is trained using reinforcement learning. Given feedback on the robot and environment state, the proposed planner chooses proper optimization strategies, much like a human brain, to avoid collision between the robot body and the target duct. Compared with two general methods, Virtual Axis with orientation Guidance and Virtual Axis, experimental results show that the success rate is improved by 5.9% and 49.7%, respectively. Two further experiments are carried out: the proposed planner achieves a 100% success rate with a constant start point and a 98.7% success rate with a random start point, showing that it can handle perturbations of the start and goal points. These experiments demonstrate the robustness of the proposed planner.
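To make the decoupling of end-effector motion and self-motion concrete, the following minimal Python sketch (hypothetical, not the authors' code) shows the classical gradient-projection scheme for redundant robots, where the joint velocity is split as q_dot = J_pinv * x_dot + (I - J_pinv * J) * z and the learned policy is assumed to supply the weights that mix candidate optimization objectives into the null-space term z; the function name and weighting scheme are illustrative assumptions only.

import numpy as np

def self_motion_step(jacobian, x_dot, grad_terms, action_weights):
    """One resolved-rate step for a redundant arm (illustrative sketch).

    jacobian       : (6, n) task Jacobian with n > 6 (redundant robot).
    x_dot          : (6,) desired end-effector twist.
    grad_terms     : list of (n,) gradients of candidate objectives,
                     e.g. obstacle clearance or joint-limit distance.
    action_weights : weights assumed to come from the learned policy;
                     mixing them selects the self-motion optimization strategy.
    """
    j_pinv = np.linalg.pinv(jacobian)
    n = jacobian.shape[1]
    null_proj = np.eye(n) - j_pinv @ jacobian           # null-space projector
    z = sum(w * g for w, g in zip(action_weights, grad_terms))
    return j_pinv @ x_dot + null_proj @ z                # end-effector motion + self-motion

# Toy usage with a random 7-DOF Jacobian (numbers are illustrative only).
rng = np.random.default_rng(0)
J = rng.standard_normal((6, 7))
x_dot = np.array([0.01, 0.0, 0.0, 0.0, 0.0, 0.0])        # creep along the duct axis
grads = [rng.standard_normal(7), rng.standard_normal(7)]
weights = [0.7, 0.3]                                      # would be output by the RL policy
q_dot = self_motion_step(J, x_dot, grads, weights)
print(q_dot.shape)  # (7,)

In this reading, the planner's action would also carry the end-link orientation and a path guide point alongside the objective weights, so the reinforcement learning search stays inside the null space rather than acting on raw joint velocities.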
