IJEEDC Bio-Inspired Hyper-Redundant Robotic Arm Control with Hierarchical Deep Reinforcement Learning

Journal Paper

Paper Title : Bio-Inspired Hyper-Redundant Robotic Arm Control with Hierarchical Deep Reinforcement Learning

Author : Sayyed Jaffar Ali Raza, Mingjie Lin

Article Citation : Sayyed Jaffar Ali Raza, Mingjie Lin (2018), "Bio-Inspired Hyper-Redundant Robotic Arm Control with Hierarchical Deep Reinforcement Learning", International Journal of Electrical, Electronics and Data Communication (IJEEDC), pp. 63-69, Volume-6, Issue-10

Abstract : In addition to performing sophisticated locomotion, robotic arms with hyper-redundant DOFs can more effectively circumvent obstacles and more robustly avoid mechanical failure. Unfortunately, for such a hyper-redundant robotic arm, self-learning objective-driven behavior with conventional reinforcement learning algorithms proves to be quite challenging. This difficulty stems from extremely large state and action spaces that often render robust learning of value functions highly ineffective, consequently leading to insufficient policy exploration. This challenge is reminiscent of the so-called “Curse of Dimensionality” problem: the exponential explosion of states and actions entails exponentially more data and computation. In this work, we draw inspiration from how an octopus achieves extremely dexterous maneuverability while controlling virtually infinite DOFs. In particular, unlike the centralized encephalization found in humans, the central brain of an octopus doesn’t “micromanage” its arms by issuing continuous signals to control each of them. Instead, each arm enjoys a high degree of autonomy, i.e., it operates on its own volition, thus deviating completely from the centralized, brain-directed limb movement of humans. As such, we devise and implement a layered learning algorithm that integrates global deep Q-learning and local Q-learning algorithms collaboratively to effectively control a robotic arm with a very large number of DOFs. Specifically, we construct a global deep Q-network to learn a policy that generates local objectives over a global objective. Simultaneously, multiple local agents each learn a local policy for their individual local objective. To illustrate the effectiveness of our layered learning scheme, we implemented a 24-DOF robotic arm that learns its control policy autonomously. We compare the learning performance of this hyper-redundant robotic arm under our new scheme against the conventional learning algorithm without hierarchical structure. Our results show that, with the same amount of computational effort, our new scheme has significantly higher learning success rates and much better reward convergence.
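The layered scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several toy assumptions: the global deep Q-network is swapped for a small Q-table, the arm has two segments instead of 24 DOFs, the global objective is simply "the joint positions must sum to a target", the reward values are invented, and the two layers are trained one after the other rather than simultaneously. All names (`LocalAgent`, `GlobalAgent`, `move`, `reach`, `N_POS`, and so on) are hypothetical.

```python
import itertools
import random

N_POS = 5             # discretized positions per joint (toy assumption)
ACTIONS = (-1, 0, 1)  # step a joint down, hold it, or step it up
N_SEGMENTS = 2        # the paper's arm has 24 DOFs; two keep the demo tiny
GOAL_SETS = list(itertools.product(range(N_POS), repeat=N_SEGMENTS))


def move(pos, action):
    """Deterministic toy joint dynamics, clipped to the valid range."""
    return min(N_POS - 1, max(0, pos + action))


class LocalAgent:
    """Tabular Q-learning for one segment, conditioned on its local goal."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha, self.gamma = alpha, gamma
        self.q = {}  # (goal, pos, action) -> estimated return

    def best(self, goal, pos):
        return max(ACTIONS, key=lambda a: self.q.get((goal, pos, a), 0.0))

    def update(self, goal, pos, action):
        nxt = move(pos, action)
        reward = 1.0 if nxt == goal else -0.1  # assumed reward shape
        best_next = max(self.q.get((goal, nxt, a), 0.0) for a in ACTIONS)
        old = self.q.get((goal, pos, action), 0.0)
        self.q[(goal, pos, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

    def train(self, rng, samples=60_000):
        # Q-learning is off-policy, so uniformly random transitions are
        # enough to learn the greedy policy in this tiny state space.
        for _ in range(samples):
            self.update(rng.randrange(N_POS), rng.randrange(N_POS),
                        rng.choice(ACTIONS))

    def rollout(self, goal, pos, steps=N_POS):
        # Follow the greedy local policy toward the assigned local goal.
        for _ in range(steps):
            pos = move(pos, self.best(goal, pos))
        return pos


class GlobalAgent:
    """Tabular stand-in for the global deep Q-network (an assumption)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.q = {}  # (target, goal_set) -> estimated reward

    def best(self, target):
        return max(GOAL_SETS, key=lambda g: self.q.get((target, g), 0.0))

    def train(self, local_agents, rng, samples=5_000):
        max_target = N_SEGMENTS * (N_POS - 1)
        for _ in range(samples):
            target = rng.randint(0, max_target)
            goals = rng.choice(GOAL_SETS)  # one local objective per segment
            reached = [agent.rollout(goal, 0)
                       for agent, goal in zip(local_agents, goals)]
            reward = -abs(sum(reached) - target)  # 0 when the target is met
            old = self.q.get((target, goals), 0.0)
            self.q[(target, goals)] = old + self.alpha * (reward - old)


def reach(target, global_agent, local_agents):
    """Global layer picks local goals; each local agent then pursues its own."""
    goals = global_agent.best(target)
    return [agent.rollout(goal, 0) for agent, goal in zip(local_agents, goals)]


rng = random.Random(0)
local_agents = [LocalAgent() for _ in range(N_SEGMENTS)]
for agent in local_agents:
    agent.train(rng)
global_agent = GlobalAgent()
global_agent.train(local_agents, rng)
final_positions = reach(5, global_agent, local_agents)
print(final_positions, sum(final_positions))
```

The point of the split is that neither table grows with the joint state of the whole arm: each local agent only sees its own segment and goal, while the global layer only chooses among goal assignments. In the paper's setting the global table would be replaced by a deep Q-network, which this sketch does not attempt to reproduce.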

Type : Research paper

Published : Volume-6, Issue-10


Published on 2018-12-27
