Mechanical arm path planning method based on velocity smoothing deterministic policy gradient

Invention publication

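CN110328668A Mechanical arm path planning method based on velocity smoothing deterministic policy gradient (status: patent right in force)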

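Patent title: Mechanical arm path planning method based on velocity smoothing deterministic policy gradient
Patent title (Chinese original): 基于速度平滑确定性策略梯度的机械臂路径规划方法
Application number: CN201910685553.8

Filing date: 2019-07-27
Publication number: CN110328668A

Publication date: 2019-10-15
Inventors: Wu Wei (吴巍), Guo Yu (郭毓), Guo Jian (郭健), Xiao Xiao (肖潇), Cai Liang (蔡梁), Wu Yifei (吴益飞), Wu Junhao (吴钧浩), Guo Fei (郭飞), Zhang Mian (张冕)
Applicant: Nanjing University of Science and Technology (南京理工大学)
Applicant address: No. 200 Xiaolingwei, Xuanwu District, Nanjing, Jiangsu Province
Patentee: Nanjing University of Science and Technology
Current patentee: Nanjing University of Science and Technology
Current patentee address: No. 200 Xiaolingwei, Xuanwu District, Nanjing, Jiangsu Province
Agency: Nanjing University of Science and Technology Patent Center (南京理工大学专利中心)
Agent: Feng Rui (封睿)
Main classification: B25J9/16
IPC classification: B25J9/16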

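Abstract (translated from the Chinese):

The invention discloses a mechanical arm path planning method based on a velocity-smoothing deterministic policy gradient. In the training stage, a mechanical arm simulation environment with task feedback is built; the mechanical arm's action vector from the previous step is added to the input of the deterministic policy gradient network, yielding a reinforcement learning network framework based on the velocity-smoothing deterministic policy gradient; the network training parameters and the simulation environment are initialized; and samples are obtained from the velocity-smoothing deterministic policy gradient network and the simulation environment to build a training sample library. If the number of training samples reaches the maximum sample count, training samples are drawn from the library in batches of the single-training sample size and the network is trained; otherwise the method proceeds to the next step or the next simulation run. By adding the previous step's velocity vector as a network input on top of the deterministic policy gradient network, the invention effectively reduces joint acceleration and mechanical arm jitter.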
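The central idea in the abstract, feeding the previous step's action vector back into the policy network, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the network architecture, dimensions, or dynamics, so `augment_observation`, `LinearActor`, the 6-joint sizes, and the 0.05 s integration step are hypothetical stand-ins rather than the patented design.

```python
import numpy as np


def augment_observation(observation, previous_action):
    """Concatenate the observation with the previous action vector (assumed input layout)."""
    observation = np.asarray(observation, dtype=np.float64)
    previous_action = np.asarray(previous_action, dtype=np.float64)
    return np.concatenate([observation, previous_action])


class LinearActor:
    """Toy deterministic policy, action = tanh(W @ input); a stand-in for an actor network."""

    def __init__(self, input_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(scale=0.1, size=(action_dim, input_dim))

    def act(self, augmented_observation):
        return np.tanh(self.weights @ augmented_observation)


# Assumed sizes: 6 joint angles as the observation, 6 joint velocities as the action.
OBS_DIM, ACTION_DIM = 6, 6
actor = LinearActor(OBS_DIM + ACTION_DIM, ACTION_DIM)

observation = np.linspace(-0.5, 0.5, OBS_DIM)
previous_action = np.zeros(ACTION_DIM)  # no motion before the first step

for _ in range(3):
    net_input = augment_observation(observation, previous_action)
    action = actor.act(net_input)
    # Placeholder dynamics (assumption): integrate joint velocity over a 0.05 s step.
    observation = observation + 0.05 * action
    previous_action = action  # becomes part of the next step's network input
```

Because the previous action is part of the input, a trained policy can in principle learn to keep consecutive velocity commands close to each other, which is the mechanism the abstract credits for lower joint acceleration.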

Abstract (English):

The invention discloses a mechanical arm path planning method based on velocity smoothing deterministic policy gradient. The method comprises the steps that a mechanical arm simulation environment with job task feedback is established in a training stage; a previous step mechanical arm action vector is introduced into the input of a deterministic policy gradient network, and a reinforcement learning network framework based on the velocity smoothing deterministic policy gradient is established; network training parameters and the mechanical arm simulation environment are initialized; and samples are obtained based on the velocity smoothing deterministic policy gradient network and the simulation environment, a training sample database is established, if the training sample quantity reaches the maximum sample quantity, training samples are drawn from the training sample database according to the single time training sample quantity, the velocity smoothing deterministic policy gradient network is trained, and otherwise, the next step or the next simulation is performed. According to the mechanical arm path planning method provided by the invention, the previous step velocity vector is added as the network input on the basis of the deterministic policy gradient network, the joint acceleration is effectively decreased, and mechanical arm jitter is reduced.
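The sample-collection rule in the abstract (train only once the sample library holds the maximum number of samples, otherwise continue simulating) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the class name `SampleLibrary`, the sizes 100 and 16, the placeholder transitions, and the choice to evict the oldest sample once the library is full are all assumptions.

```python
import random
from collections import deque


class SampleLibrary:
    """Fixed-capacity training sample library (hypothetical helper)."""

    def __init__(self, max_samples, batch_size, seed=0):
        self.max_samples = max_samples
        self.batch_size = batch_size
        # Assumption: once full, the oldest sample is dropped when a new one arrives.
        self.samples = deque(maxlen=max_samples)
        self.rng = random.Random(seed)

    def add(self, transition):
        self.samples.append(transition)

    def ready(self):
        """True once the number of stored samples has reached the maximum."""
        return len(self.samples) >= self.max_samples

    def draw(self):
        """Draw one training batch uniformly at random, without replacement."""
        return self.rng.sample(list(self.samples), self.batch_size)


library = SampleLibrary(max_samples=100, batch_size=16)
training_updates = 0
batch = []

for step in range(150):
    # Placeholder transition; a real one would hold the state, previous action,
    # action, reward, and next state returned by the simulation environment.
    transition = (step, step + 1)
    library.add(transition)
    if library.ready():
        batch = library.draw()
        training_updates += 1  # a real implementation would update the networks here
    # Otherwise: no training yet, continue with the next step or next simulation.
```

In this run the first 99 steps only collect samples; training starts at the 100th stored sample and then happens on every later step.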

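Publication/grant documents

CN110328668B Mechanical arm path planning method based on velocity smoothing deterministic policy gradient. Publication/grant date: 2022-03-22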

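IPC classification:

B	Performing operations; transporting
B25	Hand tools; portable power-driven tools; handles for hand implements; workshop equipment; manipulators
B25J	Manipulators; chambers provided with manipulation devices (automatic devices for individually picking fruits, vegetables, hops or similar crops, see A01D46/30; needle manipulators for surgery, see A61B17/062; manipulators associated with rolling mills, see B21B39/20; manipulators associated with forging machines, see B21J13/10; means for holding wheels or parts thereof, see B60B30/00; cranes, see B66C; arrangements for handling fuel or other materials used in nuclear reactors, see G21C19/00; structural combination of manipulators with cells or rooms shielded against radiation, see G21F7/06)
B25J9/00	Programme-controlled manipulators
B25J9/16	. Programme controls (total factory control, i.e. centrally controlling a plurality of machines, see G05B19/418)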