Vision-based robotic grasping is a fundamental task in robotic control. Dexterous and precise grasp control of a robotic arm is challenging, yet it is a critical technique for manufacturing and the emerging robot service industry. Current state-of-the-art methods adopt RGB-D images or point clouds in an attempt to obtain an accurate, robust, and real-time policy. However, most of these methods either use only single-modal data or ignore the uncertainty of the sampled data, especially the depth information. Even when they leverage multi-modal data, they seldom fuse features at different scales. All of these shortcomings inevitably lead to unreliable grasp predictions. In this paper, we propose a novel multi-modal neural network to predict grasps in real time. The key idea is to fuse RGB and depth information hierarchically and to quantify the uncertainty of the raw depth data in order to re-weight the depth features. For higher grasping performance, a background extraction module and a depth re-estimation module are used to reduce the influence of the incompleteness and low quality of the raw data. We evaluate the performance on the Cornell Grasp Dataset and provide a series of extensive experiments to demonstrate the advantages of our method on a real robot. The results indicate the superiority of our proposed method, which outperforms state-of-the-art methods significantly on all metrics.