Code for our paper: Efficient Slip Detection in Robotic Grasping Using Accelerated Transformers and Multimodal Sensor Fusion
Abstract: We introduce VTF-AVIT, a Transformer-based model that fuses tactile and visual sensing to improve slip detection in robotic grasping. By adopting a novel token reorganization strategy, VTF-AVIT substantially improves computational efficiency while also increasing accuracy. The model performs well in dynamic and uncertain environments, providing the robust and reliable grasping strategies required for real-time applications in autonomous robotics. Experimental results show that VTF-AVIT outperforms conventional Transformer and CNN-LSTM baselines in both accuracy and computational efficiency. Furthermore, the fusion of tactile and visual data allows the system to adapt to varying environmental conditions and object characteristics, advancing the capabilities of robots in complex manipulation tasks.
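The tactile-visual fusion idea described above can be sketched in PyTorch. This is a minimal illustration, not the authors' VTF-AVIT implementation: the patch size, tactile feature dimension, class names, and the `VTFusionSketch` module itself are all assumptions, and the token reorganization strategy from the paper is omitted. Each modality is projected into a shared token space, the tokens are concatenated with a classification token, and a standard Transformer encoder produces a slip / no-slip prediction.

```python
import torch
import torch.nn as nn

class VTFusionSketch(nn.Module):
    """Hypothetical sketch of tactile-visual token fusion (not VTF-AVIT itself)."""

    def __init__(self, d_model=64, n_heads=4, n_layers=2, n_classes=2):
        super().__init__()
        # Per-modality linear projections embed inputs into a shared token space.
        self.visual_proj = nn.Linear(3 * 16 * 16, d_model)  # flattened 16x16 RGB patches (assumed size)
        self.tactile_proj = nn.Linear(6, d_model)           # e.g. a 6-axis force/torque reading (assumed)
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model)) # learnable [CLS] token for classification
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)           # slip / no-slip logits

    def forward(self, visual_patches, tactile_seq):
        # visual_patches: (B, Nv, 768); tactile_seq: (B, Nt, 6)
        v = self.visual_proj(visual_patches)
        t = self.tactile_proj(tactile_seq)
        # Concatenate [CLS] + visual tokens + tactile tokens into one sequence.
        tokens = torch.cat([self.cls.expand(v.size(0), -1, -1), v, t], dim=1)
        out = self.encoder(tokens)
        return self.head(out[:, 0])  # classify from the [CLS] token

model = VTFusionSketch()
logits = model(torch.randn(2, 8, 768), torch.randn(2, 10, 6))
print(logits.shape)  # torch.Size([2, 2])
```

Attention across the concatenated sequence lets visual and tactile tokens condition on each other, which is one common way to realize the multimodal fusion the abstract describes.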
Requirements:
- python3 >= 3.8.10
- numpy >= 1.21.1
- pytorch (torch) >= 1.9.0+cpu (CPU training & testing) or 1.9.0+cu102 (CUDA training & testing)
- opencv >= 4.5.3
- yaml (PyYAML) >= 5.4.1
- json >= 2.0.9 (Python standard library module; no separate install needed)
- matplotlib >= 3.4.2
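The list above can be captured in a `requirements.txt`; the PyPI package names here are assumptions (e.g. `yaml` is published as `PyYAML`, `opencv` as `opencv-python`, and `json` ships with Python, so it is omitted):

```
numpy>=1.21.1
torch>=1.9.0
opencv-python>=4.5.3
PyYAML>=5.4.1
matplotlib>=3.4.2
```

Install with `pip install -r requirements.txt`; for a CUDA build of torch, follow the install selector on pytorch.org instead.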
Train & Test:
- We use the slip detection dataset from this paper: https://arxiv.org/abs/1802.10153. Please download the data yourself.
- Train: python main.py
- Test: python test.py