A comprehensive list of Robot Manipulation papers, with code and related websites.

BaiShuanghao/Awesome-Robotics-Manipulation

Awesome-Robotics-Manipulation

✨ About

This repo contains a curated list of papers on robot manipulation.

Please feel free to send pull requests or email me to add papers! This version of the repository may have some typos, so don’t hesitate to contact me for corrections!

📝 Awesome Papers

📄 Survey

Title Venue Date Code Notes
A Survey of Embodied Learning for Object-Centric Robotic Manipulation arXiv 2024-08-21 Star Github Manipulation
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI arXiv 2024-07-09 Star Github Embodied Agent
A Survey on Vision-Language-Action Models for Embodied AI arXiv 2024-05-23 - VLA Models
Survey of Learning-based Approaches for Robotic In-Hand Manipulation arXiv 2024-01-15 - In-hand Manipulation
Language-conditioned Learning for Robotic Manipulation: A Survey arXiv 2023-12-17 Star Github Manipulation
Deep Learning Approaches to Grasp Synthesis: A Review T-RO 2023 2023-07-06 Project Grasp

(back to top)

🦾 Grasp

Rectangle-based Grasp

Title Venue Date Code
HMT-Grasp: A Hybrid Mamba-Transformer Approach for Robot Grasping in Cluttered Environments arXiv 2024-10-04 -
LLGD: Lightweight Language-driven Grasp Detection using Conditional Consistency Model IROS 2024 2024-07-25 Star Github
grasp_det_seg_cnn: End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB ICRA 2021 2021-07-12 Star Github
GR-ConvNet: Antipodal Robotic Grasping using Generative Residual Convolutional Neural Network IROS 2020 2019-09-11 Star Github

(back to top)

6-DoF Grasp

Title Venue Date Code
Real-to-Sim Grasp: Rethinking the Gap between Simulation and Real World in Grasp Detection CoRL 2024 2024-10-09 Project
OrbitGrasp: SE(3)-Equivariant Grasp Learning CoRL 2024 2024-07-03 Project
EquiGraspFlow: SE(3)-Equivariant 6-DoF Grasp Pose Generative Flows CoRL 2024 - Star Github
EconomicGrasp: An Economic Framework for 6-DoF Grasp Detection ECCV 2024 2024-07-11 Star Github
Generalizing 6-DoF Grasp Detection via Domain Prior Knowledge CVPR 2024 2024-04-02 Star Github
FlexLoG: Rethinking 6-Dof Grasp Detection: A Flexible Framework for High-Quality Grasping arXiv 2024-03-22 -
HGGD: Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes RA-L 2023 2024-03-27 Star Github
AnyGrasp: Robust and Efficient Grasp Perception in Spatial and Temporal Domains T-RO 2023 2022-12-16 Github
GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping CVPR 2020 2020 Star Github
6-DOF GraspNet: Variational Grasp Generation for Object Manipulation ICCV 2019 2019-05-25 -

(back to top)

Grasp with 3D Techniques

Title Venue Date Code
SDF
IGD: Implicit Grasp Diffusion: Bridging the Gap between Dense Prediction and Sampling-based Grasping CoRL 2024 - Gitlab
NeuGraspNet: Learning Any-View 6DoF Robotic Grasping in Cluttered Scenes via Neural Surface Rendering RSS 2024 2023-06-12 -
NeRF
LERF-TOGO: Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping CoRL 2023 2023-09-14 Star Github
GraspNeRF: Multiview-based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF ICRA 2023 2022-10-12 Star Github
3D Gaussian Splatting (3DGS)
SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images arXiv 2024-12-03 -
GraspSplats: Efficient Manipulation with 3D Feature Splatting CoRL 2024 2024-09-03 Star Github
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping RA-L 2024 2024-03-14 Star Github

(back to top)

Language-Driven Grasp

Title Venue Date Code
RTAGrasp: Learning Task-Oriented Grasping from Human Videos via Retrieval, Transfer, and Alignment arXiv 2024-09-24 Project
LGrasp6D: Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance ECCV 2024 2024-07-18 Star Github
Reasoning Grasping: Reasoning Grasping via Multimodal Large Language Model CoRL 2024 2024-02-09 Project
ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter CoRL 2024 2024-07-16 Star Github
OWG: Towards Open-World Grasping with Large Vision-Language Models CoRL 2024 2024-06-26 Project
RT-Grasp: Reasoning Tuning Robotic Grasping via Multi-modal Large Language Model IROS 2024 2024-11-07 Project

(back to top)

Grasp for Transparent Objects

Title Venue Date Code
T2SQNet: A Recognition Model for Manipulating Partially Observed Transparent Tableware Objects CoRL 2024 - Star Github
ASGrasp: Generalizable Transparent Object Reconstruction and Grasping from RGB-D Active Stereo Camera ICRA 2024 2024-05-09 Star Github
Dex-NeRF: Using a Neural Radiance Field to Grasp Transparent Objects CoRL 2021 2021-10-27 -

(back to top)

Dexterous Grasp

Title Venue Date Code
Grasp What You Want: Embodied Dexterous Grasping System Driven by Your Voice arXiv 2024-12-14 Project
UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping arXiv 2024-12-03 Star Github

(back to top)

🤖 Manipulation

Representation Learning with Auxiliary Tasks

Title Venue Date Code
Contrastive Learning (Alignment)
Σ-agent: Contrastive Imitation Learning for Language-guided Multi-Task Robotic Manipulation CoRL 2024 2024-06-14 Project
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers RSS 2024 2024-03-19 Project
R3M: A Universal Visual Representation for Robot Manipulation CoRL 2022 2022-03-23 Star Github
HULC: What Matters in Language Conditioned Robotic Imitation Learning over Unstructured Data RA-L 2022 2022-04-13 Star Github
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning CoRL 2021 2022-02-04 Star Github
Masked Reconstruction
STP: Spatiotemporal Predictive Pre-training for Robotic Motor Control arXiv 2024-03-08 -
MUTEX: Learning Unified Policies from Multimodal Task Specifications CoRL 2023 2023-09-25 Star Github
Robot Learning with Sensorimotor Pre-training CoRL 2023 2023-06-16 Project
Voltron: Language-Driven Representation Learning for Robotics RSS 2023 2023-02-24 Star Github
MVP: Real-World Robot Learning with Masked Visual Pre-training CoRL 2022 2022-10-06 Star Github
Text Goal Generation
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning arXiv 2024-09-23 Star Github
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought NeurIPS 2023 2023-05-24 Star Github
COTPC: Chain-of-Thought Predictive Control ICML 2024 2023-04-03 Star Github
Visual Goal Generation
VIRT: Vision Instructed Transformer for Robotic Manipulation arXiv 2024-10-09 Star Github
KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance CoRL 2024 2024-08-06 Star Github
GENIMA: Generative Image as Action Models CoRL 2024 2024-07-10 Star Github
ATM: Any-point Trajectory Modeling for Policy Learning RSS 2024 2023-12-28 Star Github
MPI: Learning Manipulation by Predicting Interaction RSS 2024 2024-06-01 Star Github
OCI: Object-Centric Instruction Augmentation for Robotic Manipulation ICRA 2024 2024-01-05 Project
HOPMan: Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans ICRA 2024 2023-12-01 Project
CALAMARI: Contact-Aware and Language conditioned spatial Action MApping for contact-RIch manipulation CoRL 2023 2023 Project
Image / Video Prediction
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation arXiv 2024-12-19 Star Github
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations arXiv 2024-12-19 Project
GHIL-Glue: Hierarchical Control with Filtered Subgoal Images arXiv 2024-10-26 Project
FoAM: Foresight-Augmented Multi-Task Imitation Policy for Robotic Manipulation arXiv 2024-09-29 Project
VideoAgent: Self-Improving Video Generation arXiv 2024-10-14 Star Github
GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy arXiv 2024-08-26 Star Github
GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation arXiv 2024-10-08 Project
VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation RSS 2024 2024-07-13 Star Github
GR-1: Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation ICLR 2024 2023-12-20 Star Github
SuSIE: Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models ICLR 2024 2023-10-16 Star Github
VLP: Video Language Planning ICLR 2024 2023-10-16 Github

(back to top)

Visual Representation Learning

Title Venue Date Code
MCR: Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets arXiv 2024-10-29 Star Github
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation arXiv 2024-10-10 Star Github
CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation NeurIPS 2024 2024-09-13 Star Github
Theia: Distilling Diverse Vision Foundation Models for Robot Learning CoRL 2024 2024-07-29 Star Github
MPI: Learning Manipulation by Predicting Interaction RSS 2024 2024-06-01 Star Github
VC-1: Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? NeurIPS 2023 2023-03-31 Star Github
MVP: Real-World Robot Learning with Masked Visual Pre-training CoRL 2022 2022-10-06 Star Github
LIV: Language-Image Representations and Rewards for Robotic Control ICML 2023 2023-06-01 Star Github
VIMA: General Robot Manipulation with Multimodal Prompts ICML 2023 2022-10-06 Star Github
ACT: Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware RSS 2023 2023-04-23 Star Github
Voltron: Language-Driven Representation Learning for Robotics RSS 2023 2023-02-24 Star Github
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training ICLR 2023 2022-08-30 Star Github
R3M: A Universal Visual Representation for Robot Manipulation CoRL 2022 2022-03-23 Star Github
ZeST: Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation? L4DC 2022 2022-04-23 Project

(back to top)

Multimodal Representation Learning

Title Venue Date Code
MS-Bot: Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation arXiv 2024-08-02 Star Github
MUTEX: Learning Unified Policies from Multimodal Task Specifications CoRL 2023 2023-09-25 Star Github

(back to top)

Latent Action Learning

Title Venue Date Code
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation arXiv 2024-12-05 Star Github
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation arXiv 2024-09-27 Project
IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI - 2024 Project
LAPA: Latent Action Pretraining from Videos arXiv 2024-10-15 Star Github
GRIF: Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control CoRL 2023 2023-06-30 Star Github
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play CoRL 2023 2023-02-24 Star Github
KOAP: Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers arXiv 2024-10-24 -
LAPO: Learning to Act without Actions ICLR 2024 2023-12-17 Star Github
ILPO: Imitating Latent Policies from Observation ICML 2019 2018-05-21 Star Github

(back to top)

World Model

Title Venue Date Code
Sirius-Fleet: Multi-Task Interactive Robot Fleet Learning with Visual World Models CoRL 2024 2024-10-30 Project
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning CoRL 2023 2024-01-06 Project
FOWM: Finetuning Offline World Models in the Real World CoRL 2023 2023-10-24 Star Github
SWIM: Structured World Models from Human Videos RSS 2023 2023-08-23 Project
Surfer: Progressive Reasoning with World Models for Robotic Manipulation arXiv 2023-06-20 Star Github

(back to top)

Asynchronous Action Learning

Title Venue Date Code
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation NeurIPS 2024 2024-10-14 Project
HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers arXiv 2024-09-12 -
MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models CoRL 2023 2024-01-25 Star Github

(back to top)

Diffusion Policy Learning

Title Venue Date Code
AffordDP: Generalizable Diffusion Policy with Transferable Affordance arXiv 2024-12-04 Project
Instant Policy: In-Context Imitation Learning via Graph Diffusion arXiv 2024-11-19 Star Github
STMDP: Brain-inspired Action Generation with Spiking Transformer Diffusion Policy Model arXiv 2024-11-15 -
MBA: Motion Before Action: Diffusing Object Motion as Manipulation Condition arXiv 2024-11-14 Star Github
DiT Policy: Diffusion Transformer Policy arXiv 2024-10-21 -
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation arXiv 2024-10-19 Project
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation arXiv 2024-10-10 Star Github
ScaleDP: Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation arXiv 2024-09-22 Project
SDP: Spiking Diffusion Policy for Robotic Manipulation with Learnable Channel-Wise Membrane Thresholds arXiv 2024-09-17 -
DiT-Block Policy: The Ingredients for Robotic Diffusion Transformers arXiv 2024-10-14 Star Github
GenDP: 3D Semantic Fields for Category-Level Generalizable Diffusion Policy CoRL 2024 2024-10-23 Star Github
EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning CoRL 2024 2024-07-01 Star Github
SDP: Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning CoRL 2024 2024-07-01 Star Github
RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective IROS 2024 2024-04-18 Star Project
MDT: Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals RSS 2024 2024-07-08 Star Github
R&D: Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning RSS 2024 2024-05-28 Star Github
DP3: 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations RSS 2024 2024-03-06 Star Github
PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play CoRL 2023 2023-12-07 Project
EquiDiff: Equivariant Diffusion Policy CoRL 2024 2024-07-01 Star Code
StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects RSS 2023 2022-11-08 Star Github
BESO: Goal-Conditioned Imitation Learning using Score-based Diffusion Policies RSS 2023 2023-04-05 Star Github
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion RSS 2023 2023-03-07 Star Github

(back to top)

Other Policies

Title Venue Date Code
Score and Distribution Matching Policy: Advanced Accelerated Visuomotor Policies via Matched Distillation arXiv 2024-12-12 Project
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction arXiv 2024-12-09 Star Github
FlowPolicy: Enabling Fast and Robust 3D Flow-based Policy via Consistency Flow Matching for Robot Manipulation arXiv 2024-12-06 Star Github
Autoregressive Action Sequence Learning for Robotic Manipulation arXiv 2024-10-04 Star Github
MaIL: Improving Imitation Learning with Selective State Space Models CoRL 2024 2024-06-12 -

(back to top)

Vision Language Action Models

Title Venue Date Code
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation arXiv 2024-12-29 Star Github
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models arXiv 2024-12-18 Star Github
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning arXiv 2024-12-16 Star Github
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies arXiv 2024-12-13 -
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression arXiv 2024-12-14 Project
π0: A Vision-Language-Action Flow Model for General Robot Control arXiv 2024-10-31 Project
BYOVLA: Run-time Observation Interventions Make Vision-Language-Action Models More Visually Robust arXiv 2024-10-02 Star Github
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation arXiv 2024-09-19 Star Github
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution NeurIPS 2024 2024-11-04 Github
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation NeurIPS 2024 2024-06-06 Star Github
DP-VLA: A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM CoRL 2024 2024-10-21 -
OpenVLA: An Open-Source Vision-Language-Action Model CoRL 2024 2024-06-13 Star Github
LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning CoRL 2024 2024-06-17 Star Github
ECoT: Robotic Control via Embodied Chain-of-Thought Reasoning CoRL 2024 2024-07-11 Star Github
3D-VLA: A 3D Vision-Language-Action Generative World Model ICML 2024 2024-03-14 Star Github
Octo: An Open-Source Generalist Robot Policy RSS 2024 2024-05-20 Star Github
RoboFlamingo: Vision-Language Foundation Models as Effective Robot Imitators ICLR 2024 2023-11-02 Star Github
RT-H: Action Hierarchies Using Language arXiv 2024-03-04 Project
Open X-Embodiment: Robotic Learning Datasets and RT-X Models ICRA 2024 2023-10-13 Star Github
MOO: Open-World Object Manipulation using Pre-trained Vision-Language Models CoRL 2023 2023-03-02 Project
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control CoRL 2023 2023-07-28 Project
RT-1: Robotics Transformer for Real-World Control at Scale RSS 2023 2022-12-13 Star Github

(back to top)

Reinforcement Learning

Title Venue Date Code
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model arXiv 2024-12-18 Star Github
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning arXiv 2024-12-13 Project
HIL-SERL: Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning arXiv 2024-10-29 Project
PointPatchRL - Masked Reconstruction Improves Reinforcement Learning on Point Clouds CoRL 2024 2024-10-24 Project
SPIRE: Synergistic Planning, Imitation, and Reinforcement for Long-Horizon Manipulation CoRL 2024 2024-10-23 Project
Maniwhere: Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning CoRL 2024 2024-07-22 Project
PSL: Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks ICLR 2024 2024-05-02 Star Github
TD-MPC2: Scalable, Robust World Models for Continuous Control ICLR 2024 2023-10-25 Star Github
VELAP: Expansive Latent Planning for Sparse Reward Offline Reinforcement Learning CoRL 2023 2023 -
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions CoRL 2023 2023-09-18 Project
PTR: Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials RSS 2023 2022-10-11 Project
TD-MPC: Temporal Difference Learning for Model Predictive Control ICML 2022 2022-03-09 Star Github

(back to top)

Motion, Trajectory and Flow

Title Venue Date Code
Path Planning
LACO: Language-Conditioned Path Planning CoRL 2023 2024-08-31 Star Github
Motion Planning
DiffusionSeeder: Seeding Motion Optimization with Diffusion for Rapid Motion Planning CoRL 2024 2024-10-22 Project
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation CoRL 2024 2024-09-03 Star Github
CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models ICRAW 2024 2024-03-13 Star Github
Elastic-DS: Task Generalization with Stability Guarantees via Elastic Dynamical System Motion Policies CoRL 2023 2023-09-05 Star Github
Trajectory Optimization
ORION: Vision-based Manipulation from Single Human Video with Open-World Object Graphs arXiv 2024-05-30 Project
PointFlowMatch: Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching CoRL 2024 2024-09-11 Project
RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation ICRA 2024 2023-08-30 Star Github
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models CoRL 2023 2023-07-12 Star Github
LATTE: LAnguage Trajectory TransformEr ICRA 2023 2022-08-04 Star Github
Trajectory-conditioned Policy
P3-PO: Prescriptive Point Priors for Visuo-Spatial Generalization of Robot Policies arXiv 2024-12-09 Star Github
Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation ECCV 2024 2024-05-02 Star Github
ATM: Any-point Trajectory Modeling for Policy Learning RSS 2024 2023-12-28 Star Github
AWE: Waypoint-Based Imitation Learning for Robotic Manipulation CoRL 2023 2023-07-26 Star Github
Flow-conditioned Policy
Im2Flow2Act: Flow as the Cross-Domain Manipulation Interface CoRL 2024 2024-07-21 Star Github
AVDC: Learning to Act from Actionless Videos through Dense Correspondences ICLR 2024 2023-10-12 Star Github

(back to top)

Data Collection, Selection and Augmentation

Title Venue Date Code
Data Collection
ALPHA-α and Bi-ACT Are All You Need: Importance of Position and Force Information/Control for Imitation Learning of Unimanual and Bimanual Robotic Manipulation with Low-Cost System arXiv 2024-11-15 Project
SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment CoRL 2024 2024-10-24 Project
NILS: Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models CoRL 2024 2024-10-23 Project
SOAR: Autonomous Improvement of Instruction Following Skills via Foundation Models CoRL 2024 2024-07-30 Star Github
Manipulate-Anything: Automating Real-World Robots using Vision-Language Models CoRL 2024 2024-06-27 Project
DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation CoRL 2024 2024-03-12 Star Github
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots RSS 2024 2024-02-15 Star Github
AirExo: Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild ICRA 2024 2023-09-26 Star Github
SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling ICRA 2024 2023-06-20 Star Github
Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition CoRL 2023 2023-07-26 Star Github
DIAL: Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models RSS 2023 2022-11-21 Project
RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation TMLR 2023 2023-06-20 -
Data Selection
AMF: Active Fine-Tuning of Generalist Policies arXiv 2024-10-07 -
Re-Mix: Optimizing Data Mixtures for Large Scale Imitation Learning CoRL 2024 2024-08-26 Star Github
An Unbiased Look at Datasets for Visuo-Motor Pre-Training CoRL 2023 2023-10-13 Star Github
Data Quality in Imitation Learning NeurIPS 2023 2023-06-04 -
Data Retrieval
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning arXiv 2024-12-19 Project
Retrieval-Augmented Embodied Agents CVPR 2024 2024-04-17 -
Behavior Retrieval: Few-Shot Imitation Learning by Querying Unlabeled Datasets RSS 2023 2023-04-08 Star Github
Data Augmentation
RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations arXiv 2024-11-25 Project
RoVi-Aug: Robot and Viewpoint Augmentation for Cross-Embodiment Robot Learning CoRL 2024 2024-09-05 Project
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning CoLLAs 2024 2024-07-30 Project
Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning RSS 2024 2023-02-27 Star Github
ROSIE: Scaling Robot Learning with Semantically Imagined Experience RSS 2023 2023-02-22 Project
GenAug: Retargeting behaviors to unseen situations via Generative Augmentation RSS 2023 2023-02-13 Star Github
Evaluation
Contrast Sets for Evaluating Language-Guided Robot Policies CoRL 2024 2024-06-19 -

(back to top)

Affordance Learning

Title Venue Date Code
Articulated Object Affordance
ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation? arXiv 2024-12-13 -
UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models arXiv 2024-09-16 Project
A3VLM: Actionable Articulation-Aware Vision Language Model CoRL 2024 2024-06-14 Star Github
AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation CoRL 2024 2024-06-17 Project
SAGE: Bridging Semantic and Actionable Parts for Generalizable Manipulation of Articulated Objects RSS 2024 2023-12-03 Star Github
Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs ICRA 2024 2023-11-06 Star Github
Ditto: Building Digital Twins of Articulated Objects from Interaction CVPR 2022 2022-08-16 Star Github
Part-Based Object Affordance
3DAPNet: Language-Conditioned Affordance-Pose Detection in 3D Point Clouds ICRA 2024 2023-09-19 Star Github
CPM: Composable Part-Based Manipulation CoRL 2023 2024-05-09 Project
PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations CVPR 2023 2023-03-29 Star Github
GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts CVPR 2023 2022-11-10 Star Github
Spatial Affordance
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics arXiv 2024-11-25 -
SpatialBot: Precise Spatial Understanding with Vision Language Models arXiv 2024-06-19 Star Github
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics CoRL 2024 2024-06-15 Star Github
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities CVPR 2024 2024-01-22 Project
Visual Affordance
RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation CoRL 2024 2024-07-05 Star Github
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting RSS 2024 2024-03-05 Star Github
SLAP: Spatial-Language Attention Policies CoRL 2023 2023-04-21 Star Github
KITE: Keypoint-Conditioned Policies for Semantic Manipulation CoRL 2023 2023-06-29 Project
HULC++: Grounding Language with Visual Affordances over Unstructured Data ICRA 2023 2022-10-04 Star Github
CLIPort: What and Where Pathways for Robotic Manipulation CoRL 2022 2021-09-24 Star Github
VAPO: Affordance Learning from Play for Sample-Efficient Policy Learning ICRA 2022 2022-03-01 Project
Transporter Networks: Rearranging the Visual World for Robotic Manipulation CoRL 2020 2020-10-27 Star Github

(back to top)

3D Representation for Manipulation

Title Venue Date Code
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation arXiv 2024-11-27 Star Github
MSGField: A Unified Scene Representation Integrating Motion, Semantics, and Geometry for Robotic Manipulation arXiv 2024-10-21 Star Github
Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting arXiv 2024-05-07 Star Github
IMAGINATION POLICY: Using Generative Point Cloud Models for Learning Manipulation Policies CoRL 2024 2024-06-17 Project
Physically Embodied Gaussian Splatting: A Realtime Correctable World Model for Robotics CoRL 2024 2024-06-16 Project
RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation CoRL 2024 2024-03-28 Star Github
RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation CoRL 2024 2024-02-23 Star Github
D3Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement CoRL 2024 2023-09-28 Star Github
Object-Aware Gaussian Splatting for Robotic Manipulation ICRAW 2024 - Project
F3RM: Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation CoRL 2023 2023-07-27 Star Github
R-NDF: SE(3)-Equivariant Relational Rearrangement with Neural Descriptor Fields CoRL 2022 2022-11-17 Star Github
NDF: Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation ICRA 2022 2021-12-09 Star Github

(back to top)

3D Representation Policy Learning

Title Venue Date Code
Diffusion Policy (DP)
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation arXiv 2024-09-30 Project
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations CoRL 2024 2024-02-16 Star Github
DP3: 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations RSS 2024 2024-03-06 Star Github
Reconstruction
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation arXiv 2024-11-27 Star Github
ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation ECCV 2024 2024-03-13 Star Github
SGRv2: Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation CoRL 2024 2024-06-15 Star Github
RVT-2: Learning Precise Manipulation from Few Demonstrations RSS 2024 2024-01-12 Star Github
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields CoRL 2023 2023-08-31 Star Github
3D4RL: Visual Reinforcement Learning with Self-Supervised 3D Representations RA-L 2023 2022-10-13 Star Github
PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation CoRL 2023 2023-09-27 Star Github
M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place CoRL 2023 2023-11-02 Star Github
PerAct: Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation CoRL 2022 2022-09-12 Star Github
Visual Goal Generation
3D-MVP: 3D Multiview Pretraining for Robotic Manipulation CoRL 2024 2024-06-26 Project
ActAIM2: Discovering Robotic Interaction Modes with Discrete Representation Learning CoRL 2024 2024-10-26 Project
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation ICML 2024 2024-05-30 Star Github
RVT: Robotic View Transformer for 3D Object Manipulation CoRL 2023 2023-06-26 Star Github
GROOT: Learning Generalizable Manipulation Policies with Object-Centric 3D Representations CoRL 2023 2023-10-22 Star Github
Others
SPHINX: What's the Move? Hybrid Imitation Learning via Salient Points arXiv 2024-12-06 Star Github
SGR: A Universal Semantic-Geometric Representation for Robotic Manipulation CoRL 2023 2023-06-18 Star Github

(back to top)

Reasoning, Planning and Code Generation

Title Venue Date Code
Task Planning
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation arXiv 2024-11-26 Project
Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following arXiv 2024-04-21 -
Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models IROS 2024 2024-08-15 Project
PG-InstructBLIP: Physically Grounded Vision-Language Models for Robotic Manipulation ICRA 2024 2023-09-05 Project
RoCo: Dialectic Multi-Robot Collaboration with Large Language Models ICRA 2024 2023-07-10 Star Github
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction CoRL 2023 2023-06-27 Star Github
SayCan: Do As I Can, Not As I Say: Grounding Language in Robotic Affordances CoRL 2022 2022-04-04 Star Github
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency arXiv 2023-04-22 Star Github
Inner Monologue: Embodied Reasoning through Planning with Language Models CoRL 2022 2022-07-12 Project
SHOWTELL: Teaching Robots with Show and Tell: Using Foundation Models to Synthesize Robot Policies from Language and Visual Demonstrations CoRL 2024 - Project
GIRAF: Gesture-Informed Robot Assistance via Foundation Models CoRL 2023 2023-09-06 Project
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ICCV 2023 2022-12-08 Star Github
Code Generation
Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought NeurIPS 2023 2023-05-26 Project
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model arXiv 2023-05-18 Star Github
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models ICRA 2023 2022-09-22 Star Github
ChatGPT for Robotics: Design Principles and Model Abilities IEEE Access 2023 2023-02-20 Star Github
Code as Policies: Language Model Programs for Embodied Control ICRA 2023 2022-09-16 Star Github
TidyBot: Personalized Robot Assistance with Large Language Models Autonomous Robots 2023 2023-05-09 Star Github
Statler: State-Maintaining Language Models for Embodied Reasoning ICRA 2024 2023-06-30 Star Github
InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning RSS 2024 2023-05-30 Star Github
Text2Motion: From Natural Language Instructions to Feasible Plans Autonomous Robots 2023 2023-03-21 Project
Multimodal Reasoning
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection arXiv 2024-12-05 Project
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation arXiv 2024-10-01 Project
λ-Repformer: Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations CoRL 2024 2024-10-01 Project
ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation CVPR 2024 2023-12-24 Star Github
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought NeurIPS 2023 2023-05-24 Star Github
Matcha: Chat with the Environment: Interactive Multimodal Perception Using Large Language Models IROS 2023 2023-03-14 Star Github
PaLM-E: An Embodied Multimodal Language Model ICML 2023 2023-03-06 Star Github
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language ICLR 2023 2022-04-01 Project

(back to top)

Generalization

Title Venue Date Code
Generalization using Data
Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting RSS 2024 2024-02-29 Star Github
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation ICRA 2024 2024-02-29 Star Github
Compositional Generalization
Policy Architectures for Compositional Generalization in Control NeurIPSW 2022 2022-03-10 Star Github
PROGRAMPORT: Programmatically Grounded, Compositionally Generalizable Robotic Manipulation ICLR 2023 2023-04-26 Project
Efficient Data Collection for Robotic Manipulation via Compositional Generalization RSS 2024 2024-03-08 Project
Sim2Real Generalization
Natural Language Can Help Bridge the Sim2Real Gap RSS 2024 2024-05-16 Star Github
RialTo: Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation RSS 2024 2024-03-06 Star Github
Domain Randomization: Sim-to-Real Transfer of Robotic Control with Dynamics Randomization ICRA 2018 2017-10-18 -
Generalization for Long-horizon and Complex Tasks
ManipGen: Local Policies Enable Zero-shot Long-horizon Manipulation arXiv 2024-10-29 Project
TBBF: A Backbone for Long-Horizon Robot Task Understanding arXiv 2024-08-02 Project
STAP: Sequencing Task-Agnostic Policies ICRA 2023 2022-10-21 Star Github
BOSS: Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance CoRL 2023 2023-12-16 Star Github
BLADE: Learning Compositional Behaviors from Demonstration and Language CoRL 2024 2024 -
PALO: Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation CoRL 2024 2024-08-29 Star Github
Few-shot
Learning Generalizable 3D Manipulation With 10 Demonstrations arXiv 2024-11-15 Star Github

(back to top)

Generalist

Title Venue Date Code
Generalist with Different Embodiment Types
CrossFormer: Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation CoRL 2024 2024-08-21 Star Github
ARIO: All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents arXiv 2024-08-20 Project
HPT: Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers NeurIPS 2024 2024-09-30 Star Github
Generalist in Different Embodied Tasks
LEO: An Embodied Generalist Agent in 3D World ICML 2024 2023-11-18 Star Github
Manipulation Generalist
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning arXiv 2024-12-13 Project
RoboMM: All-in-One Multimodal Large Model for Robotic Manipulation arXiv 2024-12-10 Star Github
RoboDual: Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation arXiv 2024-10-10 Project
Effective Tuning Strategies for Generalist Robot Manipulation Policies arXiv 2024-10-02 -
Octo: An Open-Source Generalist Robot Policy RSS 2024 2024-05-20 Star Github
V-GPS: Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance CoRL 2024 2024-10-17 Project
Open X-Embodiment: Robotic Learning Datasets and RT-X Models ICRA 2024 2023-10-13 Star Github
RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking ICRA 2024 2023-09-05 Star Github
Maniwhere: Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning CoRL 2024 2024-07-22 Project
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation arXiv 2024-10-19 Project
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments arXiv 2024-09-09 Github
More for VLAs

(back to top)

Human-Robot Interaction and Collaboration

Title Venue Date Code
Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment arXiv 2024-12-06 Project
Vocal Sandbox: Continual Learning and Adaptation for Situated Human-Robot Collaboration CoRL 2024 - Project
APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs CoRL 2024 - Project
Text2Interaction: Establishing Safe and Preferable Human-Robot Interaction CoRL 2024 2024-08-12 Star Github
KNOWNO: Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners CoRL 2023 2023-07-04 Github
YAY Robot: Yell At Your Robot: Improving On-the-Fly from Language Corrections arXiv 2024-03-19 Star Github
LILAC: "No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy HRI 2023 2023-01-06 Star Github

(back to top)

Mobile Manipulation

Title Venue Date Code
Robi Butler: Remote Multimodal Interactions with Household Robot Assistant arXiv 2024-09-30 Project
TaMMa: Target-driven Multi-subscene Mobile Manipulation CoRL 2024 - -
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning CoRL 2023 2023-07-12 Project
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation CoRL 2024 2024-01-04 Star Github
GAMMA: Graspability-Aware Mobile MAnipulation Policy Learning based on Online Grasping Pose Fusion ICRA 2024 2023-09-27 Star Github

(back to top)

Tactile-based Manipulation

Title Venue Date Code
Digitizing Touch with an Artificial Multimodal Fingertip - 2024-10-31 Star Github
Sparsh: Self-supervised touch representations for vision-based tactile sensing CoRL 2024 2024 Star Github
MimicTouch: Leveraging Multi-modal Human Tactile Demonstrations for Contact-rich Manipulation CoRL 2024 2023-10-25 Project
Octopi: Object Property Reasoning with Large Tactile-Language Models RSS 2024 2024-05-05 Star Github
RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing RSS 2024 2024-07-01 Project
RotateIt: General In-Hand Object Rotation with Vision and Touch CoRL 2023 2023-09-18 Project
T-DEX: Dexterity from Touch: Self-Supervised Pre-Training of Tactile Representations with Robotic Play CoRL 2023 2023-03-21 Star Github

(back to top)

Dexterous Manipulation

Title Venue Date Code
D(R, O) Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping arXiv 2024-10-02 Star Github
Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning arXiv 2024-07-03 Star Github
DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes CoRL 2024 2024 -
DexGraspNet: A Large-Scale Robotic Dexterous Grasp Dataset for General Objects Based on Simulation ICRA 2023 2022-10-06 Star Github
Demonstrating Learning from Humans on Open-Source Dexterous Robot Hands RSS 2024 2024 -
CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation CVPR 2024 2024-02-22 Star Github
Dexterous Functional Grasping CoRL 2023 2023-12-05 Project
DEFT: Dexterous Fine-Tuning for Real-World Hand Policies CoRL 2023 2023-10-30 Project
REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation CoRL 2023 2023-09-06 Project
Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation CoRL 2023 2023-09-02 Star Github
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System RSS 2023 2023-07-10 Project

(back to top)

Other Applications

Title Venue Date Code
Deformable Object Manipulation
HANDLOOM: Learned Tracing of One-Dimensional Objects for Inspection and Manipulation CoRL 2023 2023-03-15 Project
Contact-rich Manipulation
FoAR: Force-Aware Reactive Policy for Contact-Rich Robotic Manipulation arXiv 2024-11-24 Project
ForceMimic: Force-Centric Imitation Learning with Force-Motion Capture System for Contact-Rich Manipulation arXiv 2024-10-10 Project
Stowing Tasks
Predicting Object Interactions with Behavior Primitives: An Application in Stowing Tasks CoRL 2023 2023-09-28 Star Github
Object Rearrangement
PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement WACV 2025 2024-10-29 -
LGMCTS: Language-Guided Monte-Carlo Tree Search for Executable Semantic Object Rearrangement IROS 2024 2023-09-27 Star Github
LLM-GROP: Task and Motion Planning with Large Language Models for Object Rearrangement IROS 2023 2023-03-10 Colab
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics RA-L 2023 2022-10-05 Project
Human-to-Robot Handover
GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation CVPR 2024 2024-01-01 Star Github
Cook
RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools CoRL 2023 2023-06-26 Star Github
Non-prehensile Manipulation
HACMan: Learning Hybrid Actor-Critic Maps for 6D Non-Prehensile Manipulation CoRL 2023 2023-05-06 Star Github
Feed
VAPORS: Learning Sequential Acquisition Policies for Robot-Assisted Feeding CoRL 2023 2023-09-11 Project
Tool Manipulation
Leveraging Language for Accelerated Learning of Tool Manipulation CoRL 2023 2022-06-27 Star Github
Responsible Manipulation
Don't Let Your Robot be Harmful: Responsible Robotic Manipulation arXiv 2024-11-27 Star Github
TrojanRobot: Backdoor Attacks Against LLM-based Embodied Robots in the Physical World arXiv 2024-11-18 -

(back to top)

📊 Awesome Benchmarks

Grasp Datasets

Title Venue Date Code
QDGset: A Large Scale Grasping Dataset Generated with Quality-Diversity arXiv 2024-10-03 -
Real-to-Sim Grasp: Rethinking the Gap between Simulation and Real World in Grasp Detection CoRL 2024 2024-10-09 Project
Grasp-Anything-6D: Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance ECCV 2024 2024-07-18 Star Github
Grasp-Anything++: Language-driven Grasp Detection CVPR 2024 2024-06-13 Star Github
Grasp-Anything: Large-scale Grasp Dataset from Foundation Models ICRA 2024 2023-09-18 Star Github
GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping CVPR 2020 2020 Star Github

(back to top)

Manipulation Benchmarks

Title Venue Date Code
Manipulation in Home Environment
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots RSS 2024 2024-06-04 Star Github
ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes ICCV 2023 2023-04-09 Star Github
HomeRobot: Open-Vocabulary Mobile Manipulation CoRL 2023 2023-06-20 Star Github
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks CVPR 2020 2019-12-03 Star Github
Manipulation in On-Table Environment
Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy arXiv 2024-10-02 Star Github
OBSBench: Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning NeurIPS 2024 2024-02-04 Star Github
GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs CoRL 2024 2024-10-04 Star Github
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation RSS 2024 2024-02-13 Star Github
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning NeurIPS 2023 2023-06-05 Star Github
VIMA: General Robot Manipulation with Multimodal Prompts ICML 2023 2022-10-06 Star Github
CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks RA-L 2021 2021-12-06 Star Github
RLBench: The Robot Learning Benchmark & Learning Environment RA-L 2020 2019-09-26 Star Github
KitchenShift: Evaluating Zero-Shot Generalization of Imitation-Based Policy Learning Under Domain Shifts NeurIPSW 2021 2021 Star Github
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning CoRL 2019 2019-10-24 Star Github
Franka-Kitchen: Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning CoRL 2019 2019-10-25 -
Evaluating Real-World Robot Manipulation Policies in Simulation CoRL 2024 2024-05-09 Star Github
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation arXiv 2024-10-07 -
ClutterGen: A Cluttered Scene Generator for Robot Learning CoRL 2024 2024-07-07 Star Github
Tactile Manipulation
Efficient Tactile Simulation with Differentiability for Robotic Manipulation CoRL 2022 2022 Star Github
Functional Manipulation
FMB: a Functional Manipulation Benchmark for Generalizable Robotic Learning IJRR 2024 2024-01-16 Star Github
Robot Trajectory Datasets
Open X-Embodiment: Robotic Learning Datasets and RT-X Models ICRA 2024 2023-10-13 Star Github
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset ICRA 2024 2024-03-19 Project
BridgeData V2: A Dataset for Robot Learning at Scale CoRL 2023 2023-08-24 Star Github
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot RSSW 2023 2023-07-02 Project
Embodied QA Datasets
ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models IROS 2024 2024-03-17 Star Github
OpenEQA: Embodied Question Answering in the Era of Foundation Models CVPR 2024 2024 Star Github

(back to top)

Cross-Embodiment Benchmarks

Title Venue Date Code
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation arXiv 2024-12-18 Project
GENESIS: A generative world for general-purpose robotics & embodied AI learning - - Star Github
ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI arXiv 2024-10-01 Star Github
All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents arXiv 2024-08-20 Dataset
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? NeurIPS 2023 2023-03-31 Star Github
Isaac Lab: Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments RA-L 2023 2023-01-10 Star Github

(back to top)

🛠️ Awesome Techniques

Title Venue Date Code
Awesome-Implicit-NeRF-Robotics: Neural Fields in Robotics: A Survey - 2024-10-26 Star Github
Awesome-Video-Robotic-Papers - 2024 Star Github
Awesome-Generalist-Robots-via-Foundation-Models: Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis - 2024 Star Github
Awesome-Robotics-3D - 2024 Star Github
Awesome-Robotics-Foundation-Models: Foundation Models in Robotics: Applications, Challenges, and the Future - 2023-12-13 Star Github
Awesome-LLM-Robotics - 2022 Star Github

(back to top)

✨ Citation

If you find this repository useful, please consider citing this list:

@misc{bai2024roboticsmanipulation,
    title = {Awesome-Robotics-Manipulation},
    author = {Bai, Shuanghao and Ding, Pengxiang and Zhang, Haoran},
    journal = {GitHub repository},
    url = {https://github.com/BaiShuanghao/Awesome-Robotics-Manipulation},
    year = {2024},
}
