Awesome-Embodied-AI

Scene Understanding

Image

	Description	Paper	Code
SAM	Segmentation	https://arxiv.org/abs/2304.02643	https://github.com/facebookresearch/segment-anything
YOLO-World	Open-Vocabulary Detection	https://arxiv.org/abs/2401.17270	https://github.com/AILab-CVC/YOLO-World

Point Cloud

	Description	Paper	Code
SAM3D	Segmentation	https://arxiv.org/abs/2306.03908	https://github.com/Pointcept/SegmentAnything3D
PointMixer	Understanding	https://arxiv.org/abs/2111.11187	https://github.com/LifeBeyondExpectations/PointMixer

Multi-Modal Grounding

	Description	Paper	Code
GPT-4V	MLLM (Image+Language->Language)	https://arxiv.org/abs/2303.08774
Claude3-Opus	MLLM (Image+Language->Language)	https://www.anthropic.com/news/claude-3-family
GLaMM	Pixel Grounding	https://arxiv.org/abs/2311.03356	https://github.com/mbzuai-oryx/groundingLMM
All-Seeing	Pixel Grounding	https://arxiv.org/abs/2402.19474	https://github.com/OpenGVLab/all-seeing
LEO	3D	https://arxiv.org/abs/2311.12871	https://github.com/embodied-generalist/embodied-generalist

Data Collection

From Video

	Paper	Code
Vid2Robot	https://vid2robot.github.io/vid2robot.pdf
RT-Trajectory	https://arxiv.org/abs/2311.01977
MimicPlay	https://mimic-play.github.io/assets/MimicPlay.pdf	https://github.com/j96w/MimicPlay

Hardware

	Description	Paper	Code
UMI	Two-Fingers	https://arxiv.org/abs/2402.10329	https://github.com/real-stanford/universal_manipulation_interface
DexCap	Five-Fingers	https://dex-cap.github.io/assets/DexCap_paper.pdf	https://github.com/j96w/DexCap
HIRO Hand	Hand-over-hand	https://sites.google.com/view/hiro-hand

Generative Simulation

	Description	Paper	Code
MimicGen		https://arxiv.org/abs/2310.17596	https://github.com/NVlabs/mimicgen_environments
RoboGen		https://arxiv.org/abs/2311.01455	https://github.com/Genesis-Embodied-AI/RoboGen

Action Output

Generative Imitation Learning

	Description	Paper	Code
Diffusion Policy		https://arxiv.org/abs/2303.04137	https://github.com/real-stanford/diffusion_policy
ACT		https://arxiv.org/abs/2304.13705	https://github.com/tonyzhaozh/act

Affordance Map

	Description	Paper	Code
CLIPort	Pick&Place	https://arxiv.org/pdf/2109.12098.pdf	https://github.com/cliport/cliport
Robo-Affordances	Contact&Post-contact trajectories	https://arxiv.org/abs/2304.08488	https://github.com/shikharbahl/vrb
Robo-ABC		https://arxiv.org/abs/2401.07487	https://github.com/TEA-Lab/Robo-ABC
Where2Explore	Few-shot learning from semantic similarity	https://proceedings.neurips.cc/paper_files/paper/2023/file/0e7e2af2e5ba822c9ad35a37b31b5dd4-Paper-Conference.pdf
Move as You Say, Interact as You Can	Affordance to motion from diffusion model	https://arxiv.org/pdf/2403.18036.pdf
AffordanceLLM	Grounding affordance with LLM	https://arxiv.org/pdf/2401.06341.pdf
Environment-aware Affordance		https://proceedings.neurips.cc/paper_files/paper/2023/file/bf78fc727cf882df66e6dbc826161e86-Paper-Conference.pdf
OpenAD	Open-vocabulary affordance detection from point clouds	https://www.csc.liv.ac.uk/~anguyen/assets/pdfs/2023_OpenAD.pdf	https://github.com/Fsoft-AIC/Open-Vocabulary-Affordance-Detection-in-3D-Point-Clouds
RLAfford	End-to-End affordance learning with RL	https://gengyiran.github.io/pdf/RLAfford.pdf
General Flow	Collect affordance from video	https://general-flow.github.io/general_flow.pdf	https://github.com/michaelyuancb/general_flow
PreAffordance	Pre-grasping planning	https://arxiv.org/pdf/2404.03634.pdf
SceneFun3D	Fine-grained functionality & affordance in 3D scenes	https://aycatakmaz.github.io/data/SceneFun3D-preprint.pdf	https://github.com/SceneFun3D/scenefun3d

Question&Answer from LLM

	Paper	Code
CoPa	https://arxiv.org/abs/2403.08248
ManipLLM	https://arxiv.org/abs/2312.16217
ManipVQA	https://arxiv.org/pdf/2403.11289.pdf	https://github.com/SiyuanHuang95/ManipVQA

Language Corrections

	Description	Paper	Code
OLAF		https://arxiv.org/pdf/2310.17555
YAYRobot		https://arxiv.org/abs/2403.12910	https://github.com/yay-robot/yay_robot

Planning from LLM

	Description	Paper	Code
SayCan	API Level	https://arxiv.org/abs/2204.01691	https://github.com/google-research/google-research/tree/master/saycan
ViLa	Prompt Level	https://arxiv.org/abs/2311.17842
