Name | Description | Paper | Code |
---|---|---|---|
SAM | Segmentation | https://arxiv.org/abs/2304.02643 | https://github.com/facebookresearch/segment-anything |
YOLO-World | Open-Vocabulary Detection | https://arxiv.org/abs/2401.17270 | https://github.com/AILab-CVC/YOLO-World |

Name | Description | Paper | Code |
---|---|---|---|
SAM3D | Segmentation | https://arxiv.org/abs/2306.03908 | https://github.com/Pointcept/SegmentAnything3D |
PointMixer | Understanding | https://arxiv.org/abs/2111.11187 | https://github.com/LifeBeyondExpectations/PointMixer |

Name | Description | Paper | Code |
---|---|---|---|
GPT4V | MLM(Image+Language->Language) | https://arxiv.org/abs/2303.08774 | |
Claude3-Opus | MLM(Image+Language->Language) | https://www.anthropic.com/news/claude-3-family | |
GLaMM | Pixel Grounding | https://arxiv.org/abs/2311.03356 | https://github.com/mbzuai-oryx/groundingLMM |
All-Seeing | Pixel Grounding | https://arxiv.org/abs/2402.19474 | https://github.com/OpenGVLab/all-seeing |
LEO | 3D | https://arxiv.org/abs/2311.12871 | https://github.com/embodied-generalist/embodied-generalist |

Name | Description | Paper | Code |
---|---|---|---|
Vid2Robot | | https://vid2robot.github.io/vid2robot.pdf | |
RT-Trajectory | | https://arxiv.org/abs/2311.01977 | |
MimicPlay | | https://mimic-play.github.io/assets/MimicPlay.pdf | https://github.com/j96w/MimicPlay |

Name | Description | Paper | Code |
---|---|---|---|
UMI | Two-Fingers | https://arxiv.org/abs/2402.10329 | https://github.com/real-stanford/universal_manipulation_interface |
DexCap | Five-Fingers | https://dex-cap.github.io/assets/DexCap_paper.pdf | https://github.com/j96w/DexCap |
HIRO Hand | Hand-over-hand | https://sites.google.com/view/hiro-hand | |

Name | Description | Paper | Code |
---|---|---|---|
MimicGen | | https://arxiv.org/abs/2310.17596 | https://github.com/NVlabs/mimicgen_environments |
RoboGen | | https://arxiv.org/abs/2311.01455 | https://github.com/Genesis-Embodied-AI/RoboGen |

Name | Description | Paper | Code |
---|---|---|---|
Diffusion Policy | | https://arxiv.org/abs/2303.04137 | https://github.com/real-stanford/diffusion_policy |
ACT | | https://arxiv.org/abs/2304.13705 | https://github.com/tonyzhaozh/act |

Name | Description | Paper | Code |
---|---|---|---|
COPA | | https://arxiv.org/abs/2403.08248 | |
ManipLLM | | https://arxiv.org/abs/2312.16217 | |
ManipVQA | | https://arxiv.org/pdf/2403.11289.pdf | https://github.com/SiyuanHuang95/ManipVQA |

Name | Description | Paper | Code |
---|---|---|---|
OLAF | | https://arxiv.org/pdf/2310.17555 | |
YAYRobot | | https://arxiv.org/abs/2403.12910 | https://github.com/yay-robot/yay_robot |

Name | Description | Paper | Code |
---|---|---|---|
SayCan | API Level | https://arxiv.org/abs/2204.01691 | https://github.com/google-research/google-research/tree/master/saycan |
VILA | Prompt Level | https://arxiv.org/abs/2311.17842 | |