Simple Diffusion Pose Gen

A quick attempt to use Stable Diffusion (SD) image generation models (v1.5 and XL) together with MediaPipe's Pose Landmarker vision models to generate 3D human poses from a text prompt.

News

  • 2024-11-12: Added a Godot full-body IK pose solver.
  • 2024-10-24: Initial implementation.

Quickstart

  1. Clone the repo: git clone https://github.com/jerenchen/simple-diffusion-pose-gen.git
  2. Change into the python directory and (optionally) activate a virtual environment (e.g. conda).
  3. Install the Python dependencies: pip install -r requirements.txt
  4. Download MediaPipe Pose Landmarker (Full) and save the file under python/tasks.
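
To show what the downloaded model file from step 4 is for, here is a minimal sketch of running MediaPipe's Pose Landmarker on a single image and reading back the 3D world landmarks. It is not taken from this repo's code. The file name pose_landmarker_full.task and the helper names landmarks_to_xyz and detect_world_landmarks are assumptions made for illustration.

```python
from pathlib import Path

# Assumed file name; use whatever name the downloaded model was saved under.
MODEL_PATH = Path("tasks") / "pose_landmarker_full.task"


def landmarks_to_xyz(landmarks):
    """Convert landmark objects (with .x, .y, .z attributes) to (x, y, z) tuples."""
    return [(lm.x, lm.y, lm.z) for lm in landmarks]


def detect_world_landmarks(image_path, model_path=MODEL_PATH):
    """Return the 3D world landmarks of the first detected person, or []."""
    # Imported lazily so landmarks_to_xyz stays usable without MediaPipe installed.
    import mediapipe as mp
    from mediapipe.tasks import python as mp_python
    from mediapipe.tasks.python import vision

    options = vision.PoseLandmarkerOptions(
        base_options=mp_python.BaseOptions(model_asset_path=str(model_path)),
    )
    with vision.PoseLandmarker.create_from_options(options) as landmarker:
        image = mp.Image.create_from_file(str(image_path))
        result = landmarker.detect(image)

    if not result.pose_world_landmarks:
        return []  # no person detected in the image
    return landmarks_to_xyz(result.pose_world_landmarks[0])
```

Calling detect_world_landmarks("generated.png") from inside the python directory would return a list of (x, y, z) tuples for the detected pose, assuming the image contains a person and the model file exists at the assumed path.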

Python Pose Gen + Godot IK Solver

  1. Inside the python directory, start the Python pose-gen service: python posegen.py --base sd15 --steps 8
  2. Open and run project.godot inside the godot directory with Godot Engine.
  3. Enter a prompt to generate a pose.

Figure (SD Godot Pose Gen): Generating a pose using the prompt "A basketball player making a 3-pointer jump shot".
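
The --base and --steps flags passed to the pose-gen service can be pictured with a small argparse sketch. This is not the actual posegen.py. The "sdxl" choice value and both defaults are assumptions, while "sd15" and the step counts 2, 4, and 8 come from this README.

```python
import argparse


def build_parser():
    """Build a parser mirroring the two flags shown in the service command."""
    parser = argparse.ArgumentParser(
        description="Pose-gen service options (illustrative sketch)."
    )
    parser.add_argument(
        "--base",
        choices=["sd15", "sdxl"],  # "sdxl" is an assumed value name
        default="sd15",            # assumed default
        help="Base Stable Diffusion model.",
    )
    parser.add_argument(
        "--steps",
        type=int,
        choices=[2, 4, 8],  # Hyper-SD step counts listed in this README
        default=8,          # assumed default
        help="Number of Hyper-SD inference steps.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"base={args.base} steps={args.steps}")
```

Restricting --steps with choices makes an unsupported value such as 3 fail at startup rather than later during inference.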

Standalone Python Pose Gen

NOTE: The standalone demo requires PySide6 >= v6.7.

Inside the python directory, run python simple-diffusion-pose-gen.py, then enter a prompt to generate a pose.

Figures: SD Settings, SD Pose Gen.

Stable Diffusion Model Settings

  • Base model: SD15 has fewer model parameters and therefore requires less memory, whereas SDXL can generate images with higher fidelity.
  • Hyper-SD (an SD inference acceleration technique) steps: 2, 4, or 8. This is a trade-off between speed (fewer, larger steps) and quality (more, smaller steps).
  • PyTorch device for running SD inference (if available): CPU, CUDA, or MPS (Apple Silicon).
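
The device setting above depends on what the local PyTorch build supports. A minimal sketch of that check follows. The helper names pick_device and detect_device are hypothetical, and falling back to the CPU when the requested device is unavailable is an assumed policy, not necessarily what this project does.

```python
def pick_device(preferred, cuda_ok, mps_ok):
    """Return the preferred device if it is available, otherwise "cpu"."""
    available = {"cpu": True, "cuda": cuda_ok, "mps": mps_ok}
    if preferred not in available:
        raise ValueError(f"unknown device: {preferred!r}")
    return preferred if available[preferred] else "cpu"


def detect_device(preferred="cuda"):
    """Query PyTorch for CUDA/MPS support and resolve the device name."""
    # Imported lazily so pick_device can be used without PyTorch installed.
    import torch

    return pick_device(
        preferred,
        cuda_ok=torch.cuda.is_available(),
        mps_ok=torch.backends.mps.is_available(),
    )
```

The resulting string can be passed to torch.device(...) or to a pipeline's .to(...) call.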

Acknowledgement