This project combines Segment-Anything-2 (SAM2) with the Stereolabs ZED stereo camera, using the Python API provided by the ZED SDK to extract depth information and segment Regions of Interest (ROIs) in real time.
Optional integration with GroundingDINO lets you use text prompts to generate the bounding boxes that SAM2 takes as input, which makes segmentation more flexible.
For more details about the ZED Python API, visit the Stereolabs documentation.
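To illustrate how a segmentation mask and a depth map fit together, here is a minimal sketch that computes the median depth inside an ROI. It is not taken from the project code: the function name `median_roi_depth` is hypothetical, and it assumes invalid depth pixels are reported as NaN or infinite values. Plain nested lists are used to keep the example dependency-free; in a real pipeline both inputs would more likely be NumPy arrays.

```python
import math
from statistics import median


def median_roi_depth(depth_map, mask):
    """Return the median of the valid depth values under the mask, or None.

    depth_map: rows of depth values (floats), one row per image row.
    mask:      rows of truthy/falsy values with the same shape as depth_map.
    Assumption: invalid depth pixels are NaN or +/-inf and must be skipped.
    """
    values = [
        depth
        for depth_row, mask_row in zip(depth_map, mask)
        for depth, inside in zip(depth_row, mask_row)
        if inside and math.isfinite(depth) and depth > 0
    ]
    return median(values) if values else None


# Tiny example: a 2x3 depth map with two invalid pixels.
depth = [[1.0, float("nan"), 2.0],
         [float("inf"), 3.0, 4.0]]
roi = [[1, 1, 1],
       [1, 1, 0]]
print(median_roi_depth(depth, roi))
```

The median is used here rather than the mean because it is less sensitive to stray background pixels at the mask boundary.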
This project requires the integration of multiple models and dependencies to achieve real-time segmentation and depth estimation. Below is an outline of the components and their purposes:
- SAM2: Instance segmentation using flexible prompts for guidance.
- GroundingDINO: Text-based object detection to generate bounding box prompts for SAM2.
- Nakama Pyzed Wrapper: Streamlined interaction with the ZED stereo camera.
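When a detector's boxes are passed to a segmenter as prompts, the box formats usually need to be reconciled first. The sketch below assumes the detector returns boxes as normalized (center x, center y, width, height) and that the segmenter expects pixel coordinates as (x0, y0, x1, y1); check both against the versions you have installed. The helper name is hypothetical and not part of this project.

```python
def cxcywh_norm_to_xyxy_pixels(box, image_width, image_height):
    """Convert one normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    Assumption: all four inputs lie in [0, 1] relative to the image size.
    The result is clamped so the box never leaves the image.
    """
    cx, cy, w, h = box
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height

    def clamp(value, upper):
        return min(max(value, 0.0), float(upper))

    return (clamp(x0, image_width), clamp(y0, image_height),
            clamp(x1, image_width), clamp(y1, image_height))


# A centered box covering half the width and height of a 640x480 frame.
print(cxcywh_norm_to_xyxy_pixels((0.5, 0.5, 0.5, 0.5), 640, 480))
```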
Note:
- For the ZED camera, CUDA 12.1 is required to use Python SDK 4.1. The SDK installer handles this automatically.
- For other models, any CUDA version ≥ 11.3 can be used.
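The version requirement above can be checked programmatically. The following is a rough sketch, not part of the project: it assumes `nvcc --version` prints a line containing `release <major>.<minor>`, and the helper names are made up for illustration.

```python
import re
import shutil
import subprocess

MIN_CUDA = (11, 3)  # minimum version stated in the note above


def parse_cuda_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Run the nvcc found on PATH and return its release, or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"],
                            capture_output=True, text=True, check=False)
    return parse_cuda_release(result.stdout)


def meets_minimum(release, minimum=MIN_CUDA):
    """True if a release tuple is known and at least the minimum."""
    return release is not None and release >= minimum
```

Calling `meets_minimum(installed_cuda_release())` then gives a quick yes/no answer for the toolkit currently on your PATH.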
After setting up CUDA, ensure the correct version is being used by setting the CUDA_HOME environment variable:
export CUDA_HOME=/path/to/desired_cuda_version
To verify the CUDA version:
which nvcc
The output should point to the desired CUDA version, e.g., /usr/local/cuda-12.6/bin/nvcc. Set CUDA_HOME to the corresponding installation root (here, /usr/local/cuda-12.6), then confirm the setup:
echo $CUDA_HOME
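The same consistency check can be scripted. This sketch assumes the usual toolkit layout in which nvcc lives at `<CUDA_HOME>/bin/nvcc`; the function name is hypothetical.

```python
import os
import shutil
from pathlib import Path


def nvcc_matches_cuda_home(cuda_home, nvcc_path):
    """True if nvcc_path is <cuda_home>/bin/nvcc after resolving symlinks.

    Assumption: the toolkit uses the conventional <root>/bin/nvcc layout.
    """
    if not cuda_home or not nvcc_path:
        return False
    toolkit_root = Path(nvcc_path).resolve().parent.parent
    return toolkit_root == Path(cuda_home).resolve()


if __name__ == "__main__":
    # Compare the current environment against the nvcc found on PATH.
    ok = nvcc_matches_cuda_home(os.environ.get("CUDA_HOME"),
                                shutil.which("nvcc"))
    print("CUDA_HOME matches nvcc on PATH:", ok)
```

Resolving symlinks matters because /usr/local/cuda is often a link to a versioned directory.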
Note: It is recommended to install SAM2 before GroundingDINO, as SAM2's installer includes required dependencies for GroundingDINO.
- Install requirements:
  pip install -e .
- Download checkpoints and configurations:
  cd checkpoints
  ./download_ckpts.sh
- Change to the GroundingDINO directory:
  cd GroundingDINO/
- Install required dependencies:
  pip install -e .
- Download pre-trained model weights:
  mkdir weights
  cd weights
  wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
  cd ..
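Before running the pipeline, it can help to confirm that the downloaded weights are where you expect them. The sketch below is only an illustration: it assumes the GroundingDINO directory sits directly under the directory you pass as `root`, and `missing_files` is a hypothetical helper.

```python
from pathlib import Path

# Location implied by the download commands above, relative to the
# directory that contains GroundingDINO/ (an assumption about the layout).
GROUNDING_DINO_WEIGHTS = Path("GroundingDINO/weights/groundingdino_swint_ogc.pth")


def missing_files(paths, root="."):
    """Return the subset of paths (as strings) that are not files under root."""
    root = Path(root)
    return [str(path) for path in paths if not (root / path).is_file()]


if __name__ == "__main__":
    missing = missing_files([GROUNDING_DINO_WEIGHTS])
    print("Missing weight files:", missing or "none")
```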
- Install the ZED SDK, following the Stereolabs installation instructions.
  Note: This project has been tested with ZED SDK v4.1, which includes the AI mode for depth estimation. The installer will prompt you to install CUDA 12 if it is not already configured.
- Install the ZED Python API:
  - Globally: Python API Installation
  - Within a virtual environment: Python API Virtual Env
- Install the Nakama Pyzed Wrapper as a package or clone it into your project: Nakama Pyzed Wrapper
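Once everything is installed, a quick import check can catch a wrong virtual environment early. This is a sketch under assumptions: the import names `sam2`, `groundingdino`, and `pyzed` are taken from the upstream projects and may differ in your installation, and the Nakama Pyzed Wrapper is left out because its import name depends on how you installed or cloned it.

```python
from importlib.util import find_spec

# Assumed top-level import names; adjust them to your installation.
EXPECTED_PACKAGES = ("sam2", "groundingdino", "pyzed")


def missing_packages(names=EXPECTED_PACKAGES):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("All expected packages were found.")
```

`find_spec` only locates the packages without importing them, so the check stays fast and does not initialize CUDA or the camera.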
The main pipeline code is located at:
scripts/sam2_track_zed.py
- General configurations are in the configurations folder. Update paths to match your machine setup.
- Nakama Pyzed Wrapper-specific settings are in:
  scripts/pyzed_wrapper/wrapper_settings.py