Development Setup
Alexander R Izquierdo edited this page Dec 21, 2024
This page details how to set up and configure the SDXL Training Framework for development and research.
Prerequisites:

- Python 3.8+
- CUDA 11.7+
- 24 GB+ VRAM recommended (the framework can run on 24 GB with the memory optimizations enabled)
- Git LFS (for model weights)
```shell
# Clone the repository
git clone https://github.com/YourOrg/SDXL-Training-Framework.git
cd SDXL-Training-Framework

# Install in development mode with all extras
pip install -e ".[dev,docs]"

# Verify the installation
python -c "import src; print(src.__version__)"
```
The framework uses a hierarchical configuration system implemented in `src/config.py`. Configurations are specified through YAML files and validated during loading.
GlobalConfig:

```yaml
global_config:
  image:
    target_size: [1024, 1024]
    max_size: [1536, 1536]
    min_size: [640, 640]
    max_aspect_ratio: 2.0
  cache:
    cache_dir: "cache"
    use_cache: true
  output_dir: "outputs"
```
ModelConfig:

```yaml
model:
  pretrained_model_name: "stabilityai/stable-diffusion-xl-base-1.0"
  num_timesteps: 1000
  sigma_min: 0.002
  sigma_max: 80.0
```
TrainingConfig:

```yaml
training:
  batch_size: 4
  gradient_accumulation_steps: 1
  mixed_precision: true
  learning_rate: 4.0e-7
  method: "ddpm"  # or "flow_matching"
```
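Since the configs are validated during loading, a new section should fail fast on inconsistent values. The sketch below shows how such validation might look using dataclasses; the actual classes in `src/config.py` may be structured differently, and `load_model_config` is an illustrative name, not the framework's API.

```python
from dataclasses import dataclass

# Hypothetical sketch of one validated config section; the real
# implementation lives in src/config.py and may differ.
@dataclass
class ModelConfig:
    pretrained_model_name: str = "stabilityai/stable-diffusion-xl-base-1.0"
    num_timesteps: int = 1000
    sigma_min: float = 0.002
    sigma_max: float = 80.0

    def __post_init__(self):
        # Reject configs that would break the noise schedule.
        if self.sigma_min >= self.sigma_max:
            raise ValueError("sigma_min must be < sigma_max")

def load_model_config(raw: dict) -> "ModelConfig":
    """Build a validated ModelConfig from a dict parsed out of YAML."""
    return ModelConfig(**raw)

cfg = load_model_config({"num_timesteps": 1000, "sigma_max": 80.0})
print(cfg.sigma_min)  # unspecified fields fall back to defaults: 0.002
```

Keeping defaults on the dataclass means a YAML file only needs to list the values it overrides.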
The framework includes comprehensive memory optimization options:
```yaml
memory:
  enable_24gb_optimizations: true
  layer_offload_fraction: 0.5  # Optimal setting
  enable_activation_offloading: true
  enable_async_offloading: true
  temp_device: "cpu"
```
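To illustrate what `layer_offload_fraction` controls, here is a minimal sketch of splitting model layers between the GPU and `temp_device`. The function name and splitting policy are assumptions for illustration, not the framework's actual offloading code.

```python
# Hypothetical sketch: decide which layer indices get offloaded to the
# temp device given layer_offload_fraction (e.g. 0.5 keeps half on GPU).
def split_layers(num_layers: int, offload_fraction: float):
    n_offload = int(num_layers * offload_fraction)
    offloaded = list(range(n_offload))          # moved to temp_device
    resident = list(range(n_offload, num_layers))  # stay on the GPU
    return offloaded, resident

offloaded, resident = split_layers(10, 0.5)
print(offloaded)  # [0, 1, 2, 3, 4]
print(resident)   # [5, 6, 7, 8, 9]
```

Raising the fraction trades GPU memory for transfer overhead, which is why 0.5 is flagged as the sweet spot for 24 GB cards.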
The data configuration supports both Windows and WSL paths:

```yaml
data:
  train_data_dir:
    - "D:\\Datasets\\High-quality-photo10k"  # Windows
    - "/mnt/d/Datasets/collage"              # WSL
  num_workers: 4
  pin_memory: true
```
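The troubleshooting section mentions a `convert_windows_path` helper; a plausible sketch of what such a conversion does is shown below. This is an assumption about its behavior, not the framework's actual implementation.

```python
import re

# Hedged sketch of a convert_windows_path-style helper: map a Windows
# drive path to its WSL /mnt equivalent, pass POSIX paths through.
def convert_windows_path(path: str) -> str:
    m = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
    if not m:
        return path  # already a POSIX-style path
    drive, rest = m.groups()
    rest = rest.replace("\\", "/")
    return "/mnt/" + drive.lower() + "/" + rest

print(convert_windows_path("D:\\Datasets\\High-quality-photo10k"))
# /mnt/d/Datasets/High-quality-photo10k
```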
The framework supports multiple aspect ratios with automatic bucketing:

```yaml
supported_dims:
  - [1024, 1024]
  - [1152, 896]
  - [896, 1152]
  - [1216, 832]
  - [832, 1216]
  - [1344, 768]
  - [768, 1344]
  - [1536, 640]
  - [640, 1536]
```
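A common way to implement this kind of bucketing is to assign each image to the supported resolution whose aspect ratio is closest to the image's own. The sketch below assumes that policy; the framework's bucketing logic may use different tie-breaking or area constraints.

```python
# Supported resolutions from the config above.
SUPPORTED_DIMS = [
    (1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216),
    (1344, 768), (768, 1344), (1536, 640), (640, 1536),
]

def assign_bucket(width: int, height: int):
    """Pick the bucket whose aspect ratio best matches the image."""
    ratio = width / height
    return min(SUPPORTED_DIMS, key=lambda d: abs(d[0] / d[1] - ratio))

print(assign_bucket(1920, 1080))  # 16:9 lands in (1344, 768)
```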
DDPM scheduler settings:

```yaml
ddpm:
  prediction_type: "v_prediction"  # v_prediction, epsilon, or sample
  snr_gamma: 5.0
  zero_terminal_snr: true
  sigma_max: 20000.0
```
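For context on `snr_gamma`: min-SNR weighting (Hang et al., 2023) clamps each timestep's loss weight so very easy (high-SNR) steps do not dominate training. A minimal sketch of the commonly used v-prediction form is below; the framework's exact weighting may differ.

```python
# Sketch of min-SNR-gamma loss weighting as commonly applied with
# v-prediction: weight = min(snr, gamma) / (snr + 1).
def snr_weight(snr: float, gamma: float = 5.0) -> float:
    return min(snr, gamma) / (snr + 1.0)

def snr_from_sigma(sigma: float) -> float:
    # SNR = 1 / sigma^2 under a sigma-based noise parameterization.
    return 1.0 / (sigma ** 2)

print(snr_weight(snr_from_sigma(1.0)))  # 0.5
```

At very large sigma (low SNR, as with the `sigma_max` above) the weight tends toward the SNR itself, so near-pure-noise steps contribute almost nothing.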
Tag weighting settings:

```yaml
tag_weighting:
  enable_tag_weighting: true
  default_weight: 1.0
  min_weight: 0.1
  max_weight: 10.0
```
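To show how the bounds above interact, here is a sketch of an inverse-frequency tag weight clamped to `[min_weight, max_weight]`. The inverse-frequency formula is an assumption for illustration; the framework may compute tag statistics differently.

```python
# Hypothetical tag weighting: rarer tags get larger weights
# (mean count / tag count), clamped to the configured bounds.
def tag_weight(tag_count: int, mean_count: float,
               default: float = 1.0, wmin: float = 0.1,
               wmax: float = 10.0) -> float:
    if tag_count <= 0:
        return default
    return max(wmin, min(wmax, mean_count / tag_count))

print(tag_weight(5, 50.0))    # rare tag, clamped at max: 10.0
print(tag_weight(50, 50.0))   # average-frequency tag: 1.0
```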
The framework follows a modular structure:

- `src/core/` - Core functionality
- `src/data/` - Data processing
- `src/models/` - Model implementations
- `src/training/` - Training methods
Run the test suite:

```shell
pytest tests/
```

Run the validation test:

```shell
python -m src.core.validation.text_to_image
```
Adding a new training method:

- Extend `src/training/methods/base.py`
- Implement the required interfaces
- Add configuration in the `Config` class
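The steps above can be sketched as a subclass of the base method. The class and method names below are illustrative assumptions about the shape of `src/training/methods/base.py`, not its actual interface.

```python
from abc import ABC, abstractmethod

# Hypothetical shape of the base class in src/training/methods/base.py.
class BaseTrainingMethod(ABC):
    def __init__(self, config: dict):
        self.config = config

    @abstractmethod
    def compute_loss(self, batch: dict) -> float:
        """Return the training loss for one batch."""

# A new method only needs to implement the abstract interface.
class MyNewMethod(BaseTrainingMethod):
    def compute_loss(self, batch: dict) -> float:
        # Placeholder objective; replace with the method's real loss.
        return sum(batch["pixels"]) * self.config.get("scale", 1.0)

method = MyNewMethod({"scale": 0.5})
print(method.compute_loss({"pixels": [1.0, 2.0, 3.0]}))  # 3.0
```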
Memory optimizations:

- Use the `src/core/memory/` utilities
- Profile with `enable_24gb_optimizations`
- Monitor with the W&B integration
Dataset preprocessing:

- Use the `src/data/preprocessing/` pipeline
- Configure the caching strategy
- Implement custom transforms
Memory issues:

- Enable `enable_24gb_optimizations`
- Adjust `layer_offload_fraction`
- Monitor with memory profiling tools
Path issues:

- Windows paths are converted automatically
- Use forward slashes in config files
- Verify paths with `convert_windows_path`
Performance tuning:

- Enable mixed precision training
- Use gradient checkpointing
- Optimize the data loader worker count
- Enable async offloading
Next: See Training Pipeline for training process details or Architecture Overview for system design.