
Development Setup

Alexander R Izquierdo edited this page Dec 21, 2024 · 1 revision

This page details how to set up and configure the SDXL Training Framework for development and research.

Environment Setup

Prerequisites

  • Python 3.8+
  • CUDA 11.7+
  • GPU with 24GB+ VRAM (training fits on a single 24GB card with the memory optimizations enabled)
  • Git LFS (for model weights)

Installation Steps

# Clone repository
git clone https://github.com/YourOrg/SDXL-Training-Framework.git
cd SDXL-Training-Framework

# Install in development mode with all extras
pip install -e ".[dev,docs]"

# Verify installation
python -c "import src; print(src.__version__)"

Configuration System

The framework uses a hierarchical configuration system implemented in src/config.py. Configurations can be specified through YAML files and are validated during loading.

Core Configuration Classes

  1. GlobalConfig

    global_config:
      image:
        target_size: [1024, 1024]
        max_size: [1536, 1536]
        min_size: [640, 640]
        max_aspect_ratio: 2.0
      cache:
        cache_dir: "cache"
        use_cache: true
      output_dir: "outputs"
  2. ModelConfig

    model:
      pretrained_model_name: "stabilityai/stable-diffusion-xl-base-1.0"
      num_timesteps: 1000
      sigma_min: 0.002
      sigma_max: 80.0
  3. TrainingConfig

    training:
      batch_size: 4
      gradient_accumulation_steps: 1
      mixed_precision: true
      learning_rate: 4.0e-7
      method: "ddpm"  # or "flow_matching"
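The actual classes live in src/config.py; as a rough sketch of how such a hierarchical, validated configuration can be modeled (the class and function names below are illustrative, not the framework's real API), each section maps onto a dataclass that validates itself after the YAML values are merged over its defaults:

```python
from dataclasses import dataclass

# Illustrative sketch only; the real definitions are in src/config.py.
@dataclass
class TrainingConfig:
    batch_size: int = 4
    gradient_accumulation_steps: int = 1
    mixed_precision: bool = True
    learning_rate: float = 4.0e-7
    method: str = "ddpm"

    def validate(self) -> None:
        # Fail fast on values the trainer cannot handle.
        if self.method not in ("ddpm", "flow_matching"):
            raise ValueError(f"unknown training method: {self.method}")
        if self.batch_size < 1 or self.gradient_accumulation_steps < 1:
            raise ValueError("batch sizes must be positive")

def load_training_config(raw: dict) -> TrainingConfig:
    # Merge parsed-YAML values over the defaults, then validate.
    cfg = TrainingConfig(**raw.get("training", {}))
    cfg.validate()
    return cfg
```

Validation at load time means a typo like `method: "ddmp"` is reported before any model weights are downloaded.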

Memory Management Configuration

The framework includes comprehensive memory optimization options:

memory:
  enable_24gb_optimizations: true
  layer_offload_fraction: 0.5  # fraction of layers offloaded to temp_device
  enable_activation_offloading: true
  enable_async_offloading: true
  temp_device: "cpu"
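The real offloading logic lives in src/core/memory/; as an illustrative sketch of what `layer_offload_fraction` controls (the function name and the choice to offload the last layers are assumptions, not the framework's documented behavior), the planner just decides which layer indices stay on the temp device:

```python
def plan_layer_offload(num_layers: int, offload_fraction: float) -> list:
    """Return a per-layer flag: True = park on the temp device (e.g. CPU).

    Sketch only. Offloads the *last* layers on the assumption that they
    are needed latest in the forward pass, giving async prefetch the most
    time to bring them back to the GPU.
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be in [0, 1]")
    n_offload = round(num_layers * offload_fraction)
    return [i >= num_layers - n_offload for i in range(num_layers)]
```

With `layer_offload_fraction: 0.5` and a 10-layer model, the last 5 layers would be parked on `temp_device` between uses.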

Dataset Configuration

Supports both Windows and WSL paths:

data:
  train_data_dir: 
    - "D:\\Datasets\\High-quality-photo10k"  # Windows
    - "/mnt/d/Datasets/collage"              # WSL
  num_workers: 4
  pin_memory: true

Advanced Configuration

Image Processing Settings

The framework supports multiple aspect ratios with automatic bucketing:

supported_dims:
  - [1024, 1024]
  - [1152, 896]
  - [896, 1152]
  - [1216, 832]
  - [832, 1216]
  - [1344, 768]
  - [768, 1344]
  - [1536, 640]
  - [640, 1536]
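The bucketing idea behind this list can be sketched as follows (the function name is illustrative; the framework's real implementation lives under src/data/): each image is assigned to the supported resolution whose aspect ratio is closest to its own, so a batch only ever mixes images of one bucket shape.

```python
SUPPORTED_DIMS = [
    (1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216),
    (1344, 768), (768, 1344), (1536, 640), (640, 1536),
]

def assign_bucket(width: int, height: int) -> tuple:
    """Pick the supported resolution with the nearest aspect ratio.

    Illustrative sketch of aspect-ratio bucketing, not the framework's
    exact selection rule.
    """
    aspect = width / height
    return min(SUPPORTED_DIMS, key=lambda wh: abs(wh[0] / wh[1] - aspect))
```

A 1920x1080 image (aspect 1.78) lands in the 1344x768 bucket (aspect 1.75) rather than the more extreme 1536x640.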

Training Methods

DDPM Configuration

ddpm:
  prediction_type: "v_prediction"  # v_prediction, epsilon, sample
  snr_gamma: 5.0
  zero_terminal_snr: true
  sigma_max: 20000.0
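The `snr_gamma: 5.0` setting suggests min-SNR-style loss weighting. One common formulation (shown here as a sketch; the framework's exact weighting may differ, and only the `v_prediction` and `epsilon` targets are covered) clamps the per-timestep SNR at gamma and normalizes by the term matching the prediction target:

```python
def min_snr_weight(snr: float, snr_gamma: float = 5.0,
                   prediction_type: str = "v_prediction") -> float:
    """Min-SNR-gamma loss weight for one timestep.

    Sketch of a widely used formulation: rare high-SNR timesteps are
    down-weighted so easy (nearly clean) samples do not dominate the loss.
    """
    clamped = min(snr, snr_gamma)
    if prediction_type == "v_prediction":
        return clamped / (snr + 1.0)
    if prediction_type == "epsilon":
        return clamped / snr
    raise ValueError(f"unsupported prediction_type: {prediction_type}")
```

For `epsilon` prediction the weight is exactly 1.0 whenever the SNR is at or below gamma, and shrinks as the SNR grows past it.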

Tag Weighting

tag_weighting:
  enable_tag_weighting: true
  default_weight: 1.0
  min_weight: 0.1
  max_weight: 10.0
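The config above only fixes the weight bounds; the weighting formula itself is not specified here. As one plausible scheme (an illustrative guess, not the framework's documented formula), tags can be weighted by inverse frequency and clamped to `[min_weight, max_weight]`:

```python
from collections import Counter

def compute_tag_weights(captions, default_weight=1.0,
                        min_weight=0.1, max_weight=10.0) -> dict:
    """Weight rare tags up and common tags down, clamped to the bounds.

    Hypothetical inverse-frequency scheme for illustration; captions is a
    list of tag lists, one per training image.
    """
    counts = Counter(tag for tags in captions for tag in tags)
    if not counts:
        return {}
    mean = sum(counts.values()) / len(counts)
    return {
        tag: min(max(default_weight * mean / n, min_weight), max_weight)
        for tag, n in counts.items()
    }
```

Tags appearing at the average frequency get `default_weight`; a tag ten times more common than average bottoms out at `min_weight`.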

Development Guidelines

Code Organization

The framework follows a modular structure:

  • src/core/ - Core functionality
  • src/data/ - Data processing
  • src/models/ - Model implementations
  • src/training/ - Training methods

Testing

  1. Run the test suite:

    pytest tests/
  2. Validation test:

    python -m src.core.validation.text_to_image

Common Development Tasks

  1. Adding a new training method:

    • Extend src/training/methods/base.py
    • Implement required interfaces
    • Add configuration in Config class
  2. Memory optimizations:

    • Use src/core/memory/ utilities
    • Profile memory usage with enable_24gb_optimizations on and off
    • Monitor with W&B integration
  3. Dataset preprocessing:

    • Use src/data/preprocessing/ pipeline
    • Configure caching strategy
    • Implement custom transforms
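For the first task, extending src/training/methods/base.py typically means subclassing the base method and implementing its loss interface. The sketch below is self-contained and its class names and method signatures are assumptions about the base class's shape, not the framework's actual API:

```python
from abc import ABC, abstractmethod

# Hypothetical stand-in for src/training/methods/base.py.
class TrainingMethod(ABC):
    name: str = "base"

    @abstractmethod
    def compute_loss(self, model, batch) -> float:
        """Return the training loss for one batch."""

class FlowMatchingMethod(TrainingMethod):
    name = "flow_matching"

    def compute_loss(self, model, batch) -> float:
        # Placeholder MSE: a real implementation would regress the
        # model's predicted velocity against the target flow field.
        prediction = model(batch["inputs"])
        target = batch["target"]
        return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)
```

The `name` attribute is how the `training.method` config key would select the implementation at runtime.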

Troubleshooting

Memory Issues

  • Enable enable_24gb_optimizations
  • Adjust layer_offload_fraction
  • Monitor with memory profiling tools

WSL Path Handling

  • Windows paths are automatically converted
  • Use forward slashes in config files
  • Verify paths with convert_windows_path
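The conversion that `convert_windows_path` presumably performs can be sketched as follows (the real helper may handle more edge cases such as UNC paths; this is an illustration of the drive-letter-to-mount-point mapping, not the framework's actual code):

```python
import re

def convert_windows_path(path: str) -> str:
    """Map a Windows path like 'D:\\Datasets\\x' to its WSL mount path.

    Sketch only: POSIX paths pass through unchanged; a drive letter
    becomes a lowercase /mnt/<drive>/ prefix and backslashes flip.
    """
    match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
    if not match:
        return path  # already a POSIX/WSL path
    drive, rest = match.groups()
    rest = rest.replace("\\", "/")
    return "/mnt/" + drive.lower() + "/" + rest
```

This mirrors the dual-path dataset config above: both entries resolve to the same file layout under WSL.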

Performance Optimization

  • Enable mixed precision training
  • Use gradient checkpointing
  • Optimize worker count
  • Enable async offloading

Next: See Training Pipeline for training process details or Architecture Overview for system design.