NovelAI V3 Methods

This document outlines the key methodological improvements implemented in NovelAI V3, with particular emphasis on noise scaling for high-resolution coherence.

Maximum Noise Level (σmax) Scaling

The choice of maximum noise level (σmax) critically affects global image coherence, particularly at high resolutions. SDXL's default σmax = 14.6 is too low to maintain coherence in high-resolution images, which produces artifacts such as multi-body generations.

Noise Scaling Theory

The relationship between noise levels and image resolution follows a fundamental scaling principle:

For a resolution increase by a factor k in canvas length (k² in canvas area):

σ_new = k · σ_base  (noise standard deviation, length scaling)
σ_new² = k² · σ_base²  (noise variance, area scaling)

This scaling maintains the signal-to-noise ratio (SNR) across resolutions. The noise variance is quadratic in k because signal redundancy is assumed to scale with image area.

Mathematical Basis

Given an image x₀ with resolution R:

x_t = α_t · x₀ + σ_t · ε, where ε ~ N(0, I)
SNR = ||α_t · x₀||² / ||σ_t · ε||²

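To maintain consistent SNR when scaling resolution by k, note that each pixel of the original image corresponds to k² pixels of the enlarged one. If those pixels carry the same signal (full redundancy), averaging them preserves the signal while dividing the noise standard deviation by k, which is equivalent to a signal k times stronger:

SNR_new = ||α_t · (k · x₀)||² / ||σ_new · ε||² = SNR_original

Therefore:

σ_new = k · σ_base   (noise standard deviation, length scaling)
σ_new² = k² · σ_base²  (noise variance, area scaling, assuming full redundancy)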
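The redundancy argument can be checked numerically. The following is a minimal sketch, assuming a fully redundant upscale (each pixel repeated in a k x k block) and a random array as a stand-in image; the helper `block_mean` and the array sizes are illustrative choices, not part of NovelAI's method.

```python
import numpy as np

def block_mean(img: np.ndarray, k: int) -> np.ndarray:
    """Downsample a 2D array by averaging non-overlapping k x k blocks."""
    h, w = img.shape
    return img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

rng = np.random.default_rng(0)
base = rng.standard_normal((64, 64))  # stand-in "image" at base resolution
k = 2
# Fully redundant upscale: every base pixel is repeated in a k x k block.
upscaled = np.kron(base, np.ones((k, k)))

sigma_base = 14.6
noise_same = sigma_base * rng.standard_normal(upscaled.shape)        # unscaled noise
noise_scaled = k * sigma_base * rng.standard_normal(upscaled.shape)  # sigma_new = k * sigma_base

# Averaging recovers the signal exactly but shrinks the noise std by a factor k.
assert np.allclose(block_mean(upscaled, k), base)
std_same = block_mean(noise_same, k).std()      # close to sigma_base / k
std_scaled = block_mean(noise_scaled, k).std()  # close to sigma_base
print(f"unscaled noise after averaging: {std_same:.2f}")
print(f"scaled noise after averaging:   {std_scaled:.2f}")
```

With unscaled noise the averaged image is effectively less noisy than the base image was, so the low-frequency structure is never fully destroyed. Scaling σ by k restores the original effective noise level. Real images are only partially redundant, which is why the rule is an upper bound.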

Empirical Results

At standard SDXL resolutions:

σmax = 14.6 (default): Shows multi-body artifacts
σmax = 29.0 (≈2x default): Resolves global coherence issues
σmax ≈ 20000 (effectively ∞): Enables proper mean color prediction
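To see why σmax ≈ 20000 behaves like infinity, it helps to convert each value into the α_t, σ_t form used in the Mathematical Basis section. This is a sketch under the assumption that the quoted σmax values follow the EDM/k-diffusion convention (x = x₀ + σ·ε, so σ = σ_t / α_t); the function name `ve_sigma_to_vp` is hypothetical.

```python
import math

def ve_sigma_to_vp(sigma: float) -> tuple[float, float]:
    """Map an EDM-style sigma (x = x0 + sigma * eps) to variance-preserving
    coefficients (alpha_t, sigma_t) with alpha_t**2 + sigma_t**2 == 1."""
    scale = 1.0 / math.sqrt(1.0 + sigma * sigma)
    return scale, sigma * scale

for sigma_max in (14.6, 29.0, 20000.0):
    alpha_t, sigma_t = ve_sigma_to_vp(sigma_max)
    snr = (alpha_t / sigma_t) ** 2  # equals 1 / sigma_max**2
    print(f"sigma_max={sigma_max:>8}: alpha_t={alpha_t:.6f}, SNR={snr:.3e}")
```

Under this assumption, σmax = 14.6 still leaves a signal coefficient of roughly 0.07 at the noisiest step, while σmax ≈ 20000 drives it to about 5e-5, so the noisiest step is close to pure noise.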

Progressive noise sequence example:

σmax = 14.6: [14.6 → 10.8 → 8.3 → 6.6 → 5.4]
σmax = 29.0: [29.0 → 17.8 → 12.4 → 9.2 → 7.2]

Implementation Rule

For practical implementation, follow this scaling rule:

When doubling canvas length (4x area): Double σmax (4x noise variance)
This represents an upper bound, since it assumes full signal redundancy
The approximation improves at higher resolutions
Particularly effective for resolutions > 1024²
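The rule above can be sketched as a small helper. The function name `scaled_sigma_max` and the default pairing of a 1024² base canvas with σmax = 14.6 are assumptions made for illustration; pass whichever base σmax was validated at the base resolution.

```python
import math

def scaled_sigma_max(width: int, height: int,
                     base_sigma_max: float = 14.6,
                     base_area: int = 1024 * 1024) -> float:
    """Scale sigma_max by the linear factor k = sqrt(area / base_area).

    base_sigma_max and base_area are illustrative defaults (assumptions).
    The result is an upper bound, since it assumes full signal redundancy.
    """
    k = math.sqrt((width * height) / base_area)
    return base_sigma_max * k

print(scaled_sigma_max(1024, 1024))  # base canvas, k = 1
print(scaled_sigma_max(2048, 2048))  # doubled length (4x area), k = 2
```

Deriving k from the area ratio rather than from a single side lets the same helper handle non-square canvases.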

Integration with Other Methods

The σmax scaling works in conjunction with:

v-prediction parameterization
Zero Terminal SNR training
Karras noise scheduling (ρ = 7.0)

Together, these methods help preserve local detail while maintaining global coherence as image resolution increases.
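As an illustration of how σmax enters the sampler, here is a minimal sketch of the Karras noise schedule with ρ = 7.0. The values σmin = 0.0292 and n = 28 steps are assumptions chosen for the example, and the output is not claimed to reproduce the progressive noise sequences listed under Empirical Results.

```python
import numpy as np

def karras_sigmas(n: int, sigma_min: float, sigma_max: float,
                  rho: float = 7.0) -> np.ndarray:
    """Karras et al. schedule: interpolate linearly in sigma**(1/rho) space,
    from sigma_max down to sigma_min, then raise back to the power rho."""
    ramp = np.linspace(0.0, 1.0, n)
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    return (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho

# sigma_min and the step count are assumed values for illustration.
for sigma_max in (14.6, 29.0):
    sigmas = karras_sigmas(n=28, sigma_min=0.0292, sigma_max=sigma_max)
    print(f"sigma_max={sigma_max}: first steps {np.round(sigmas[:5], 2)}")
```

Raising σmax only changes the start of the schedule. The added steps are spent at high noise levels, where global composition is decided.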

Note: These methods represent a distinct approach from Flow Matching techniques. See Flow Matching for that alternative approach.
