vladmandic · Dec 30, 2023
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,7 +1,99 @@
 # Change Log for SD.Next
 
+## Update for 2024-01-03
+
+Following up on a major release, this update adds more functionality to the new Control module  
+and includes fixes for all issues reported so far  
+
+- **Control**:
+  - add **inpaint** support  
+    applies to both *img2img* and *controlnet* workflows  
+    *note*: set blur to the level you desire  
+  - add **outpaint** support  
+    applies to both *img2img* and *controlnet* workflows  
+    *note*: increase denoising strength since outpainted area is blank by default  
+  - add **marigold** depth map processor  
+    this is a state-of-the-art depth estimation model, but it's quite heavy on resources  
+  - configurable output folder in settings  
+  - auto-refresh available models on tab activate  
+  - reduce usage of temp files  
+  - add context menu to action buttons  
+  - *resize by* now applies to each input image or frame individually  
+    allows for processing where input images are of different sizes  
+  - fix input image size  
+  - fix video color mode  
+  - fix correct image mode  
+  - fix batch/folder/video modes  
+- **Improvements**  
+  - allow deployment without git clone  
+    for example, you can now deploy a zip of the sdnext folder  
+  - hypertile: enable vae tiling  
+  - hypertile: add autodetect optimal value  
+    set tile size to 0 to use autodetected value  
+  - cli: sdapi.py allow manual api invoke  
+    example: `python cli/sdapi.py /sdapi/v1/sd-models`  
+  - memory: add ram usage monitoring in addition to gpu memory usage monitoring  
+  - vae: enable taesd batch decode  
+    enable/disable with settings -> diffusers -> vae slicing  
+  - updated core requirements  
+- **Compile**
+  - new option: **fused projections**  
+    pretty much free 5% performance boost for compatible models  
+    enable in settings -> compute settings  
+  - new option: **dynamic quantization** (experimental)  
+    reduces memory usage and increases performance  
+    enable in settings -> compute settings  
+    best used together with torch compile: *inductor*  
+    this feature is highly experimental and will evolve over time  
+    requires nightly versions of `torch` and `torchao`  
+    > pip install -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121  
+    > pip install -U git+https://github.com/pytorch-labs/ao  
+- **IPEX**, thanks @disty0  
+  - better compile support  
+  - remove IPEX / Torch 2.0 specific hijacks  
+  - add `IPEX_SDPA_SLICE_TRIGGER_RATE` and `IPEX_ATTENTION_SLICE_RATE` env variables  
+- **Fixes**  
+  - ipadapter: allow changing of model/image on-the-fly  
+  - ipadapter: fix fallback of cross-attention on unload  
+  - python: fix python 3.9 compatibility  
+  - img2img: clip and blip interrogate  
+  - img2img: sampler selection offset  
+  - sampler: guard against invalid sampler index  
+  - config: reset default cfg scale to 6.0  
+  - processing: correct display metadata  
+  - live preview: fix when using `bfloat16`
+  - upscale: fix ldsr
+
 ## Update for 2023-12-29
 
+To wrap up this amazing year, we're releasing a new version of [SD.Next](https://github.com/vladmandic/automatic), and this one is absolutely massive!  
+
+### Highlights  
+
+- Brand new Control module for *text, image, batch and video* processing  
+  Native implementation of all control methods for both *SD15* and *SD-XL*  
+  ▹ **ControlNet | ControlNet XS | Control LLLite | T2I Adapters | IP Adapters**  
+  For details, see the [Wiki](https://github.com/vladmandic/automatic/wiki/Control) documentation  
+- Support for new model types out-of-the-box  
+  This brings the number of supported t2i/i2i model families to 13!  
+  ▹ **Stable Diffusion 1.5/2.1 | SD-XL | LCM | Segmind | Kandinsky | Pixart-α | Würstchen | aMUSEd | DeepFloyd IF | UniDiffusion | SD-Distilled | BLiP Diffusion | etc.**  
+- New video capabilities:  
+  ▹ **AnimateDiff | SVD | ModelScope | ZeroScope**  
+- Enhanced platform support  
+  ▹ **Windows | Linux | MacOS** with **nVidia | AMD | IntelArc | DirectML | OpenVINO | ONNX+Olive** backends  
+- Better onboarding experience (first install)  
+  with all model types available for single click download & load (networks -> reference)  
+- Performance optimizations!
+  For a comparison of different processing options and compile backends, see [Wiki](https://github.com/vladmandic/automatic/wiki/Benchmark)  
+  As a highlight, we're reaching **~100 it/s** (no tricks, this is with full features enabled and end-to-end on a standard nVidia RTX4090)  
+- New [custom pipelines](https://github.com/vladmandic/automatic/blob/dev/scripts/example.py) framework for quickly porting any new pipeline  
+
+And other improvements in areas such as: Upscaling (up to 8x now with 40+ available upscalers), Inpainting (better quality), Prompt scheduling, new Sampler options, new LoRA types, additional UI themes, better HDR processing, built-in Video interpolation, parallel Batch processing, etc.  
+
+Plus some nifty new modules such as **FaceID** (automatic face guidance using embeds during generation) and **Depth 3D** (image to 3D scene)
+
+### Full changelog
+
 - **Control**  
   - native implementation of all image control methods:  
     **ControlNet**, **ControlNet XS**, **Control LLLite**, **T2I Adapters** and **IP Adapters**  
@@ -919,7 +1011,7 @@ Actual changelog is:
 
 - original
   - fix hires secondary sampler  
-    this now fully obsoletes `fallback_sampler` and `force_latent_sampler`  
+    this now fully obsoletes `fallback_sampler` and `force_hr_sampler_name`  
 
 
 ## Update for 2023-07-18
@@ -959,7 +1051,7 @@ Trying to unify settings for both original and diffusers backend without introdu
 - renamed **hires fix** to **second pass**  
   as that is what it actually is, name hires fix is misleading to start with  
 - actual **hires fix** and **refiner** are now options inside **second pass** section  
-- obsoleted settings -> sampler -> **force_latent_sampler**  
+- obsoleted settings -> sampler -> **force_hr_sampler_name**  
   it is now part of **second pass** options and it works the same for both original and diffusers backend  
   which means you can use different scheduler settings for txt2img and hires if you want  
 - sd-xl refiner will run if it's loaded and if second pass is enabled  

diff --git a/cli/sdapi.py b/cli/sdapi.py
@@ -14,6 +14,7 @@
 import requests
 import urllib3
 from util import Map, log
+from rich import print # pylint: disable=redefined-builtin
 
 
 sd_url = os.environ.get('SDAPI_URL', "http://127.0.0.1:7860") # automatic1111 api url root
@@ -225,25 +226,29 @@ async def close():
 
 
 if __name__ == "__main__":
+    sys.argv.pop(0)
     log.setLevel(logging.DEBUG)
     if 'interrupt' in sys.argv:
         asyncio.run(interrupt())
-    if 'progress' in sys.argv:
+    elif 'progress' in sys.argv:
         asyncio.run(progress())
-    if 'progresssync' in sys.argv:
+    elif 'progresssync' in sys.argv:
         progresssync()
-    if 'options' in sys.argv:
+    elif 'options' in sys.argv:
         opt = options()
         log.debug({ 'options' })
         import json
         print(json.dumps(opt['options'], indent = 2))
         log.debug({ 'cmd-flags' })
         print(json.dumps(opt['flags'], indent = 2))
-    if 'log' in sys.argv:
+    elif 'log' in sys.argv:
         get_log()
-    if 'info' in sys.argv:
+    elif 'info' in sys.argv:
         get_info()
-    if 'shutdown' in sys.argv:
+    elif 'shutdown' in sys.argv:
         shutdown()
+    else:
+        res = getsync(sys.argv[0])
+        print(res)
     asyncio.run(close(), debug=True)
     asyncio.run(asyncio.sleep(0.5))
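
Review note on the `cli/sdapi.py` hunk: the `if` chain becomes an `elif` chain, so at most one command runs, and anything that is not a known command is treated as an API endpoint and fetched with `getsync`. The sketch below restates that dispatch in isolation. The `dispatch` helper, the `commands` dictionary and the `fallback` callable are hypothetical names used for illustration and are not part of the repository. The empty-argument guard is also an addition: as written in the hunk, running the script with no arguments appears to reach `sys.argv[0]` on an empty list.

```python
def dispatch(argv, commands, fallback):
    """Run the first known command found in argv, else hand argv[1] to fallback."""
    args = list(argv[1:])  # drop the script name, like sys.argv.pop(0) in the hunk
    for name, handler in commands.items():  # dict order plays the role of the elif order
        if name in args:
            return handler()
    if not args:
        # hypothetical guard, not present in the hunk
        raise ValueError('expected a command name or an API endpoint')
    return fallback(args[0])  # unknown argument is treated as an endpoint path


# stand-in handlers; the real script calls get_info(), shutdown(), getsync(), ...
commands = {
    'info': lambda: 'info called',
    'shutdown': lambda: 'shutdown called',
}
print(dispatch(['sdapi.py', 'info'], commands, lambda endpoint: f'GET {endpoint}'))
print(dispatch(['sdapi.py', '/sdapi/v1/sd-models'], commands, lambda endpoint: f'GET {endpoint}'))
```

The second call corresponds to the changelog example `python cli/sdapi.py /sdapi/v1/sd-models`.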
diff --git a/cli/simple-txt2img.js b/cli/simple-txt2img.js
@@ -28,7 +28,7 @@ const sd_options = {
   // second pass: hires
   hr_force: true,
   hr_second_pass_steps: 20,
-  latent_sampler: 'UniPC',
+  hr_sampler_name: 'UniPC',
   denoising_strength: 0.5,
   // second pass: refiner
   refiner_steps: 5,
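
Review note on the `latent_sampler` to `hr_sampler_name` rename: client scripts that still send the old key would need the same one-line change. A minimal sketch of such a migration follows, written in Python for consistency with the other notes in this diff. The `migrate_hires_sampler` helper is hypothetical, and whether the server still accepts the old key is not stated in the source.

```python
def migrate_hires_sampler(payload: dict) -> dict:
    """Return a copy of payload that uses hr_sampler_name instead of latent_sampler."""
    migrated = dict(payload)  # leave the caller's dictionary untouched
    if 'latent_sampler' in migrated:
        old_value = migrated.pop('latent_sampler')
        # an explicit new-style value, if already present, takes precedence
        migrated.setdefault('hr_sampler_name', old_value)
    return migrated


# option names taken from the hunk above
old_payload = {
    'hr_force': True,
    'hr_second_pass_steps': 20,
    'latent_sampler': 'UniPC',
    'denoising_strength': 0.5,
    'refiner_steps': 5,
}
print(migrate_hires_sampler(old_payload))
```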

diff --git a/extensions-builtin/sd-webui-agent-scheduler b/extensions-builtin/sd-webui-agent-scheduler
diff --git a/extensions-builtin/sd-webui-controlnet b/extensions-builtin/sd-webui-controlnet
diff --git a/html/amethyst-nightfall.jpg b/html/amethyst-nightfall.jpg
diff --git a/html/invokeai.jpg → html/invoked.jpg b/html/invokeai.jpg → html/invoked.jpg
diff --git a/installer.py b/installer.py
@@ -815,9 +815,8 @@ def check_version(offline=False, reset=True): # pylint: disable=unused-argument
     if args.skip_all:
         return
     if not os.path.exists('.git'):
-        log.error('Not a git repository')
-        if not args.ignore:
-            sys.exit(1)
+        log.warning('Not a git repository, all git operations are disabled')
+        args.skip_git = True # pylint: disable=attribute-defined-outside-init
     log.info(f'Version: {print_dict(get_version())}')
     if args.version or args.skip_git:
         return
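
Review note on the `installer.py` hunk: a missing `.git` folder no longer aborts startup. The installer logs a warning and sets `args.skip_git`, which is what makes the zip deployment mentioned in the changelog possible. The sketch below isolates that behaviour. The `disable_git_if_missing` helper and its `root` parameter are hypothetical, and `SimpleNamespace` stands in for the parsed command-line arguments.

```python
import os
import tempfile
from types import SimpleNamespace


def disable_git_if_missing(args, root='.'):
    """Set args.skip_git when root has no .git entry; never exits the process."""
    if not os.path.exists(os.path.join(root, '.git')):
        # mirrors the warning-plus-flag behaviour of the hunk above
        args.skip_git = True
    return args


# an empty temporary folder stands in for an unzipped copy of the project
with tempfile.TemporaryDirectory() as folder:
    args = disable_git_if_missing(SimpleNamespace(skip_git=False), root=folder)
    print(args.skip_git)  # True, since the folder has no .git entry
```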

diff --git a/javascript/contextMenus.js b/javascript/contextMenus.js
@@ -103,7 +103,7 @@ function initContextMenu() {
     }
   };
 
-  for (const tab of ['txt2img', 'img2img']) {
+  for (const tab of ['txt2img', 'img2img', 'control']) {
     for (const el of ['generate', 'interrupt', 'skip', 'pause', 'paste', 'clear_prompt', 'extra_networks_btn']) {
       const id = `#${tab}_${el}`;
       appendContextMenuOption(id, 'Copy to clipboard', () => navigator.clipboard.writeText(document.querySelector(`#${tab}_prompt > label > textarea`).value));

diff --git a/javascript/control.js b/javascript/control.js
@@ -1,3 +1,15 @@
+function controlInputMode(inputMode, ...args) {
+  const tab = gradioApp().querySelector('#control-tab-input button.selected');
+  if (!tab) return ['Select', ...args];
+  inputMode = tab.innerText;
+  if (inputMode === 'Image') {
+    if (!gradioApp().getElementById('control_input_select').classList.contains('hidden')) inputMode = 'Select';
+    else if (!gradioApp().getElementById('control_input_resize').classList.contains('hidden')) inputMode = 'Outpaint';
+    else if (!gradioApp().getElementById('control_input_inpaint').classList.contains('hidden')) inputMode = 'Inpaint';
+  }
+  return [inputMode, ...args];
+}
+
 function setupControlUI() {
   const tabs = ['input', 'output', 'preview'];
   for (const tab of tabs) {
@@ -11,6 +23,18 @@ function setupControlUI() {
       c.style.flexGrow = c.style.flexGrow === '0' ? '9' : '0';
     };
   }
+
+  const el = gradioApp().getElementById('control-input-column');
+  if (!el) return;
+  const intersectionObserver = new IntersectionObserver((entries) => {
+    if (entries[0].intersectionRatio > 0) {
+      const tab = gradioApp().querySelector('#control-tabs > .tab-nav > .selected')?.innerText.toLowerCase() || ''; // selected tab name
+      const btn = gradioApp().getElementById(`refresh_${tab}_models`);
+      if (btn) btn.click();
+    }
+  });
+  intersectionObserver.observe(el); // monitor visibility of tab
+
   log('initControlUI');
 }
 

diff --git a/javascript/progressBar.js b/javascript/progressBar.js
@@ -85,7 +85,7 @@ function requestProgress(id_task, progressEl, galleryEl, atEnd = null, onProgres
   let livePreview;
   let img;
 
-  const init = () => {
+  const initLivePreview = () => {
     img = new Image();
     if (parentGallery) {
       livePreview = document.createElement('div');
@@ -123,7 +123,7 @@ function requestProgress(id_task, progressEl, galleryEl, atEnd = null, onProgres
         return;
       }
       setProgress(res);
-      if (res.live_preview && !livePreview) init();
+      if (res.live_preview && !livePreview) initLivePreview();
       if (res.live_preview && galleryEl) img.src = res.live_preview;
       if (onProgress) onProgress(res);
       setTimeout(() => start(id_task, id_live_preview), opts.live_preview_refresh_period || 500);

diff --git a/javascript/sdnext.css b/javascript/sdnext.css
@@ -75,7 +75,7 @@ button.custom-button{ border-radius: var(--button-large-radius); padding: var(--
 #control-result { padding: 0.5em; }
 #control-inputs { margin-top: 1em; }
 #txt2img_prompt_container, #img2img_prompt_container, #control_prompt_container { margin-right: var(--layout-gap) }
-#txt2img_footer, #img2img_footer, #extras_footer, #control_footer { height: fit-content; display: none; }
+#txt2img_footer, #img2img_footer, #control_footer { height: fit-content; display: none; }
 #txt2img_generate_box, #img2img_generate_box, #control_general_box { gap: 0.5em; flex-wrap: wrap-reverse; height: fit-content; }
 #txt2img_actions_column, #img2img_actions_column, #control_actions_column { gap: 0.3em; height: fit-content; }
 #txt2img_generate_box>button, #img2img_generate_box>button, #control_generate_box>button, #txt2img_enqueue, #img2img_enqueue { min-height: 42px; max-height: 42px; line-height: 1em; }
@@ -87,7 +87,7 @@ button.custom-button{ border-radius: var(--button-large-radius); padding: var(--
 #txt2img_actions_column, #img2img_actions_column, #control_actions { flex-flow: wrap; justify-content: space-between; }
 #txt2img_enqueue_wrapper, #img2img_enqueue_wrapper, #control_enqueue_wrapper { min-width: unset !important; width: 48%; }
 .interrogate-col{ min-width: 0 !important; max-width: fit-content; margin-right: var(--spacing-xxl); }
-.interrogate-col>button{ flex: 1; }
+.interrogate-col>button{ flex: 1; width: 7em; max-height: 84px; }
 #sampler_selection_img2img { margin-top: 1em; }
 #txtimg_hr_finalres{ min-height: 0 !important; }
 #img2img_scale_resolution_preview.block{ display: flex; align-items: end; }
@@ -240,6 +240,7 @@ table.settings-value-table td { padding: 0.4em; border: 1px solid #ccc; max-widt
 .extras { gap: 0.2em 1em !important }
 #extras_generate, #extras_interrupt, #extras_skip { display: block !important; position: relative; height: 36px; }
 #extras_upscale { margin-top: 10px }
+#pnginfo_html_info .gradio-html > div { margin: 0.5em; }
 
 /* log monitor */
 .log-monitor { display: none; justify-content: unset !important; overflow: hidden; padding: 0; margin-top: auto; font-family: monospace; font-size: 0.85em; }

diff --git a/modules/control/proc/canny.py b/modules/control/proc/canny.py
@@ -31,6 +31,5 @@ def __call__(self, input_image=None, low_threshold=100, high_threshold=200, dete
 
         if output_type == "pil":
             detected_map = Image.fromarray(detected_map)
-            detected_map = detected_map.convert('L')
 
         return detected_map
diff --git a/modules/control/proc/edge.py b/modules/control/proc/edge.py
@@ -59,6 +59,5 @@ def __call__(self, input_image=None, pf=True, mode='edge', detect_resolution=512
 
         if output_type == "pil":
             edge_map = Image.fromarray(edge_map)
-            edge_map = edge_map.convert('L')
 
         return edge_map
diff --git a/modules/control/proc/marigold/__init__.py b/modules/control/proc/marigold/__init__.py
@@ -0,0 +1,47 @@
+from PIL import Image
+from modules.control.util import HWC3, resize_image
+from modules import devices
+from .marigold_pipeline import MarigoldPipeline
+
+
+class MarigoldDetector:
+    def __init__(self, model):
+        self.model: MarigoldPipeline = model
+
+    @classmethod
+    def from_pretrained(cls, pretrained_model_or_path, cache_dir=None, **load_config):
+        model = MarigoldPipeline.from_pretrained(pretrained_model_or_path, cache_dir=cache_dir, **load_config)
+        return cls(model)
+
+    def to(self, device):
+        self.model.to(device)
+        return self
+
+    def __call__(
+        self,
+        input_image: Image.Image,
+        denoising_steps: int = 10,
+        ensemble_size: int = 10,
+        processing_res: int = 768,
+        match_input_res: bool = True,
+        color_map: str = "Spectral",
+        output_type=None,
+    ):
+        self.model.to(device=devices.device, dtype=devices.dtype)
+        res = self.model(
+            input_image,
+            denoising_steps=denoising_steps,
+            ensemble_size=ensemble_size,
+            processing_res=processing_res,
+            match_input_res=match_input_res,
+            color_map=color_map if color_map != 'None' else 'Spectral',
+            batch_size=1,
+            show_progress_bar=True,
+        )
+        depth_map = res.depth_np
+        depth_colored = res.depth_colored
+
+        if output_type == "pil":
+            depth_map = Image.fromarray(depth_map)
+
+        return depth_colored if color_map != 'None' else depth_map
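
Review note on `MarigoldDetector.__call__`: the string `'None'` (not the Python value `None`) acts as the switch between the two outputs. The pipeline is always called with a valid colour map, falling back to `'Spectral'`, and the raw depth map is returned only when the caller passed `'None'`. The sketch below restates just that selection logic. Both helper names are hypothetical and plain strings stand in for the real arrays and images.

```python
def resolve_color_map(color_map: str) -> str:
    """Colour map handed to the pipeline; the string 'None' falls back to 'Spectral'."""
    return color_map if color_map != 'None' else 'Spectral'


def select_depth_output(depth_map, depth_colored, color_map: str):
    """Return the coloured render unless the caller asked for 'None'."""
    return depth_colored if color_map != 'None' else depth_map


# placeholders stand in for the real depth arrays/images
print(resolve_color_map('None'))                           # Spectral
print(select_depth_output('gray', 'colored', 'None'))      # gray
print(select_depth_output('gray', 'colored', 'Spectral'))  # colored
```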