
[Bug]: too many tokens in negative causes weird behavior #4

Closed · 2 tasks done
JenXIII opened this issue Jul 18, 2023 · 9 comments


JenXIII commented Jul 18, 2023

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

Have you read the FAQ in the README?

  • I have updated WebUI and this extension to the latest version

What happened?

With fewer than 75 tokens in the negative prompt, generation seems WAI (working as intended):
[screenshot]

If we double the negative prompt, it starts to produce two distinct sets of images:
[screenshot]

The behavior holds at batch size 24 (12 frames of one scene, 12 of another), even with only slightly over 75 tokens in the negative prompt:
[screenshot]

Going down to batch size 14, one half stops following the prompt well:
[screenshot]

This deteriorates further at batch size 12:
[screenshot]

SD starts to collapse at batch size 10:
[screenshot]

Going down to 73 tokens in the negative prompt restores the expected behavior:
[screenshot]

Alternatively, switching the sampler to DDIM with 77 tokens in the negative prompt seems more resistant to collapse, but something is still wrong (noisier output, more washed-out color than before):
[screenshot]

Also of note: with 73 tokens in the negative prompt, batch size 15 works fine:
[screenshot]

But at 77 tokens, it throws an error:

*** Error completing request
*** Arguments: ('task(1iahtw4e5tw20iv)', '(masterpiece), (best quality), (ultra-detailed), photorealistic, (best illustration), (an extremely delicate and beautiful), 1girl, solo, upper body, hiryuuchan, brown hair, brown eyes, (one side up), wind, orange kimono, blue sky, detailed scenery, finely detailed iris, <lora:hiryuu_nai_11-24:1:OUTD>', 'easynegativev2, (bad-hands-5:1), (verybadimagenegative:0.9), error, blurry, jpeg artifacts, cropped, worst quality, low quality, normal quality, (worst quality, low quality:1.4), bad anatomy, (extra hand), extra digits, extra fingers, extra limb, extra arm, bad quality', [], 50, 6, False, False, 1, 1, 6.5, 1728598878.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 0, '', '', [], 0, 0, 0, 0, 0, 0.25, False, False, {'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': <object object at 0x000001448A1D9140>}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': <object object at 0x000001448A1D8550>}, False, False, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, None, 'Refresh models', True, 15, 8, 'mm_sd_v15.ckpt', <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000001448AC69990>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000001448AC6B7F0>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000001448AC68610>, 
'NONE:0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\nALL:1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1\nINS:1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0\nIND:1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0\nIND_PLUS:1,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0\nIND_PLUS_a:1,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0\nIND_PLUS_b:1,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0\nIND_PLUS_c:1,0,0,0,1,1,0,1,0,0,0,0,0,0,0,0,0\nINALL:1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0\nMIDD:1,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD:1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_a:1,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_b:1,1,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_c:1,1,1,0,1,1,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_d:1,1,1,1,0,1,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_e:1,1,1,1,1,0,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_de:1,1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_f:1,1,1,1,1,1,0,1,1,1,1,1,0,0,0,0,0\nINS_MIDD_g:1,1,1,1,1,1,1,0,1,1,1,1,0,0,0,0,0\nINS_MIDD_h:1,1,1,1,1,1,1,1,0,1,1,1,0,0,0,0,0\nINS_MIDD_i:1,1,1,1,1,1,1,1,1,0,1,1,0,0,0,0,0\nINS_MIDD_j:1,1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0\nINS_MIDD_k:1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0\nOUTD:1,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0\nOUTD_1:1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0\nOUTD_2:1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0\nOUTD_3:1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0\nOUTD_4:1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0\nOUTD_12:1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0\nOUTD_23:1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0\nOUTD_34:1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0\nOUTD_13:1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0\nOUTD_14:1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0\nOUTD_24:1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0\nOUTD_234:1,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0\nOUTD_134:1,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0\nOUTD_124:1,0,0,0,0,0,0,0,1,1,0,1,0,0,0,0,0\nOUTD_123:1,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0\nOUTS:1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1\nOUTALL:1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1\nOUTALL_a:1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0\nOUTALL_b:1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,1\nOUTALL_c:1,0,0,0,0,0,0,0,1,1,1,1,1,1,0,1,1\nOUTALL_d:1,0,0,0,0,0,0,0,1,1,1,1,1,0,1,1,1\nOUTALL_e:1,0,0,0,0,0,0,0,1,1,1,1,0,1,1,1,1\nMIDD_OUTS:1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1\nINS_OUTD:1,1,1,1,0,0,0,1,1,1,1,1,0,0,0,0,0\nALL0.5:0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5\nLNONE:0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\nLALL:1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1\nLINS:1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\nLIND:1,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0\nLINALL:1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0\nLMIDD:1,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0\nLOUTD:1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0\nLOUTS:1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1\nLOUTALL:1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1', True, 0, 'values', '0,0.25,0.5,0.75,1', 'Block ID', 'IN05-OUT05', 'none', '', '0.5,1', 'BASE,IN00,IN01,IN02,IN03,IN04,IN05,IN06,IN07,IN08,IN09,IN10,IN11,M00,OUT00,OUT01,OUT02,OUT03,OUT04,OUT05,OUT06,OUT07,OUT08,OUT09,OUT10,OUT11', 1.0, 'black', '20', False, 'ATTNDEEPON:IN05-OUT05:attn:1\n\nATTNDEEPOFF:IN05-OUT05:attn:0\n\nPROJDEEPOFF:IN05-OUT05:proj:0\n\nXYZ:::1', False, '\n            <h3><strong>Combinations</strong></h3>\n            Choose a number of terms from a list, in this case we choose two artists\n            <code>{2$$artist1|artist2|artist3}</code>\n            If $$ is not provided, then 1$$ is assumed.\n            <br>\n            A range can be provided:\n            <code>{1-3$$artist1|artist2|artist3}</code>\n            In this case, a random number of artists between 1 and 3 is chosen.\n            <br/><br/>\n\n            
<h3><strong>Wildcards</strong></h3>\n            <p>Available wildcards</p>\n            <ul>\n        <li>__angle__</li><li>__background__</li><li>__bra_colors__</li><li>__bra_patterns__</li><li>__bra_type__</li><li>__clothing__</li><li>__footwear__</li><li>__limbwear__</li><li>__location__</li><li>__underwear__</li><li>__view__</li></ul>\n            <br/>\n            <code>WILDCARD_DIR: scripts/wildcards</code><br/>\n            <small>You can add more wildcards by creating a text file with one term per line and name is mywildcards.txt. Place it in scripts/wildcards. <code>__mywildcards__</code> will then become available.</small>\n        ', False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, None, None, False, None, None, False, None, None, False, 50) {}
    Traceback (most recent call last):
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\call_queue.py", line 55, in f
        res = list(func(*args, **kwargs))
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\call_queue.py", line 35, in f
        res = func(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\txt2img.py", line 57, in txt2img
        processed = processing.process_images(p)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\processing.py", line 620, in process_images
        res = process_images_inner(p)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
        return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\processing.py", line 739, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\processing.py", line 992, in sample
        samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 439, in sample
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 278, in launch_sampling
        return func()
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 439, in <lambda>
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 626, in sample_dpmpp_2m_sde
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 177, in forward
        x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond=make_condition_dict(c_crossattn, image_cond_in[a:b]))
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
        return self.__orig_func(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
        x_recon = self.model(x_noisy, t, **cond)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1329, in forward
        out = self.diffusion_model(x, t, context=cc)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\modules\sd_unet.py", line 91, in UNetModel_forward
        return ldm.modules.diffusionmodules.openaimodel.copy_of_UNetModel_forward_for_webui(self, x, timesteps, context, *args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 776, in forward
        h = module(h, emb, context)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\extensions\sd-webui-animatediff\scripts\animatediff.py", line 21, in mm_tes_forward
        x = layer(x, context)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Novel AI Diffusion\Stable Diffusion git\stable-diffusion-webui\extensions\sd-webui-animatediff\motion_module.py", line 76, in forward
        hidden_states = torch.stack([input_cond, input_uncond], dim=0)
    RuntimeError: stack expects each tensor to be equal size, but got [8, 320, 64, 64] at entry 0 and [7, 320, 64, 64] at entry 1
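
For reference, the final RuntimeError is simply torch.stack refusing tensors whose shapes differ: the sliced batch reaching the motion module apparently had 15 entries, and splitting it into supposed cond and uncond halves produced mismatched tensors of 8 and 7. A minimal standalone reproduction of the same failure:

```python
import torch

# torch.stack requires all tensors to share one shape; a batch-size
# mismatch (8 vs 7) reproduces the error from the traceback above.
input_cond = torch.randn(8, 320, 64, 64)
input_uncond = torch.randn(7, 320, 64, 64)
torch.stack([input_cond, input_uncond], dim=0)
# RuntimeError: stack expects each tensor to be equal size,
# but got [8, 320, 64, 64] at entry 0 and [7, 320, 64, 64] at entry 1
```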

Steps to reproduce the problem

See attached screenshots

What should have happened?

It should apply the same conditioning to all frames.

Commit where the problem happens

webui:
version: v1.4.1  •  python: 3.10.6  •  torch: 2.0.1+cu118  •  xformers: N/A  •  gradio: 3.32.0  •  checkpoint: e9a14f558d

extension:
sd-webui-animatediff https://github.com/continue-revolution/sd-webui-animatediff master [e8c88a4]

What browsers do you use to access the UI?

No response

Command Line Arguments

--opt-sdp-attention --no-half-vae

Console logs

See above

Additional information

No response


WuKaiYi commented Jul 18, 2023

Same situation here; most of the time the images fail to stay consistent. That said, many thanks to the author for making this extension.


thezveroboy commented Jul 18, 2023

The problem occurs if there are more than 75 tokens in either the positive or the negative prompt.

@xdomiall

Thank you so much for creating this extension! I can also confirm that 75 tokens is the limit for the positive/negative prompts before the frames get split down the middle.

@continue-revolution (Owner)

Interesting. I will use your example to test on my side, and will look into how A1111 implements infinite-length prompts.
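
For readers unfamiliar with the mechanism: A1111 works around CLIP's 77-token window by encoding the prompt in 75-token chunks (plus BOS/EOS per chunk) and concatenating the chunk embeddings along the sequence axis. A rough sketch of the idea, not the actual webui code; `encode_chunk` here is a stand-in for the CLIP text encoder:

```python
import torch

def encode_long_prompt(token_ids, encode_chunk, chunk_size=75):
    """Encode a prompt of arbitrary length in 75-token chunks.

    token_ids: list of token ids (without BOS/EOS).
    encode_chunk: stand-in for the CLIP text encoder; assumed to pad a
    chunk to 77 tokens and return a [77, 768] embedding.
    """
    chunks = [token_ids[i:i + chunk_size]
              for i in range(0, len(token_ids), chunk_size)]
    embeddings = [encode_chunk(chunk) for chunk in chunks]
    # 76 tokens -> 2 chunks -> a [154, 768] context. If the positive and
    # negative prompts end up with different chunk counts, the sampler
    # can no longer batch cond and uncond together and processes them in
    # separate slices, which is what trips up the motion module here.
    return torch.cat(embeddings, dim=0)
```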


Balladie commented Jul 24, 2023

Same problem here; temporarily resolved by keeping both the positive and negative prompts to 75 tokens or fewer.


xdomiall commented Jul 24, 2023

Seems like the latest update (08a4086) completely broke generation; now even with fewer than 75 tokens the animation gets split into two:
[screenshot]


exol1n commented Jul 25, 2023

In my case, using fewer than 75 tokens did not help before either.
RTX 3080 10 GB
set COMMANDLINE_ARGS= --xformers --medvram --no-half-vae
[GIF]


Miczu commented Sep 10, 2023

So I messed around in the code and found that changing one value from 2 to 1 helps my generations. In motion_module.py:

```python
def forward(self, hidden_states, encoder_hidden_states=None, attention_mask=None):
    video_length = hidden_states.shape[0] // 1  # was // 2; TODO: config this value in scripts
```

Tested on the same seed with all the same parameters. So far it always produces a single animation, instead of my previous experience of two different animations perfectly split in half (regardless of the video length).

Easy change to test, so anyone is free to give feedback.
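
A hedged reading of why this change helps: the motion module apparently assumes every UNet forward pass carries both the cond and uncond copies of each frame stacked along the batch axis (2 * video_length entries) and splits that batch in half. When the webui runs cond and uncond in separate passes, which happens with prompts over 75 tokens and reportedly under --medvram, each pass holds only video_length entries, so dividing by 2 halves the perceived video length and the animation splits in two. Below is a purely illustrative sketch of a divisor that adapts to the actual batch contents; `expected_video_length` is an assumed parameter, not part of the extension:

```python
import torch

def infer_video_length(hidden_states: torch.Tensor, expected_video_length: int) -> int:
    """Guess how many frames this batch holds.

    If cond and uncond were batched together, the leading dim is
    2 * expected_video_length; if they arrive in separate passes
    (long prompts, --medvram), it is expected_video_length or less.
    """
    n = hidden_states.shape[0]
    if n == 2 * expected_video_length:
        return expected_video_length  # cond + uncond stacked together
    return n  # single pass: every entry is one frame
```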

@continue-revolution (Owner)

#83
