How are regularization images used during LoRA training? #2056
Thank you for the explanation. I understand why we would want to use regularization images. However, I still don't know how kohya_ss actually uses them. My original question is still open: "Are they used during training the same way as normal images are, except that the prompt is changed in some way?"
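For what it's worth, the regularization-image idea comes from DreamBooth's "prior preservation" setup, and under that reading the answer would be: yes, the reg images go through the same training step as normal images, except they are captioned with the bare class prompt and their loss gets an extra weight. Below is a minimal sketch of that recipe with illustrative names only; I can't confirm from this thread that kohya_ss implements it exactly this way.

```python
# Hedged sketch of DreamBooth-style "prior preservation", which is where
# regularization images come from. Names (training_step, prior_loss_weight,
# batch keys) are illustrative; this is NOT kohya_ss's actual code.
import torch
import torch.nn.functional as F

def training_step(unet, noise_scheduler, instance_batch, reg_batch,
                  prior_loss_weight=1.0):
    """Instance and regularization images go through the same denoising loss;
    reg images are just captioned with the bare class prompt (e.g. "a photo
    of a man") and their loss is added with a weight."""
    def denoise_loss(batch):
        latents = batch["latents"]                       # pre-encoded VAE latents
        noise = torch.randn_like(latents)
        timesteps = torch.randint(0, 1000, (latents.shape[0],),
                                  device=latents.device)
        noisy = noise_scheduler.add_noise(latents, noise, timesteps)
        pred = unet(noisy, timesteps, batch["text_embeddings"]).sample
        return F.mse_loss(pred, noise)

    instance_loss = denoise_loss(instance_batch)   # your subject images + captions
    prior_loss = denoise_loss(reg_batch)           # reg images + plain class caption
    return instance_loss + prior_loss_weight * prior_loss
```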
I don't use reg images, and I don't even use captions unless I struggle to get the focus right. The original training papers don't even mention regularization images. Since the method and the math originate from those papers, I'm confident that random people making anime titties aren't any more knowledgeable.
What tilts me a fair bit is this: beyond some of the more arcane aspects, all the details on how these things work are in the papers, the GitHub repos, and similar documentation. Obviously you can't always know whether something is actually implemented and working correctly. My bar for whether a source is worth a damn is really whether it cites its sources properly, and I don't mean "link to where I read this" but citing and sourcing in a manner that would pass in an academic or professional setting. As for the discussion of reg images: if you get good quality results that meet the criteria you set without them, do you actually need them? Some issues people say can and should be solved with reg images, I have solved by switching the model I train from, only going back to base SDXL if I am desperate. Currently the best base I have come across, and the one I use, is FluentlyXL; no idea why, but it just gives me cleaner results.
Not the "passage of time" per se. Since all the random components in the training are generated from clock, which is fiddly piece of tech. Minute alternations on what the time is affect things. Nothing in these are 100% deterministic - we are talking about something that is inherently statistical afterall. But the time is never the reason why something works or doesn't work. It is just... tiny little things. But tiny little things can cause noticeable differences, but not to degree where it is "did work" or "didn't work". Like I talk alot about this stuff, since it is my current obsessive hobby (well training is. I am quite bad at generating, I just like the puzzle of training). And I always wonder whether I am doing "something wrong" to what other people are doing, because I can get things done just fine without captions, reg images, or... other bullshit reddit and whatever deems absolutely mandatory for getting good quality results. I don't do any of that and I get good results. If I can't get something to wrong I deep dive to documentation and papers, and the answers are generally just there in clear print. |
Just my five cents from experience about class images and tagging: I can't support the statement that tagging is unimportant. It is very important if the training subject includes multiple concepts, and a face already contains plenty of sub-concepts. Every describable feature is its own concept, and everything you want to be changeable needs to be tagged accurately. The way you need to tag can vary a lot, though: anime-style models tend to do well with keywords, realistic models more with proper sentences. Starting a caption with "a photo of" can make a big difference in how quickly you get a good quality result when training on realistic images (see the sketch below). It does not mean you need to use the term "photo" later when rendering images, but it helps the model move the right weights during training.
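If you want to try the "a photo of" prefix without editing every caption by hand, a few lines of Python will do it. This assumes the common convention of one .txt caption file per image; the folder name and prefix below are just examples.

```python
# Hedged sketch, assuming one .txt caption per image in the dataset folder.
# The folder name and the prefix are illustrative.
from pathlib import Path

dataset_dir = Path("train_data/10_mysubject")    # hypothetical image folder
prefix = "a photo of "

for caption_file in dataset_dir.glob("*.txt"):
    text = caption_file.read_text(encoding="utf-8").strip()
    if not text.lower().startswith(prefix.strip()):
        caption_file.write_text(prefix + text + "\n", encoding="utf-8")
```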
How are regularization images used during LoRA training? Are they used during training the same way as normal images are, except that the prompt is changed in some way?