Make a more diverse dataset. You basically have just 4 variants of the exact same picture. Furkan's examples aren't that good either: you can see the limited angles and expressions, which is because they generally come from fairly limited datasets. I can't comment beyond that, as everything is locked behind paywalls.

Anyhow. Take pictures of your face from 4-5 angles at 4-5 heights, and then vary the distance across 3-5 steps. Use a slightly different expression in each one.

These training methods try to find what the images in the dataset have in common. The model has no idea what it is supposed to be learning; it just builds a statistical model from the dataset. You can train a LoRA with about 10 good pictures, or 5 if the subject belongs to a very clear category and you have regularization images to support it, as in "this is just a type of that". You can compensate for lower-quality and less clear pictures by having more of them, but only to a certain extent; bad ones just need to be removed.

This is something YouTube videos don't teach you, especially the clickbaity ones.
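To make the shooting plan above concrete, here is a rough Python sketch that turns it into a checklist. The labels for angles, heights, distances and expressions are made up for illustration; only the counts (4-5 angles, 4-5 heights, 3-5 distances) come from the advice above, and `build_shot_list` is a hypothetical helper, not part of kohya_ss.

```python
import random
from itertools import product

# Hypothetical labels: the advice only gives counts, not names.
ANGLES = ["left profile", "left three-quarter", "front",
          "right three-quarter", "right profile"]
HEIGHTS = ["well below eye level", "slightly below eye level",
           "eye level", "slightly above eye level"]
DISTANCES = ["close-up", "head and shoulders", "waist up"]
EXPRESSIONS = ["neutral", "slight smile", "broad smile", "serious", "surprised"]


def build_shot_list(n_shots, seed=0):
    """Pick n_shots distinct (angle, height, distance) combinations
    and cycle through the expressions so that neighbours differ."""
    rng = random.Random(seed)
    combos = list(product(ANGLES, HEIGHTS, DISTANCES))
    rng.shuffle(combos)
    return [
        {
            "angle": angle,
            "height": height,
            "distance": distance,
            "expression": EXPRESSIONS[i % len(EXPRESSIONS)],
        }
        for i, (angle, height, distance) in enumerate(combos[:n_shots])
    ]


shots = build_shot_list(20)
for number, shot in enumerate(shots, start=1):
    print(f"{number:02d}: {shot['angle']}, {shot['height']}, "
          f"{shot['distance']}, {shot['expression']}")
```

Shooting every combination would give far more pictures than you need, so the sketch samples a subset. After shooting, you would still go through the results by hand and throw out the bad ones.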
Issue generating headshot photos of myself with SDXL LoRA training
I am using kohya_ss with the SDXL model to generate headshot photos of myself, but the generated face does not closely resemble mine.
Here are my inputs and params:
Input photos:
Params:
%cd /kaggle/working/kohya_ss/
!accelerate launch --num_cpu_threads_per_process=2 "./sdxl_train_network.py" \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --train_data_dir="/kaggle/working/training-data/img" \
  --output_dir="/content/drive/MyDrive/model-sdxl" \
  --logging_dir="/kaggle/working/training-data/log" \
  --reg_data_dir="/kaggle/working/training-data/reg" \
  --resolution="768,768" \
  --network_alpha="16" \
  --network_dim=32 \
  --save_model_as=safetensors \
  --network_module=networks.lora \
  --text_encoder_lr=0.0004 \
  --unet_lr=0.0004 \
  --output_name="dreambooth-lora" \
  --lr_scheduler_num_cycles="30" \
  --no_half_vae \
  --learning_rate=1e-5 \
  --lr_scheduler="cosine" \
  --train_batch_size="1" \
  --max_train_steps=5000 \
  --save_every_n_epochs="1" \
  --mixed_precision="fp16" \
  --save_precision="fp16" \
  --optimizer_type="Adafactor" \
  --optimizer_args scale_parameter=False relative_step=False warmup_init=False \
  --max_data_loader_n_workers="0" \
  --gradient_checkpointing \
  --xformers \
  --lowram \
  --face_crop_aug_range="2.0,4.0" \
  --clip_skip 1 --noise_offset=0.1 \
  --mem_eff_attn --cache_latents --cache_latents_to_disk
Output generated:
I have tried increasing the number of steps and the learning rate, but then the generated images are overcooked (overfitting): the model gives back the input images as the generated images.
I am expecting results like Mr. FurkanGozukara's. He has managed to generate better results.
Note: I am using a caption for each image and 2500 reg images.
Thank you!