feat: hpcai opensora 1.2 - VAE 3D training #621
Merged
Commits (29)
07c32a8 add vae3d training (SamitHuang)
9d1b10b rm redundant (SamitHuang)
585111b fix format (SamitHuang)
a2af062 vae split tuple (SamitHuang)
6d4d228 rewrite micro_frame_size impl, vae3d reconstruct ok (SamitHuang)
d404a39 use make_tuple, reconstruct & vae static train ok (SamitHuang)
38400aa better micro batch/frame size writing (SamitHuang)
c571585 add dynamic shape script (SamitHuang)
8c5ac03 fix empty ckpt (SamitHuang)
22f8075 fix empty ckpt (SamitHuang)
1c8af61 debug: add dynamic shape support (SamitHuang)
18b5288 update (SamitHuang)
f0ab936 fix min(a,b) in dynamic shape (SamitHuang)
86426e3 Merge branch 'pr_vae1.2_train' of https://github.com/samithuang/mindo… (SamitHuang)
54ac478 default dynamic shape in script (SamitHuang)
8b7c9c9 debug: update dyanmic shape train script (SamitHuang)
fac4272 Merge branch 'master' into pr_vae1.2_train (SamitHuang)
1f5aba1 download lpips auto (SamitHuang)
7a4bf32 fix typo and linting (SamitHuang)
88de7e2 rm redundancy (SamitHuang)
a93bcb5 small fix (SamitHuang)
2b5a31e linting (SamitHuang)
a0b0dcc jit_level O0 for less overflow (SamitHuang)
9979d8d fix docs (SamitHuang)
6c00ed3 rm redundancy (SamitHuang)
11b80cc rm file (SamitHuang)
2f95e96 update doc (SamitHuang)
65baffc update perf (SamitHuang)
ca06f32 update config (SamitHuang)
@@ -0,0 +1,47 @@
# model
model_type: "OpenSoraVAE_V1_2"
freeze_vae_2d: True
pretrained_model_path: "models/sdxl_vae.ckpt"

# loss
perceptual_loss_weight: 0.1
kl_loss_weight: 1.e-6
use_real_rec_loss: False
use_z_rec_loss: True
use_image_identity_loss: True
mixed_strategy: "mixed_video_image"
mixed_image_ratio: 0.2

# data
dataset_name: "video"
csv_path: "../videocomposer/datasets/webvid5_copy.csv"
video_folder: "../videocomposer/datasets/webvid5"
frame_stride: 1
num_frames: 17
image_size: 256

micro_frame_size: null
micro_batch_size: null

# training recipe
seed: 42
use_discriminator: False
dtype: "fp16"
batch_size: 1
clip_grad: True
max_grad_norm: 1.0
start_learning_rate: 1.e-5
scale_lr: False
use_recompute: False

epochs: 2000
ckpt_save_interval: 100
init_loss_scale: 1.

scheduler: "constant"
use_ema: False

output_path: "outputs/causal_vae"

# ms setting
jit_level: O0
@@ -0,0 +1,48 @@
# model
model_type: "OpenSoraVAE_V1_2"
freeze_vae_2d: False
pretrained_model_path: "outputs/vae_stage1.ckpt"

# loss
perceptual_loss_weight: 0.1
kl_loss_weight: 1.e-6
use_real_rec_loss: False
use_z_rec_loss: True
use_image_identity_loss: False
mixed_strategy: "mixed_video_image"
mixed_image_ratio: 0.2

# data
dataset_name: "video"
csv_path: "../videocomposer/datasets/webvid5_copy.csv"
video_folder: "../videocomposer/datasets/webvid5"
frame_stride: 1
num_frames: 17
image_size: 256

micro_frame_size: null
micro_batch_size: null
# flip: True

# training recipe
seed: 42
use_discriminator: False
dtype: "bf16"
batch_size: 1
clip_grad: True
max_grad_norm: 1.0
start_learning_rate: 1.e-5
scale_lr: False
use_recompute: True

epochs: 500
ckpt_save_interval: 100
init_loss_scale: 1.

scheduler: "constant"
use_ema: False

output_path: "outputs/vae_stage2"

# ms setting
jit_level: O0
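The first two configs differ in only a handful of keys, which is easy to miss when reading them side by side. The sketch below copies the relevant values from the two files above into plain dictionaries and prints the keys that change. The names `first_cfg`, `second_cfg`, and `diff_configs` are illustrative and not part of the PR, and reading the two files as consecutive training stages is an inference from the checkpoint paths, not something the diff states.

```python
# Values copied from the two config files above. Only a subset of keys is
# listed: the ones that differ, plus a few shared ones for contrast.
first_cfg = {
    "freeze_vae_2d": True,
    "pretrained_model_path": "models/sdxl_vae.ckpt",
    "use_image_identity_loss": True,
    "dtype": "fp16",
    "use_recompute": False,
    "epochs": 2000,
    "output_path": "outputs/causal_vae",
    "num_frames": 17,
    "image_size": 256,
    "start_learning_rate": 1e-5,
}
second_cfg = {
    "freeze_vae_2d": False,
    "pretrained_model_path": "outputs/vae_stage1.ckpt",
    "use_image_identity_loss": False,
    "dtype": "bf16",
    "use_recompute": True,
    "epochs": 500,
    "output_path": "outputs/vae_stage2",
    "num_frames": 17,
    "image_size": 256,
    "start_learning_rate": 1e-5,
}


def diff_configs(a, b):
    """Return {key: (a_value, b_value)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


changed = diff_configs(first_cfg, second_cfg)
for key, (old, new) in changed.items():
    print(f"{key}: {old!r} -> {new!r}")
```

Under that reading, the second config unfreezes the 2D VAE, starts from the first run's checkpoint, drops the image identity loss, and enables recompute, while the data and optimizer settings stay the same.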
@@ -0,0 +1,49 @@
# model
model_type: "OpenSoraVAE_V1_2"
freeze_vae_2d: False
pretrained_model_path: "outputs/vae_stage2.ckpt"

# loss
perceptual_loss_weight: 0.1
kl_loss_weight: 1.e-6
use_real_rec_loss: True
use_z_rec_loss: False
use_image_identity_loss: False
mixed_strategy: "mixed_video_image" # TODO: use mixed_video_random after dynamic shape adaptation
mixed_image_ratio: 0.2

# data
dataset_name: "video"
csv_path: "../videocomposer/datasets/webvid5_copy.csv"
video_folder: "../videocomposer/datasets/webvid5"

Review comment on lines +17 to +18:
Referencing the parent-level videocomposer ("vc") directory feels a bit odd. Could the data be copied into the current directory?
Reply: The video files are fairly large; keeping them there avoids growing the repo.

frame_stride: 1
num_frames: 33 # TODO: set 33 after dynamic shape adaptation and posterior concat fixed
image_size: 256

micro_frame_size: 17
micro_batch_size: 4
# flip: True

# training recipe
seed: 42
use_discriminator: False
dtype: "fp16"
batch_size: 1
clip_grad: True
max_grad_norm: 1.0
start_learning_rate: 1.e-5
scale_lr: False
weight_decay: 0.
use_recompute: True

epochs: 400
ckpt_save_interval: 100
init_loss_scale: 1.

scheduler: "constant"
use_ema: False

output_path: "outputs/vae_stage3"

# ms setting
jit_level: O0
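This last config is the only one that sets `micro_frame_size` (17) alongside `num_frames: 33`. The sketch below shows one plausible reading of that setting: the temporal axis is processed in consecutive chunks of at most `micro_frame_size` frames, with a shorter final chunk. The PR's actual implementation is not shown in this diff, so `split_frames` is a hypothetical helper and the chunking rule is an assumption.

```python
def split_frames(num_frames, micro_frame_size):
    """Return (start, end) index pairs that cover num_frames frames.

    Assumed behaviour, not taken from the PR: None disables splitting;
    otherwise each chunk holds at most micro_frame_size frames and the
    last chunk may be shorter.
    """
    if micro_frame_size is None:
        return [(0, num_frames)]
    return [
        (start, min(start + micro_frame_size, num_frames))
        for start in range(0, num_frames, micro_frame_size)
    ]


# Values from the config above: 33 frames, chunks of at most 17 frames.
chunks = split_frames(33, 17)
print(chunks)

# The two earlier configs use micro_frame_size: null, i.e. no splitting.
print(split_frames(17, None))
```

With these values the clip splits into one chunk of 17 frames and one of 16, so any code that concatenates per-chunk outputs has to cope with unequal chunk lengths.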
--mixed_strategy
This parameter name feels somewhat unclear; it does not convey that it is a video/image sampling strategy.
Reply: Agreed. For now the name is kept aligned with the parameter name in the torch version.
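To make the discussion about `--mixed_strategy` concrete, here is a minimal sketch of what such a sampling strategy could look like. It assumes that `mixed_video_image` turns a training sample into a single-frame image with probability `mixed_image_ratio`, and that `mixed_video_random` (mentioned in a TODO in the config) draws a random clip length. Neither assumption is confirmed by the diff shown here, and `pick_num_frames` is a hypothetical helper, not a function from the PR.

```python
import random


def pick_num_frames(num_frames, mixed_strategy, mixed_image_ratio, rng):
    """Choose how many frames of a clip to use for one training sample.

    Assumed semantics (not confirmed by the PR):
    - "mixed_video_image": use 1 frame with probability mixed_image_ratio,
      otherwise the full clip.
    - "mixed_video_random": use a uniformly random length in [1, num_frames].
    - anything else: always use the full clip.
    """
    if mixed_strategy == "mixed_video_image":
        return 1 if rng.random() < mixed_image_ratio else num_frames
    if mixed_strategy == "mixed_video_random":
        return rng.randint(1, num_frames)
    return num_frames


rng = random.Random(42)  # fixed seed so the run is repeatable
lengths = [pick_num_frames(17, "mixed_video_image", 0.2, rng) for _ in range(1000)]
image_fraction = lengths.count(1) / len(lengths)
print(f"fraction of image-like samples: {image_fraction:.2f}")
```

Under this reading the option controls how video and image samples are mixed during training, which is the meaning the reviewer felt the name does not express.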