feat: hpcai opensora 1.2 - VAE 3D training #621
… into pr_vae1.2_train
@@ -51,7 +51,7 @@ def encode_with_moments_output(self, x):
 """For latent caching usage"""
 h = self.encoder(x)
 moments = self.quant_conv(h)
-mean, logvar = self.split(moments, moments.shape[1] // 2, 1)
+mean, logvar = mint.split(moments, moments.shape[1] // 2, 1)
Why must this be `mint`?
fixed
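For readers following this thread: the changed line splits the `quant_conv` output along the channel axis into two equal halves, the mean and the log-variance. Below is a minimal sketch of that split, with NumPy standing in for MindSpore and made-up shapes used purely for illustration. Note that `np.split` takes the number of sections, whereas the call in the diff passes `moments.shape[1] // 2`, which reads as a chunk size.

```python
import numpy as np

# Hypothetical shape: (batch, 2 * latent_channels, frames, height, width)
moments = np.arange(2 * 8 * 4 * 2 * 2, dtype=np.float32).reshape(2, 8, 4, 2, 2)

# Split the channel axis (axis 1) into two equal sections.
mean, logvar = np.split(moments, 2, axis=1)

# Equivalent slicing by chunk size, mirroring the size-based call in the diff.
half = moments.shape[1] // 2
mean_by_slice, logvar_by_slice = moments[:, :half], moments[:, half:]
```

Both forms give a mean and a log-variance with half the channels of `moments` each.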
What's the difference between this and tools/convert_vae_3d.py?
rm redundant file
52635a2 to 7a4bf32
06aded8 to a93bcb5
scheduler: "constant"
use_ema: False

output_path: "outputs/vae_stage2"
-output_path: "outputs/vae_stage2"
+output_path: "outputs/vae_stage3"
examples/opensora_hpcai/README.md
Outdated

### Data Preprocess
If you want to train your own VAE, you need to prepare the data in a csv file following the [data processing](#data-processing) pipeline, then run the following commands.
Note that you need to adjust the number of training epochs (`epochs`) in the config file according to the size of your own csv data.
Does this mean the epoch size is set during data processing? Could the data be repeated at actual training time instead?
fixed
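As a rough illustration of why `epochs` depends on the csv size: for a fixed target number of optimizer steps, a smaller dataset needs more epochs. The helper below is a hypothetical sketch, not part of the PR. It assumes no gradient accumulation and that incomplete batches are dropped, and all numbers are made up.

```python
import math


def epochs_for_target_steps(num_samples, batch_size, num_devices, target_steps):
    """Return the number of epochs needed to reach at least `target_steps` steps."""
    global_batch = batch_size * num_devices
    # Assumption: the last incomplete global batch of each epoch is dropped.
    steps_per_epoch = num_samples // global_batch
    if steps_per_epoch == 0:
        raise ValueError("dataset is smaller than one global batch")
    return math.ceil(target_steps / steps_per_epoch)


# Hypothetical setup: 10,000 csv rows, batch size 1 on 8 devices, 50,000 target steps.
epochs = epochs_for_target_steps(
    num_samples=10_000, batch_size=1, num_devices=8, target_steps=50_000
)
```

With these made-up numbers each epoch contributes 1,250 steps, so halving the csv size would roughly double the required `epochs`.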
examples/opensora_hpcai/README.md
Outdated
| Model | Context | jit_level | Precision | BS | NPUs | Resolution(framesxHxW) | Train T. (s/step) | PSNR | SSIM |
|:------------|:-------------|:--------|:---------:|:--:|:----:|:----------------------:|:-----------------:|:-----------------:|:-----------------:|
| STDiT2-XL/2 | D910\*-[CANN C18(0705)](https://repo.mindspore.cn/ascend/ascend910/20240705/)-[MS2.3](https://www.mindspore.cn/install) | O1 | BF16 | 1 | 8 | 17x256x256 | 0.97 | 29.29 | 0.88 |
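1. If performance numbers are available for all 3 stages, please add them as well.
2. If a parallelism strategy or data sinking is used, I suggest adding that too.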
| STDiT2-XL/2 | D910\*-[CANN C18(0705)](https://repo.mindspore.cn/ascend/ascend910/20240705/)-[MS2.3](https://www.mindspore.cn/install) | O1 | BF16 | 1 | 8 | 17x256x256 | 0.97 | 29.29 | 0.88 |
> Context: {G:GPU, D:Ascend}{chip type}-{mindspore version}.

Note that stage 3 is trained with the mixed video and image strategy, i.e. `--mixed_strategy=mixed_video_image`, instead of a random number of frames (`mixed_video_random`). Random frame training will be supported in the future.
The `--mixed_strategy` parameter name feels a bit unclear; it does not convey the meaning of a video/image sampling strategy.
Agreed. For now it is kept aligned with the parameter name used in the torch version.
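To make the two option values discussed in this thread concrete, here is a hedged sketch of what such a sampling strategy could look like. It reflects my reading of the option names only. The function `apply_mixed_strategy` and the `mixed_image_ratio` parameter are hypothetical, NumPy stands in for the real tensor type, and the actual implementation in this PR or in the torch version may differ.

```python
import random

import numpy as np


def apply_mixed_strategy(x, mixed_strategy=None, mixed_image_ratio=0.2, rng=random):
    """Trim a batch shaped (batch, channels, frames, height, width) along frames."""
    if mixed_strategy == "mixed_video_image":
        # With probability mixed_image_ratio, keep only the first frame,
        # so the batch is treated as images instead of a video clip.
        if rng.random() < mixed_image_ratio:
            x = x[:, :, :1]
    elif mixed_strategy == "mixed_video_random":
        # Keep a random-length prefix of the clip (at least one frame).
        length = rng.randint(1, x.shape[2])
        x = x[:, :, :length]
    return x


video = np.zeros((1, 3, 17, 256, 256), dtype=np.float32)
rng = random.Random(0)  # seeded only to make the sketch reproducible
batch = apply_mixed_strategy(video, "mixed_video_image", rng=rng)
```

Under this reading, `mixed_video_image` yields either the full 17-frame clip or a single frame, while `mixed_video_random` yields any length from 1 to 17 frames, which is where dynamic shapes would come in.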
csv_path: "../videocomposer/datasets/webvid5_copy.csv"
video_folder: "../videocomposer/datasets/webvid5"
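Referencing the parent-level videocomposer (vc) directory feels a bit odd. Could the data be copied into the current directory?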
The video files are fairly large; referencing them avoids increasing the repo size.
@@ -18,3 +18,4 @@ tokenizers
 sentencepiece
 transformers
 pyav
+mindcv
Should a specific version be pinned?
fixed
# dynamic shape acceleration
export MS_DEV_ENABLE_KERNEL_PACKET=on
What does this parameter do? Does it need to be enabled whenever dynamic shape is used?
yes
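If the training entry point is launched from Python rather than from the shell script, the same switch could be set programmatically. This is only a sketch. It assumes the variable has to be present in the process environment before MindSpore is imported, which the thread does not confirm.

```python
import os

# Same switch as `export MS_DEV_ENABLE_KERNEL_PACKET=on` in the shell script.
# Assumption: it must be set before MindSpore is imported to take effect.
os.environ["MS_DEV_ENABLE_KERNEL_PACKET"] = "on"

# `import mindspore` and the rest of the training setup would follow here.
```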
# --ckpt_path models/OpenSora-VAE-v1.2/model.ckpt \
# --ckpt_path outputs/vae_stage2.ckpt \
# --device_target GPU \
# --crop_size 256 \
# --ckpt_path /home/mindocr/yx/mindone/examples/opensora_hpcai/models/v1.2/vae.ckpt \
# --ckpt_path /home/mindocr/yx/mindone/examples/opensora_hpcai/models/sd-vae-ft-ema.ckpt \
I suggest tidying up the indentation.
rm
c09dfb3 to 2f95e96
6d06bfa to ca06f32
What does this PR do?
Fixes # (issue)
Adds # (feature)
VAE 3D training for hpcai opensora 1.2 including:
Evaluation is done on the UCF-101 dataset, resulting in a PSNR of about 29.
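For reference, PSNR between an original clip and its VAE reconstruction is commonly computed from the mean squared error as sketched below. The PR description does not state the pixel range or whether values are averaged per frame or per video, so the `max_value=1.0` default and the whole-array averaging here are assumptions.

```python
import numpy as np


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB, averaged over the whole array."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(max_value**2 / mse)


# Toy example: a uniform error of 0.1 on pixels scaled to [0, 1].
reference = np.full((4, 16, 16, 3), 0.5)
reconstruction = reference + 0.1
value = psnr(reference, reconstruction)
```

A uniform error of 0.1 gives an MSE of 0.01 and therefore about 20 dB, so a PSNR near 29 corresponds to a noticeably smaller reconstruction error.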
TODOs:
Before submitting
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@xxx