feat: hpcai opensora 1.2 - VAE 3D training #621

SamitHuang · 2024-08-01T09:14:49Z

What does this PR do?

Fixes # (issue)

Adds # (feature)
VAE 3D training for hpcai opensora 1.2 including:

3-stage training with a different loss config for each stage
mixed video and image training

Evaluation was done on the UCF-101 dataset, resulting in a PSNR of 29.
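For reference, PSNR is conventionally derived from the mean squared error between the original and the reconstructed frames. The sketch below is a minimal pure-Python illustration, assuming pixel values in [0, 1]; the function name `psnr` and the sample values are hypothetical, and the evaluation script in this PR may compute the metric differently.

```python
import math


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical flattened pixels in [0, 1]; every pixel is off by 0.1.
original = [0.0, 0.5, 1.0, 0.25]
reconstructed = [0.1, 0.4, 0.9, 0.35]
print(f"PSNR: {psnr(original, reconstructed):.2f} dB")
```

A uniform error of 0.1 on a [0, 1] scale gives an MSE of about 0.01, so a PSNR near 29 corresponds to a noticeably smaller average error than in this toy input.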

TODOs:

Train with a random number of frames. Dynamic shape support will be added in the next PR.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines.
Did you build and run the code without any errors?
Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@xxx


CaitinZhao · 2024-08-09T01:36:13Z

examples/opensora_hpcai/opensora/models/vae/vae.py

@@ -51,7 +51,7 @@ def encode_with_moments_output(self, x):
        """For latent caching usage"""
        h = self.encoder(x)
        moments = self.quant_conv(h)
-        mean, logvar = self.split(moments, moments.shape[1] // 2, 1)
+        mean, logvar = mint.split(moments, moments.shape[1] // 2, 1)


Why must this be `mint`?
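For context, the changed line splits the `quant_conv` output along the channel axis into two halves, the mean and the log-variance. The sketch below mimics that with plain Python lists rather than MindSpore tensors. It assumes `mint.split` follows the chunk-size convention, where the second argument is the size of each chunk, as in `torch.split`. `split_by_size` is a hypothetical helper, not a MindSpore API.

```python
def split_by_size(channels, split_size):
    """Split a list into consecutive chunks of `split_size` items (last may be shorter)."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    return tuple(channels[i:i + split_size] for i in range(0, len(channels), split_size))


# Stand-in for the channel axis of `moments`: 8 channels = 4 mean + 4 logvar.
moments_channels = ["m0", "m1", "m2", "m3", "v0", "v1", "v2", "v3"]
mean, logvar = split_by_size(moments_channels, len(moments_channels) // 2)
print(mean)
print(logvar)
```

With a chunk size of half the channel count, the result is exactly two chunks, which is why the call can be unpacked into `mean, logvar`.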

hadipash · 2024-08-09T01:45:41Z

examples/opensora_hpcai/tools/convert_vae1.2.py

What's the difference between this and `tools/convert_vae_3d.py`?

Removed the redundant file.

CaitinZhao · 2024-08-26T11:15:18Z

examples/opensora_hpcai/configs/vae/train/stage3.yaml

+scheduler: "constant"
+use_ema: False
+
+output_path: "outputs/vae_stage2"


Suggested change

-output_path: "outputs/vae_stage2"
+output_path: "outputs/vae_stage3"
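A copy-paste slip like this one could be caught by a small consistency check between the config file name and its `output_path`. The following is only a sketch: `check_output_path` is a hypothetical helper, and it assumes each stage config keeps the `stageN` naming seen in this PR.

```python
import re


def check_output_path(config_name, config_text):
    """Return True if the stage number in the file name matches the one in output_path."""
    stage = re.search(r"stage(\d+)", config_name)
    path = re.search(r'^output_path:\s*"?([^"\n]+)"?', config_text, flags=re.MULTILINE)
    if stage is None or path is None:
        raise ValueError("could not find a stage number or output_path")
    path_stage = re.search(r"stage(\d+)", path.group(1))
    return path_stage is not None and path_stage.group(1) == stage.group(1)


# Config text as it appears in the diff, and with the suggested change applied.
before = 'scheduler: "constant"\nuse_ema: False\n\noutput_path: "outputs/vae_stage2"\n'
after = before.replace("vae_stage2", "vae_stage3")
print(check_output_path("stage3.yaml", before))
print(check_output_path("stage3.yaml", after))
```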

zhanghuiyao · 2024-08-26T11:51:29Z

examples/opensora_hpcai/README.md

+
+### Data Preprocess
+If you want to train your own VAE, you need to prepare data in a csv file following the [data processing](#data-processing) pipeline, then run the following commands.
+Note that you need to adjust the number of training epochs (`epochs`) in the config file according to the size of your own csv data.


Does this mean the epoch size is set during data processing? Could the repeat be applied at actual training time instead?
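The dependence of `epochs` on the csv size mentioned in the README text can be made concrete with a little arithmetic. This is a sketch under stated assumptions: the batch size is per device, the last partial batch is dropped, and all numbers are hypothetical rather than taken from this PR.

```python
import math


def epochs_for_target_steps(num_samples, batch_size, num_devices, target_steps):
    """Epochs needed to run at least `target_steps` optimizer steps."""
    global_batch = batch_size * num_devices
    # Assumes the last partial batch of each epoch is dropped.
    steps_per_epoch = num_samples // global_batch
    if steps_per_epoch == 0:
        raise ValueError("dataset is smaller than one global batch")
    return math.ceil(target_steps / steps_per_epoch)


# Hypothetical numbers: 1000 videos in the csv, batch size 1 on 8 devices.
epochs = epochs_for_target_steps(num_samples=1000, batch_size=1, num_devices=8, target_steps=10000)
print(epochs)
```

A smaller csv yields fewer steps per epoch, so the same step budget needs proportionally more epochs.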

zhanghuiyao · 2024-08-26T11:53:25Z

examples/opensora_hpcai/README.md

+| Model       | Context      | jit_level | Precision | BS | NPUs | Resolution(framesxHxW) | Train T. (s/step) |    PSNR   |   SSIM  |
+|:------------|:-------------|:--------|:---------:|:--:|:----:|:----------------------:|:-----------------:|:-----------------:|:-----------------:|
+| STDiT2-XL/2 | D910\*-[CANN C18(0705)](https://repo.mindspore.cn/ascend/ascend910/20240705/)-[MS2.3](https://www.mindspore.cn/install) |    O1  |    BF16   |  1 |  8   |       17x256x256      |       0.97        |    29.29      |    0.88    |


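1. If performance numbers for all 3 stages are available, please add them as well.
2. If a parallel strategy or data sink is used, I suggest adding that too.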
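When comparing stages, the step time in the table can be converted to throughput. The sketch below does that for the one row shown above, assuming the `BS` column is the per-device batch size; `frames_per_second` is a hypothetical helper.

```python
def frames_per_second(batch_size, num_devices, num_frames, seconds_per_step):
    """Frames processed per second across all devices, assuming a per-device batch size."""
    if seconds_per_step <= 0:
        raise ValueError("seconds_per_step must be positive")
    return batch_size * num_devices * num_frames / seconds_per_step


# Values from the table row: BS 1, 8 NPUs, 17 frames, 0.97 s/step.
fps = frames_per_second(batch_size=1, num_devices=8, num_frames=17, seconds_per_step=0.97)
print(f"{fps:.1f} frames/s")
```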

zhanghuiyao · 2024-08-26T11:55:18Z

examples/opensora_hpcai/README.md

+| STDiT2-XL/2 | D910\*-[CANN C18(0705)](https://repo.mindspore.cn/ascend/ascend910/20240705/)-[MS2.3](https://www.mindspore.cn/install) |    O1  |    BF16   |  1 |  8   |       17x256x256      |       0.97        |    29.29      |    0.88    |
+> Context: {G:GPU, D:Ascend}{chip type}-{mindspore version}.
+
+Note that we train with the mixed video and image strategy, i.e. `--mixed_strategy=mixed_video_image`, for stage 3 instead of a random number of frames (`mixed_video_random`). Random frame training will be supported in the future.


The `--mixed_strategy` parameter name feels somewhat unclear; it does not convey that it is a video/image sampling strategy.

Indeed. For now it is aligned with the parameter name used in the torch version.
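To illustrate what the two `--mixed_strategy` values select, here is a minimal sketch on a plain list of frames. It assumes that `mixed_video_image` occasionally reduces a clip to its first frame and that `mixed_video_random` keeps a random-length prefix; the `image_ratio` parameter, its default of 0.2, and the `select_frames` helper are hypothetical and not taken from this PR.

```python
import random


def select_frames(frames, strategy, image_ratio=0.2, rng=random):
    """Pick the leading frames of a clip according to an assumed mixing strategy."""
    if strategy == "mixed_video_image":
        # With probability `image_ratio`, treat the sample as a single image.
        return frames[:1] if rng.random() < image_ratio else frames
    if strategy == "mixed_video_random":
        # Keep a random-length prefix of the clip (this needs dynamic shapes).
        return frames[: rng.randint(1, len(frames))]
    raise ValueError(f"unknown strategy: {strategy}")


clip = list(range(17))  # stand-in for a 17-frame clip
rng = random.Random(0)
lengths = {len(select_frames(clip, "mixed_video_image", rng=rng)) for _ in range(100)}
print(sorted(lengths))
```

Under these assumptions `mixed_video_image` only ever produces two input shapes (1 frame or the full clip), whereas `mixed_video_random` can produce any length up to the clip size, which is consistent with the note that random frame training waits on dynamic shape support.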

zhanghuiyao · 2024-08-26T11:58:00Z

examples/opensora_hpcai/configs/vae/train/stage3.yaml

+csv_path: "../videocomposer/datasets/webvid5_copy.csv"
+video_folder: "../videocomposer/datasets/webvid5"


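Referencing the videocomposer (vc) directory one level up feels a bit odd; could the data be copied into the current directory?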

The video files are fairly large; this avoids increasing the size of the repo.

zhanghuiyao · 2024-08-26T12:00:01Z

examples/opensora_hpcai/requirements.txt

@@ -18,3 +18,4 @@ tokenizers
 sentencepiece
 transformers
 pyav
+mindcv


Should a specific version be pinned?

zhanghuiyao · 2024-08-26T12:00:46Z

examples/opensora_hpcai/run_train_vae_dynamic.sh

+# dynamic shape acceleration
+export MS_DEV_ENABLE_KERNEL_PACKET=on


What does this parameter do? Does it need to be enabled whenever dynamic shape is used?
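Whether the flag is required for every dynamic-shape run is not answered in this thread. If it is treated as an opt-in acceleration, a launcher could set it only for dynamic-shape runs, as in the sketch below; `build_train_env` is a hypothetical helper, and only the variable name and value come from the script in the diff.

```python
import os


def build_train_env(dynamic_shape, base_env=None):
    """Return a copy of the environment, adding the flag only for dynamic-shape runs."""
    env = dict(os.environ if base_env is None else base_env)
    if dynamic_shape:
        # Name and value taken from run_train_vae_dynamic.sh in this PR.
        env["MS_DEV_ENABLE_KERNEL_PACKET"] = "on"
    return env


env = build_train_env(dynamic_shape=True, base_env={"PATH": "/usr/bin"})
print(env["MS_DEV_ENABLE_KERNEL_PACKET"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the training script.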

zhanghuiyao · 2024-08-26T12:01:23Z

examples/opensora_hpcai/run_vae_recons.sh

+	# --ckpt_path models/OpenSora-VAE-v1.2/model.ckpt \
+    # --ckpt_path outputs/vae_stage2.ckpt \
+	# --device_target GPU \
+    # --crop_size 256 \
+	# --ckpt_path /home/mindocr/yx/mindone/examples/opensora_hpcai/models/v1.2/vae.ckpt \
+	# --ckpt_path /home/mindocr/yx/mindone/examples/opensora_hpcai/models/sd-vae-ft-ema.ckpt \


Suggest tidying up the indentation.

SamitHuang added 2 commits August 1, 2024 17:12

add vae3d training

07c32a8

rm redundant

9d1b10b

SamitHuang requested review from CaitinZhao and zhanghuiyao as code owners August 1, 2024 09:14

SamitHuang added 14 commits August 1, 2024 18:03

fix format

585111b

vae split tuple

a2af062

rewrite micro_frame_size impl, vae3d reconstruct ok

6d4d228

use make_tuple, reconstruct & vae static train ok

d404a39

better micro batch/frame size writing

38400aa

add dynamic shape script

c571585

fix empty ckpt

8c5ac03

fix empty ckpt

22f8075

debug: add dynamic shape support

1c8af61

update

18b5288

fix min(a,b) in dynamic shape

f0ab936

Merge branch 'pr_vae1.2_train' of https://github.com/samithuang/mindone…

86426e3

… into pr_vae1.2_train

default dynamic shape in script

54ac478

debug: update dyanmic shape train script

8b7c9c9

CaitinZhao reviewed Aug 9, 2024


hadipash reviewed Aug 9, 2024


SamitHuang added 2 commits August 24, 2024 12:24

Merge branch 'master' into pr_vae1.2_train

fac4272

download lpips auto

1f5aba1

SamitHuang requested a review from vigo999 as a code owner August 24, 2024 08:09

fix typo and linting

7a4bf32

SamitHuang force-pushed the pr_vae1.2_train branch from 52635a2 to 7a4bf32 Compare August 26, 2024 06:16

SamitHuang added 2 commits August 26, 2024 16:05

rm redundancy

88de7e2

small fix

a93bcb5

SamitHuang force-pushed the pr_vae1.2_train branch from 06aded8 to a93bcb5 Compare August 26, 2024 08:33

linting

2b5a31e

CaitinZhao reviewed Aug 26, 2024


CaitinZhao approved these changes Aug 26, 2024


zhanghuiyao approved these changes Aug 26, 2024


SamitHuang added 5 commits August 28, 2024 16:10

jit_level O0 for less overflow

a0b0dcc

fix docs

9979d8d

rm redundancy

6c00ed3

rm file

11b80cc

update doc

2f95e96

SamitHuang force-pushed the pr_vae1.2_train branch from c09dfb3 to 2f95e96 Compare September 26, 2024 08:14

vigo999 approved these changes Sep 26, 2024


SamitHuang added 2 commits September 27, 2024 11:29

update perf

65baffc

update config

ca06f32

SamitHuang force-pushed the pr_vae1.2_train branch from 6d06bfa to ca06f32 Compare September 27, 2024 04:14

SamitHuang added this pull request to the merge queue Sep 27, 2024

Merged via the queue into mindspore-lab:master with commit c40d733 Sep 27, 2024
3 checks passed
