example input videos #2
Our M3DDM can take input videos of any resolution and any target_ratio_list, but the output video's longest edge will be 256 pixels, because our training data was resized to 256x256. Also, the target_ratio_list must differ from the aspect ratio of the input video; otherwise there is nothing to outpaint. You can download videos from the YouTube-VOS dataset and use our script with a 1:1 target_ratio_list to extend them. Below is an example created from the YouTube-VOS video 13006c4c7e.mp4 with a 1:1 target_ratio_list.
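As a rough illustration of those two constraints, here is a minimal sketch (not code from this repository; the function name and rounding are assumptions) of how the output dimensions follow from the 256-pixel longest edge and a chosen target ratio:

```python
# Sketch only (not the repository's code): compute the output size implied
# by the 256-pixel longest-edge constraint and a target aspect ratio.

def output_size(in_w: int, in_h: int, target_ratio: float):
    """target_ratio is width / height, e.g. 1.0 for 1:1 or 16 / 9 for 16:9."""
    if abs(in_w / in_h - target_ratio) < 1e-6:
        raise ValueError("target ratio must differ from the input ratio")
    if target_ratio >= 1.0:
        # Landscape or square target: width is the longest edge.
        return 256, round(256 / target_ratio)
    # Portrait target: height is the longest edge.
    return round(256 * target_ratio), 256

print(output_size(640, 480, 1.0))     # 4:3 input to 1:1  -> (256, 256)
print(output_size(640, 480, 16 / 9))  # 4:3 input to 16:9 -> (256, 144)
```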
It's a very interesting paper. With some additional work, such as masking in post to remove the really annoying flashing and temporal inconsistency, color correction to match the original high-resolution video, and an upscale with a model like Real-ESRGAN or Topaz's products, I was able to composite an extension of an iconic 4:3 video to 16:9. I used the pre-trained weights provided in the README. It took very long on a 3090, though, about 10 hours for this video.
Bad.Apple.16.9.Ai.Lores.nitro.mp4
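In case it helps anyone, the final compositing step looked roughly like this (a sketch, not my exact pipeline; the file names are placeholders, and it assumes the original clip has already been scaled to the same height, frame rate, and frame count as the upscaled outpainted video):

```python
# Rough sketch of the compositing step only: paste the original 4:3 frames
# over the center of the upscaled 16:9 outpainted frames, so only the
# generated side regions come from the model. File names are placeholders.
import cv2

wide = cv2.VideoCapture("outpainted_16x9_upscaled.mp4")  # hypothetical name
orig = cv2.VideoCapture("original_4x3.mp4")              # hypothetical name

fps = wide.get(cv2.CAP_PROP_FPS)
w = int(wide.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(wide.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("composite.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok_w, frame_wide = wide.read()
    ok_o, frame_orig = orig.read()
    if not (ok_w and ok_o):
        break
    ow = frame_orig.shape[1]
    x = (w - ow) // 2                      # center the original horizontally
    frame_wide[:, x:x + ow] = frame_orig   # original region stays untouched
    out.write(frame_wide)

for stream in (wide, orig, out):
    stream.release()
```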
Wow, this is a really nice result! Thanks for your contribution. We are continuing to optimize our video outpainting model, and in the future we will support high-resolution output without the need for a subsequent super-resolution model. Regarding inference speed: you can manually remove the branch for global video frames, or use a stride of [15, 1] instead of [15, 5, 1] for a speed-up, though be aware that this may cost some quality. Alternatively, you could apply existing distillation algorithms to speed up our model.
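To put a back-of-the-envelope number on the stride suggestion, here is a sketch under stated assumptions (16-frame diffusion windows as in the paper, non-overlapping windows per level, coarser levels' frames used only as conditioning; this is not the repository's actual scheduling code):

```python
# Hedged estimate of how many denoising passes the coarse-to-fine schedule
# needs: each stride level slides a window over the frame positions at its
# stride, so dropping the middle [5] level removes that level's passes.
import math

WINDOW = 16  # frames per diffusion pass (the paper's 16-frame clips; an assumption here)

def denoising_passes(n_frames: int, strides: list[int]) -> int:
    return sum(math.ceil(math.ceil(n_frames / s) / WINDOW) for s in strides)

for strides in ([15, 5, 1], [15, 1]):
    print(strides, "->", denoising_passes(90, strides), "passes")
# For a 90-frame clip: [15, 5, 1] -> 9 passes, [15, 1] -> 7 passes
```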
Hello,
Thanks for the amazing work! I was wondering if you have some sample videos we could use with the inference script to play around with. I haven't had the chance to read through the code yet, but I'm guessing the model was trained at a certain aspect ratio and resolution. I'd love to be able to recreate some of the examples from the landing page.
Thanks! I'd appreciate any insights.