fix for video_frames with iterations_per_frame < 5
rkhamilton committed Nov 4, 2021
1 parent 4e9ac4a commit a317fe6
Showing 2 changed files with 20 additions and 13 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,10 @@
# v2.1.1
**Bug Fixes**
* generate.video_frames was not working for low iterations_per_frame. This is now corrected for non-zooming videos: as long as zoom_scale==1.0 and both shift_x and shift_y are 0, you can set iterations_per_frame as low as 1 and get the expected results.

**Known Issues**
There is still an issue with iterations_per_frame < ~5 when zoom_scale != 1.0, or when shift_x or shift_y is nonzero: it takes more iterations_per_frame than expected to see progress in the result. For the time being, use a higher iterations_per_frame when zooming or shifting than you would otherwise.
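
For reference, these settings map onto calls like the following (a sketch only: just the parameters named in this changelog are shown, and required arguments such as prompts, engine config, and output paths are omitted — see the project README for the full signature):

```python
from vqgan_clip import generate

# Non-zooming video: iterations_per_frame can now be as low as 1.
generate.video_frames(num_video_frames=150,
                      iterations_per_frame=1,
                      zoom_scale=1.0,   # no zoom
                      shift_x=0,        # no shift
                      shift_y=0)

# Zooming / shifting video: keep iterations_per_frame higher (> ~5)
# until the known issue above is resolved.
generate.video_frames(num_video_frames=150,
                      iterations_per_frame=15,
                      zoom_scale=1.02,
                      shift_x=1,
                      shift_y=0)
```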

# v2.1.0
This release adds support for multiple export filetypes in addition to PNG. Exports to JPG or PNG have metadata embedded that describes the media generation settings. PNG files already stored metadata in PNG data chunks in earlier releases; JPG files, available in 2.1, store metadata in the exif fields XPTitle and XPComment. Other export filetypes are supported for still images, provided they are [types supported by Pillow](https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html).
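
The embedded settings can be read back with Pillow. A minimal sketch, assuming hypothetical filenames; note that depending on the Pillow version, XP* exif values may come back as bytes or as a tuple of ints:

```python
from PIL import Image
from PIL.ExifTags import TAGS

# PNG: settings are stored in text data chunks, exposed as a dict.
png = Image.open('my_generation.png')   # hypothetical filename
print(png.text)

# JPG: settings are stored in the XPTitle and XPComment exif fields.
jpg = Image.open('my_generation.jpg')   # hypothetical filename
exif = jpg.getexif()
for tag_id in (0x9C9B, 0x9C9C):         # standard exif IDs for XPTitle, XPComment
    raw = exif.get(tag_id)
    if raw is not None:
        data = raw if isinstance(raw, bytes) else bytes(raw)
        # XP* fields are UTF-16LE encoded, null-terminated strings.
        print(TAGS.get(tag_id, tag_id), data.decode('utf-16-le').rstrip('\x00'))
```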

26 changes: 13 additions & 13 deletions src/vqgan_clip/generate.py
@@ -338,19 +338,19 @@ def video_frames(num_video_frames,
         eng.encode_and_append_prompts(current_prompt_number, parsed_text_prompts, parsed_image_prompts, parsed_noise_prompts)

         # Zoom / shift the generated image
-        pil_image = TF.to_pil_image(eng.output_tensor[0].cpu())
-        if zoom_scale != 1.0:
-            new_pil_image = VF.zoom_at(pil_image, output_image_size_x/2, output_image_size_y/2, zoom_scale)
-        else:
-            new_pil_image = pil_image
-
-        if shift_x or shift_y:
-            new_pil_image = ImageChops.offset(new_pil_image, shift_x, shift_y)
-
-        # Re-encode and use this as the new initial image for the next iteration
-        eng.convert_image_to_init_image(new_pil_image)
-
-        eng.configure_optimizer()
+        if zoom_scale != 1.0 or shift_x or shift_y:
+            pil_image = TF.to_pil_image(eng.output_tensor[0].cpu())
+            if zoom_scale != 1.0:
+                new_pil_image = VF.zoom_at(pil_image, output_image_size_x/2, output_image_size_y/2, zoom_scale)
+            else:
+                new_pil_image = pil_image
+
+            if shift_x or shift_y:
+                new_pil_image = ImageChops.offset(new_pil_image, shift_x, shift_y)
+            # Re-encode and use this as the new initial image for the next iteration
+            eng.convert_image_to_init_image(new_pil_image)
+            eng.configure_optimizer()

         if verbose:
             # display some statistics about how the GAN training is going whenever we save an interim image
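
Reduced to its control flow, the fix only pays the decode / re-encode / optimizer-reset round trip when a frame is actually transformed; otherwise the engine keeps training the same latent, which is why iterations_per_frame=1 now makes visible progress. A minimal sketch with a stub engine (not the project's real classes):

```python
class StubEngine:
    """Stands in for the VQGAN engine; only the two calls used by the fix."""
    def convert_image_to_init_image(self, pil_image):
        print('re-encoded frame as new init image')
    def configure_optimizer(self):
        print('optimizer rebuilt from scratch')

def end_of_frame(eng, zoom_scale=1.0, shift_x=0, shift_y=0):
    # Mirrors the patched logic: skip the round trip entirely when no
    # zoom or shift was requested for this frame.
    if zoom_scale != 1.0 or shift_x or shift_y:
        transformed_frame = None  # stands in for the zoomed/shifted PIL image
        eng.convert_image_to_init_image(transformed_frame)
        eng.configure_optimizer()

end_of_frame(StubEngine())                   # prints nothing: latent untouched
end_of_frame(StubEngine(), zoom_scale=1.02)  # triggers re-encode + reset
```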
