about the input/output size #8

Danee-wawawa · 2023-05-12T06:45:18Z

Hi, thank you for your work.
Do the input size and output size support other sizes, such as 640 x 512 or 512 x 512? If so, where does the code need to be modified?
Looking forward to your answer.

usert5432 · 2023-05-14T20:12:33Z

Hi @Danee-wawawa,

Do the input size and output size support other sizes, such as 640 x 512 or 512 x 512?

This depends on several factors. In the simplest case, where your data consists of images and you want to translate between images of the same size (e.g. 512 x 512 -> 512 x 512), uvcgan2 supports this out of the box.

If so, where does the code need to be modified?

In the case described above, one needs to modify the data configuration of the training script. Taking the male2female script as an example:

uvcgan2/scripts/celeba_hq/train_m2f_translation.py

Lines 63 to 76 in 8f4b1cb

    'datasets' : [
        {
            'dataset' : {
                'name'   : 'image-domain-hierarchy',
                'domain' : domain,
                'path'   : 'celeba_hq_resized_lanczos',
            },
            'shape'           : (3, 256, 256),
            'transform_train' : [
                'random-flip-horizontal',
            ],
            'transform_test' : None,
        } for domain in [ 'male', 'female' ]
    ],

One would need to modify the shape parameter to match the desired shape, e.g. 'shape' : (3, 512, 512).
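As a rough illustration of the change described above, the sketch below builds the same kind of 'datasets' list with the shape passed in as a parameter. The helper name `make_datasets_config`, its arguments, and the placeholder path are hypothetical and not part of uvcgan2; only the dictionary keys are taken from the script excerpt.

```python
def make_datasets_config(shape, path, domains):
    """Build a 'datasets' list shaped like the excerpt above.

    NOTE: hypothetical helper, not part of uvcgan2. It only mirrors
    the dictionary layout shown in train_m2f_translation.py.
    """
    return [
        {
            'dataset' : {
                'name'   : 'image-domain-hierarchy',
                'domain' : domain,
                'path'   : path,
            },
            # The only value that changes when the image size changes.
            'shape'           : shape,
            'transform_train' : [ 'random-flip-horizontal' ],
            'transform_test'  : None,
        } for domain in domains
    ]

datasets = make_datasets_config(
    shape   = (3, 512, 512),
    path    = 'path/to/your/dataset',   # placeholder path (assumption)
    domains = [ 'male', 'female' ],
)
```

Each entry of the resulting list carries the same shape, one entry per domain.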

If your case is more complicated, more modifications may be required. Please let me know if you have further questions.

Danee-wawawa · 2023-05-15T02:50:53Z

Thank you for your reply. Now 512 x 512 -> 512 x 512 works. I want to try 640 x 512 -> 640 x 512. Does this situation require modifying the network structure?

usert5432 · 2023-05-15T16:02:13Z

I want to try 640 x 512 -> 640 x 512. Does this situation require modifying the network structure?

No, you do not need to modify the network structure for 640 x 512 images (modifying the shape parameter should be enough). In general, as long as your image dimensions are divisible by 16, you can use the default network structure.

With that said, it may be helpful to tune the network structure a bit to achieve the best performance, but it is not necessary.
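The divisibility rule mentioned above can be checked before starting a training run. This is a minimal sketch: the function name `is_shape_supported` is hypothetical, and it assumes the shape tuple has three entries with the two spatial dimensions last. Only the divisor of 16 comes from this thread.

```python
def is_shape_supported(shape, divisor = 16):
    """Return True if both spatial dimensions are multiples of `divisor`.

    Assumes `shape` has the form (channels, dim1, dim2); the order of
    the two spatial dimensions does not matter for this check.
    """
    _channels, dim1, dim2 = shape
    return (dim1 % divisor == 0) and (dim2 % divisor == 0)

print(is_shape_supported((3, 512, 640)))   # True
print(is_shape_supported((3, 500, 640)))   # False
```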

Danee-wawawa · 2023-05-19T09:02:50Z

OK, thank you~~

Pudding-0503 · 2023-06-08T12:25:55Z

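Hi @usert5432,

One would need to modify shape parameter to match the desired shape, e.g. 'shape' : (3, 512, 512).

If your case is more complicated, more modifications may be required. Please let me know if you have further questions.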

Sorry to bother you, but I also have a question about image size. If my image shape is (3, 512, 512), then

  'shape' : (3, 256, 256),

is changed to

  'shape' : (3, 512, 512),

What about the following lines:

  'transform_train' : [
    { 'name' : 'resize', 'size' : 286, },
    { 'name' : 'random-crop', 'size' : 256, },

Do they need to be changed accordingly to

  'transform_train' : [
    { 'name' : 'resize', 'size' : 512, },
    { 'name' : 'random-crop', 'size' : 256, },

or to some other numbers?

usert5432 · 2023-06-08T21:41:24Z

Hi @Pudding-0503,

The data transformations depend heavily on the dataset that you have. For instance, if you have a large dataset (>= 5k images) and the objects that you want to translate have approximately the same size, then perhaps you do not need to apply any transformations at all (or can limit them to just a random horizontal flip).

In general, I would suggest starting with only the random horizontal flip transformation, e.g.

                'transform_train' : [
                    'random-flip-horizontal',
                ],

Then see how the translation works. If it does not work, adjust the network hyperparameters (as described in the README). If it still does not work, add new transformations.

Pudding-0503 · 2023-06-09T00:42:23Z

OK! I got it, thank you very much~~~

Danee-wawawa · 2023-07-14T02:09:10Z

Hi, I want to try 640 x 512 -> 640 x 512. I modified the data configuration of the training script as follows:

But I get the following error:

How can I solve this problem?
Looking forward to your answer.

usert5432 · 2023-07-14T21:22:54Z

Hi @Danee-wawawa. Can you try switching the order of dimensions? That is, setting shape = (3, 512, 640) instead of (3, 640, 512).
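The suggestion above implies that the shape is ordered as (channels, height, width), so an image that is 640 pixels wide and 512 pixels high maps to (3, 512, 640). A minimal sketch of that conversion follows; the function name `shape_from_width_height` is hypothetical, and the (channels, height, width) ordering is an assumption inferred from this exchange.

```python
def shape_from_width_height(width, height, channels = 3):
    """Convert a 'width x height' image size into a shape tuple.

    Assumption: the config expects (channels, height, width), so the
    height comes before the width.
    """
    return (channels, height, width)

shape = shape_from_width_height(width = 640, height = 512)
print(shape)   # (3, 512, 640)
```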

Danee-wawawa · 2023-07-20T02:59:02Z

It is OK, thank you~

sophiatmu · 2023-12-14T13:11:52Z

Hi, I want to try 1460 x 1080 -> 1460 x 1080, but I got this:

How can I solve this problem?

Thank you for your reply

usert5432 · 2023-12-16T18:37:55Z

Hi @sophiatmu,

Unfortunately, I do not think uvcgan will work on images of size 1460 x 1080. It expects image dimensions to be divisible by 16, but neither 1460 nor 1080 is divisible by 16.
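For reference, the arithmetic behind this can be sketched as below, listing the closest multiples of 16 below and above a given dimension. The function name `nearest_multiples` is hypothetical. Whether cropping, padding, or resizing to such a size suits a particular dataset is not discussed in this thread.

```python
def nearest_multiples(size, divisor = 16):
    """Return the closest multiples of `divisor` at or below and at or above `size`."""
    lower = (size // divisor) * divisor
    upper = lower if lower == size else lower + divisor
    return (lower, upper)

for size in (1460, 1080):
    print(size, nearest_multiples(size))
# 1460 (1456, 1472)
# 1080 (1072, 1088)
```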

create-li · 2025-01-15T12:14:00Z

Hi @Danee-wawawa. Can you try switching the order of dimensions? That is, setting shape = (3, 512, 640) instead of (3, 640, 512).

Excuse me, is it true that higher resolution results in better image translation and higher metrics?

usert5432 · 2025-01-15T23:55:06Z

Hello @create-li

Excuse me, is it true that higher resolution results in better image translation and higher metrics?

Unfortunately, we have not tried training on higher-resolution images, so I am not sure. Perhaps somebody in this thread could provide feedback on higher-resolution training.

usert5432 self-assigned this Dec 16, 2023

usert5432 added the question Further information is requested label Dec 16, 2023
