My result is not good #39

Open
Berry-Wu opened this issue Mar 27, 2023 · 9 comments

Berry-Wu commented Mar 27, 2023

Hi, thanks for your great work.
I trained the model with the script 'python train.py --gpu 0-1 --backbone LPSKI' on the Human3.6M and MPII datasets, using two GTX 1080Ti GPUs.
My config.py is as follows:

    backbone = 'LPSKI'
    
    trainset_3d = ['Human36M']
    trainset_2d = ['MPII']
    testset = 'Human36M'

    input_shape = (256, 256) 
    output_shape = (input_shape[0]//8, input_shape[1]//8)
    width_multiplier = 1.0
    depth_dim = 32
    bbox_3d_shape = (2000, 2000, 2000) # depth, height, width
    pixel_mean = (0.485, 0.456, 0.406)
    pixel_std = (0.229, 0.224, 0.225)

    ## training config
    embedding_size = 2048
    lr_dec_epoch = [17, 21]
    end_epoch = 25
    lr = 1e-3
    lr_dec_factor = 10
    batch_size = 64

    ## testing config
    test_batch_size = 32
    flip_test = True
    use_gt_info = True

    ## others
    num_thread = 12 #20
    gpu_ids = '0,1'
    num_gpus = 2
    continue_train = True

And I tested the model with test.py:

python main/test.py --gpu 0-1 --test_epoch 24-24 --backbone LPSKI

And the result is like below:

Protocol 2 error (MPJPE) >> tot: 65.84
Directions: 58.14 Discussion: 66.00 Eating: 58.89 Greeting: 60.22 Phoning: 66.14 Posing: 58.50 Purchases: 57.18 Sitting: 79.16 SittingDown: 90.43 Smoking: 66.33 Photo: 74.44 Waiting: 63.51 Walking: 52.24 WalkDog: 67.90 WalkTogether: 59.40 

I wonder if it is a setup issue. During training, the loss changed very little from about epoch 13 to epoch 24. The log is here.
Lastly, I want to know where I can find your trained model? :) Looking forward to your reply!

@Berry-Wu Berry-Wu changed the title My result is so bad My result is not good Mar 27, 2023
Berry-Wu (Author) commented:

By the way, the parameter count of the model calculated by torchsummary is inconsistent with the paper:

embedding_size = 2048, width_mult=1 : Total params: 3.50M

In your paper, the parameter count under the same configuration is 4.07M.
I want to know where the difference comes from. Are there any other special settings?
Looking forward to your reply!


Berry-Wu commented Mar 27, 2023

After reviewing the other issues, I found the following differences between my config.py above and yours:

 output_shape: (input_shape[0]//8, input_shape[1]//8) --> (input_shape[0]//4, input_shape[1]//4)
 depth_dim = 32 --> 64

And in ski_cncat.py:

inverted_residual_setting = [
                # t, c, n, s
                [1, 64, 1, 1],  #[-1, 48, 256, 256] # from  [1, 64, 1, 2] ->  [1, 64, 1, 1]
                [6, 48, 2, 2],  #[-1, 48, 128, 128]
                [6, 48, 3, 2],  #[-1, 48, 64, 64]
                [6, 64, 4, 2],  #[-1, 64, 32, 32]
                [6, 96, 3, 2],  #[-1, 96, 16, 16]
                [6, 160, 3, 1], #[-1, 160, 8, 8]
                [6, 320, 1, 1], #[-1, 320, 8, 8]
            ]
And, later in the same file, the head's output channels:

    out_channels = joint_num * cfg.depth_dim,  # from joint_num * 32 --> joint_num * cfg.depth_dim

Now I find it matches the original figure in the paper:

output shape is (64, 64, 1152)
64 --> input_shape[0] // 4
1152 --> num_keypoints * depth_dim = 18 * 64
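As a quick sanity check, the corrected values reproduce the heatmap shape from the paper's figure. This is only an illustrative sketch, not the exact code in config.py; joint_num = 18 is the keypoint count used in this thread:

```python
# Sanity check: the corrected config values reproduce the heatmap shape
# shown in the paper's figure (64 x 64 x 1152).

input_shape = (256, 256)
output_shape = (input_shape[0] // 4, input_shape[1] // 4)  # corrected from //8
depth_dim = 64                                             # corrected from 32
joint_num = 18                                             # number of keypoints

out_channels = joint_num * depth_dim

print(output_shape, out_channels)  # (64, 64) 1152
```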

So I will check the result after retraining.
It is recommended that you update the config.py code on GitHub :)
Lastly, I still have the problem with the loss described above. Looking forward to your reply!

SangbumChoi (Owner) commented Mar 27, 2023

@Berry-Wu That is true; at the time I wrote this code, I overwrote all the config files (dumb mistake). Let me know the result!


Berry-Wu commented Mar 29, 2023

@SangbumChoi I have finished training on 2 GTX 1080Ti; it took about 30 hours. After changing the config, GPU memory usage was much higher, so I changed the batch_size to 32.
I tested the model on epochs 23 and 24 like this:

python main/test.py --gpu 0-1 --test_epoch 23-24 --backbone LPSKI

The log is here. And the result is below:

>>> Using GPU: 0,1
Load data of H36M Protocol 2
creating index...
index created!
Get bounding box and root from groundtruth
============================================================
LPSKI BackBone Generated
============================================================
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 136/136 [01:54<00:00,  1.18it/s]
Evaluation start...
Protocol 2 error (MPJPE) >> tot: 60.26
Directions: 55.26 Discussion: 60.47 Eating: 53.43 Greeting: 57.23 Phoning: 60.88 Posing: 52.77 Purchases: 55.38 Sitting: 73.62 SittingDown: 80.20 Smoking: 59.51 Photo: 66.54 Waiting: 56.99 Walking: 47.51 WalkDog: 62.86 WalkTogether: 54.66 
Test result is saved at /home/data3_4t/wzy/codes/MobileHumanPose/main/../output/result/bbox_root_pose_human36m_output.json
03-29 15:33:15 Protocol 2 error (MPJPE) >> tot: 60.26
Directions: 55.26 Discussion: 60.47 Eating: 53.43 Greeting: 57.23 Phoning: 60.88 Posing: 52.77 Purchases: 55.38 Sitting: 73.62 SittingDown: 80.20 Smoking: 59.51 Photo: 66.54 Waiting: 56.99 Walking: 47.51 WalkDog: 62.86 WalkTogether: 54.66 
============================================================
LPSKI BackBone Generated
============================================================
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 136/136 [00:53<00:00,  2.55it/s]
Evaluation start...
Protocol 2 error (MPJPE) >> tot: 60.51
Directions: 55.71 Discussion: 60.89 Eating: 53.79 Greeting: 57.65 Phoning: 61.38 Posing: 53.04 Purchases: 55.66 Sitting: 74.17 SittingDown: 79.73 Smoking: 59.67 Photo: 66.65 Waiting: 56.86 Walking: 47.60 WalkDog: 63.12 WalkTogether: 54.74 
Test result is saved at /home/data3_4t/wzy/codes/MobileHumanPose/main/../output/result/bbox_root_pose_human36m_output.json
03-29 15:34:11 Protocol 2 error (MPJPE) >> tot: 60.51
Directions: 55.71 Discussion: 60.89 Eating: 53.79 Greeting: 57.65 Phoning: 61.38 Posing: 53.04 Purchases: 55.66 Sitting: 74.17 SittingDown: 79.73 Smoking: 59.67 Photo: 66.65 Waiting: 56.86 Walking: 47.60 WalkDog: 63.12 WalkTogether: 54.74 

As you can see, the result on Protocol 2 is about 60.51mm.
In your paper, the result of the large model is 51.4mm.
I don't know how to close the gap :(

By the way, the parameter count calculated by torchsummary is 3.64M, while in your paper it is 4.07M. I don't know where this gap comes from either.
Could you help me? Looking forward to your reply! :)
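As a rough cross-check on these numbers, one can hand-estimate the parameters of the inverted-residual backbone from the (t, c, n, s) settings quoted earlier. This is only a sketch under stated assumptions: it ignores batch norm, biases, the stem, and the heatmap head; the helper functions and the starting channel count (c_in = 64) are my own illustrative choices, not code from the repo:

```python
# Rough parameter estimate for MobileNetV2-style inverted residual blocks.
# Ignores batch-norm parameters, biases, and the stem/head layers, so it
# is an order-of-magnitude sanity check only, not the exact model.

def inverted_residual_params(c_in, t, c_out):
    """Params of one block: 1x1 expand, 3x3 depthwise, 1x1 project."""
    hidden = t * c_in
    expand = c_in * hidden if t != 1 else 0   # expand conv is skipped when t == 1
    depthwise = hidden * 9                    # 3x3 depthwise conv
    project = hidden * c_out
    return expand + depthwise + project

def backbone_params(settings, c_in):
    total = 0
    for t, c, n, s in settings:
        for _ in range(n):
            total += inverted_residual_params(c_in, t, c)
            c_in = c
    return total

settings = [
    # t, c, n, s (copied from the inverted_residual_setting quoted above)
    [1, 64, 1, 1],
    [6, 48, 2, 2],
    [6, 48, 3, 2],
    [6, 64, 4, 2],
    [6, 96, 3, 2],
    [6, 160, 3, 1],
    [6, 320, 1, 1],
]

# Blocks-only estimate; stem, head, and batch norm account for the rest
# of the ~3.5M total reported by torchsummary.
print(backbone_params(settings, c_in=64))
```

The estimate lands around 1.9M for the blocks alone, which is consistent with a total in the 3-4M range once the stem and heatmap head are added, so small definition differences (e.g. head width, embedding_size usage) could plausibly explain the 3.64M vs 4.07M gap.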

SonNguyen2510 commented:
@Berry-Wu Do you have any update on this? I really want to know whether you can reproduce the result in the paper or not; I cannot match the settings described in the paper.

Berry-Wu (Author) commented:

@SonNguyen2510 Sorry, I didn't reproduce the result of the paper. My result is above, which still has a gap from the paper. After several modifications, I think my config is consistent with the original paper; you can refer to the config above. I hope it helps you! :)
Besides, the author provides pretrained models there; you can test with them. I haven't done so.
https://drive.google.com/drive/folders/146ZFPZyFyRQejB8CBYZ_R26NEXEO4EjI?usp=share_link

SonNguyen2510 commented Apr 11, 2023

@Berry-Wu Thank you for your reply. In order to test the pretrained model, I think I need to match its configuration. Do you know the config of that model? Is it the config above? Thanks again.

Berry-Wu (Author) commented:

@SonNguyen2510 Sorry, I don't know. :( You can refer to this issue: #30
It seems that the author just uploaded random .pth files.

SonNguyen2510 commented:
@Berry-Wu it's ok, thank you anyway :)
