Unable to perceive improvement. #17
Comments
Hey, a better way to compare SPECTRE with another method would be to render the output 3D mesh in a video and compare the two visually. Also, a final note: in some cases the lipread loss will even exaggerate the mouth a bit (e.g. add more protrusion and roundedness than is actually visible) in order to better capture the perception of speech, which will result in even worse landmark placement compared to other methods.
Hi @daikankan, can you please explain how you are pasting the rendered avatar back into the video?
Just OpenCV: circle, rectangle, and putText.
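For anyone looking for a concrete starting point, here is a minimal sketch of that kind of overlay using the OpenCV primitives mentioned above. The helper name, the `box` layout, and the label are assumptions for illustration, not code from this thread:

```python
import cv2

# Hypothetical helper: paste a rendered avatar crop back into the original
# frame, then annotate it with the primitives mentioned above
# (rectangle, circle, putText).
def paste_rendering(frame, rendering, box, label="SPECTRE"):
    x, y, w, h = box  # assumed layout: (x, y, width, height) of the face crop
    # Resize the rendering to the crop size and write it into the frame.
    frame[y:y + h, x:x + w] = cv2.resize(rendering, (w, h))
    # Annotations: box around the paste region, marker at its center, label.
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.circle(frame, (x + w // 2, y + h // 2), 3, (0, 0, 255), -1)
    cv2.putText(frame, label, (x, y - 8),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return frame
```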
Thanks for sharing this work; it offers good insight and is inspiring.
However, I'm unable to perceive an improvement from the pretrained model.
My inference with E_expression:
- For images: inputs are the same image concatenated 5 times, shape (1, 5, 224, 224); output is FLAME parameters of shape (5, 53), from which I take the center one, output[2, :].
- For videos: inputs are the same frame concatenated 5 times, shape (1, 5, 224, 224); output is FLAME parameters (5, 53), again taking the center one, output[2, :]. Or should I instead concatenate 5 continuous frames as the E_expression input? (See the sketch after this list.)
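A minimal sketch of the windowed inference described above. The `E_expression` call, the preprocessing, and the exact tensor layout (the (1, 5, 224, 224) shape quoted above presumably omits the channel dimension) are assumptions based on this description, not verified against the repo:

```python
import torch

def infer_center_params(E_expression, frames, num_frames=5):
    """Stack `num_frames` crops into one temporal window and return the
    FLAME parameters predicted for the center frame.

    frames: list of (3, 224, 224) float tensors. Pass the same crop 5
    times to reproduce the single-image setup above, or 5 consecutive
    video frames for the continuous-window alternative.
    """
    window = torch.stack(frames).unsqueeze(0)  # (1, 5, 3, 224, 224)
    with torch.no_grad():
        params = E_expression(window)          # (5, 53) per the comment
    return params[num_frames // 2]             # center frame: params[2, :]

# Usage under these assumptions:
#   infer_center_params(E_expression, [crop] * 5)            # single image
#   infer_center_params(E_expression, video_crops[i:i + 5])  # 5 continuous frames
```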
The mean_shape I use for alignment is consistent with the author's.
Comparison of the results (talking-head videos and single-image reconstruction) between E_flame_without_E_expression and E_flame_with_E_expression:
E_flame_without_E_expression:
talkinghead_E_flame_without_E_expression.mp4
E_flame_with_E_expression:
talkinghead_E_flame_with_E_expression.mp4
Sorry, my test may not be sufficient, and my preprocessing may not be accurate.