Live2D with Lipsync (using audio file/link) #122

Open
RaSan147 wants to merge 47 commits into base: master

Conversation

RaSan147 (Author):

Solving issues mentioned in #117

guansss (Owner) left a comment:


Thanks again for the PR! I think we are getting close but some changes are still needed as described in the comments.

I noticed that some of the code is not properly linted. After making changes to the code, please run npm run lint:fix to automatically fix the linting errors, and address any remaining errors manually (except for the triple-slash reference errors, which I will fix later).

After you finish these changes, I'll add some tests to make sure this feature works as expected.

Review threads on src/cubism-common/MotionManager.ts (×2) and src/cubism-common/SoundManager.ts (outdated, resolved).
@@ -248,6 +257,11 @@ export class Cubism4InternalModel extends InternalModel {
this.coreModel.addParameterValueById(this.idParamBodyAngleX, this.focusController.x * 10); // -10 ~ 10
}


updateFacialEmotion(mouthForm: number) {
guansss (Owner) commented:

Would a name like setMouthForm be better? It's not changing the entire facial expression, only the mouth form. Also, update implies that this function does some computation beyond setting the value, so set is more suitable here.

As a new API, this method should also be added to Cubism2InternalModel for consistency.
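A minimal sketch of what such a method could look like on both internal models (a hedged illustration, not the final API; the parameter IDs are the conventional Cubism defaults):

// Cubism4InternalModel (sketch)
setMouthForm(value: number) {
  this.coreModel.setParameterValueById('ParamMouthForm', value);
}

// Cubism2InternalModel (sketch)
setMouthForm(value: number) {
  this.coreModel.setParamFloat('PARAM_MOUTH_FORM', value);
}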

RaSan147 (Author) replied:

I'll test and run it on Cubism 2. (The issue is that I tried and failed to set up the development environment on my local system; the GitHub Action worked fine even though the Codespace failed. I know, skill issue on my part.) So I probably won't be able to run the npm lint, but I'll try.

guansss (Owner) replied:

The development guide in DEVELOPMENT.md was a bit messy, so I've rewritten it. Now there shouldn't be any problems if you follow the steps (if there are, please let me know!).

It's not your issue, but Codespaces is problematic with submodules, browser testing, etc., so it's better to run it locally.

RaSan147 (Author) replied:

Thanks a lot 😭

guansss (Owner) replied:

I decided to remove this method because setting this param is pretty straightforward and isn't really worth adding a method for it.
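For reference, a caller can set the parameter directly on the core model in the same way the later snippets in this thread do, e.g. (assuming the standard Cubism 4 parameter ID):

model.internalModel.coreModel.setParameterValueById('ParamMouthForm', 0.5);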

RaSan147 and others added 7 commits on December 14, 2023 (co-authored by Guan <46285865+guansss@users.noreply.github.com>), including: also remove cache buster and autoplay
guansss (Owner) commented Apr 15, 2024:

Finally it's ready to merge! Before I merge it, are there any changes you would like to make or suggest?

RaSan147 (Author) commented:

Sorry, I didn't notice; give me a bit of time, testing...

RaSan147 (Author) commented:

By the way, can you please check the PR I've sent you on the cubism folder repo? That should fix the process not found error (or you may tweak the results a bit).

RaSan147 (Author) commented:

Vite requires terser in this version (my fresh install was not working without it), so kindly add it to the deps. I fixed it with npm install terser.

RaSan147 (Author) commented:

Add another option, force or priority: if force is set or the new audio has a higher priority, the current audio will be stopped; otherwise the current one keeps playing.
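A rough sketch of how that check could work (names and shapes here are placeholders, not the final API):

// hypothetical option bag and interrupt check for playing new audio
interface PlaySoundOptions {
  force?: boolean;    // always stop whatever is currently playing
  priority?: number;  // only interrupt if higher than the current audio's priority
}

function shouldInterrupt(current: { priority: number } | undefined, options: PlaySoundOptions): boolean {
  if (!current) return true;                          // nothing playing, go ahead
  if (options.force) return true;                     // forced playback always wins
  return (options.priority ?? 0) > current.priority;  // otherwise compare priorities
}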

RaSan147 (Author) commented:

Will add onFinish and onError callbacks as options.
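A hedged usage sketch of what those callbacks could look like from the caller's side (option names follow this discussion and the speak entry point added in this PR; the final signature may differ):

model.speak('sound/greeting.mp3', {
  volume: 1,
  onFinish: () => console.log('audio finished'),
  onError: (err) => console.error('audio failed', err),
});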

RaSan147 (Author) commented:

Well, I'm going to miss motion(...., {sound}). It was a great option (since it's an optional feature, removing it feels like a bad idea) that would help retain a certain posture and motion while speaking. The expression option and many other things are also missing from the PR version...
😥
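For reference, the call being discussed looked roughly like this in the fork (argument and option names are illustrative and not guaranteed to match the merged API):

model.motion('tap_body', 0, 3, {
  sound: 'sound/hello.mp3',  // play this audio and lipsync while the motion runs
  volume: 1,
  expression: 4,             // optionally switch expression for the duration
});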

liyao1520 commented:

Looking forward to the update.

RaSan147 (Author) commented May 8, 2024:

Gotta re-test and look for a compatible way to shift from the patch to the official version.

liyao1520 commented:

#150

The demonstration video of the model on the Live2D official website can flexibly display mouth movements, and the lip-syncing looks quite natural. (Demo Video)

In this example model, not only can the mouth opening be set based on audio information, but vowel mouth shapes can also be set by adjusting 'ParamA', 'ParamE', 'ParamI', 'ParamO', 'ParamU'.

model.internalModel.coreModel.setParameterValueById('ParamMouthOpenY', mouthY)
model.internalModel.coreModel.setParameterValueById('ParamA', 0.3)

I feel there might be better methods to achieve lip-syncing. Can the model's mouth shape be set to correspond to the audio?

Also, Alibaba Cloud's TTS can output the time position of each Chinese character/English word in the audio. How can the model play the audio, and can it set the corresponding mouth shape based on the phonetic information?

Seeking guidance from the experts! 🙏
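Not an answer from the maintainers, just a sketch of the idea: if the TTS returns per-word or per-phoneme timestamps, the vowel parameters from the snippet above can be scheduled frame by frame (the timeline format here is an assumption):

// assumed timeline: [{ vowel: 'A' | 'E' | 'I' | 'O' | 'U', start: seconds, end: seconds }, ...]
function applyMouthShape(model, timeline, currentTime) {
  const vowels = ['ParamA', 'ParamE', 'ParamI', 'ParamO', 'ParamU'];
  // clear all vowel params, then raise the one active at currentTime
  for (const id of vowels) model.internalModel.coreModel.setParameterValueById(id, 0);
  const active = timeline.find((seg) => currentTime >= seg.start && currentTime < seg.end);
  if (active) {
    model.internalModel.coreModel.setParameterValueById('Param' + active.vowel, 1);
  }
}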

RaSan147 (Author) commented:

Finally it's ready to merge! Before I merge it, are there any changes you would like to make or suggest?

Whenever you're ready. Thanks for all your hard work.

mrskiro left a comment:

+1

xiaoqiang1999 commented:

@guansss Boss, please merge it soon; looking forward to the package release ❤️

t41372 commented Jul 27, 2024:

Eagerly awaiting merge!

tegnike commented Dec 28, 2024:

@guansss No merge yet??

cacard commented Dec 30, 2024:

Does this support the official MotionSync?

RaSan147 (Author) commented:

Does this support the official MotionSync?

Nope, it's not the official motion sync (like the official Live2D module); we just used the voice signal amplitude to get the approximate lip position. Also, guansss sensei made some internal changes to this code (I'm too much of a noob to understand them all) that made it really good. But it seems sensei is a bit busy.
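For the curious, the rough idea looks like this (a simplified sketch using the standard Web Audio API, not the exact code in this PR; model is assumed to be a loaded Live2DModel):

const ctx = new AudioContext();
const audio = new Audio('voice.mp3');
const source = ctx.createMediaElementSource(audio);
const analyser = ctx.createAnalyser();
source.connect(analyser);
analyser.connect(ctx.destination);

const data = new Float32Array(analyser.fftSize);
function update() {
  analyser.getFloatTimeDomainData(data);
  // RMS of the current frame as a crude loudness estimate
  let sum = 0;
  for (const v of data) sum += v * v;
  const rms = Math.sqrt(sum / data.length);
  // map loudness to mouth openness, clamped to 0..1
  const mouthY = Math.min(1, rms * 10);
  model.internalModel.coreModel.setParameterValueById('ParamMouthOpenY', mouthY);
  requestAnimationFrame(update);
}
audio.play();
update();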

stevelizcano commented:

This is great work. Is there an example in this PR using the changes you have? Going to try and integrate it if it's possible.

My goal is to use real time lip syncing with the OpenAI realtime API. Would be pretty cool, but not sure if possible.

tegnike commented Jan 9, 2025:

@stevelizcano

Here is a fork with lipsync from @RaSan147: https://github.com/RaSan147/pixi-live2d-display

And I made a character chat app including the Realtime API by using this fork.
Please try it: https://github.com/tegnike/aituber-kit

liyao1520 commented:

This is great work. Is there an example in this PR using the changes you have? Going to try and integrate it if it's possible.

My goal is to use real time lip syncing with the OpenAI realtime API. Would be pretty cool, but not sure if possible.

I have published an npm package to implement live2d motionsync.

GitHub: https://github.com/liyao1520/live2d-motionSync
npm: https://www.npmjs.com/package/live2d-motionsync
Demo: https://liyao1520.github.io/live2d-motionSync/

RaSan147 (Author) commented:

This is great work. Is there an example in this PR using the changes you have? Going to try and integrate it if it's possible.
My goal is to use real time lip syncing with the OpenAI realtime API. Would be pretty cool, but not sure if possible.

I have published an npm package to implement live2d motionsync.

GitHub: https://github.com/liyao1520/live2d-motionSync
npm: https://www.npmjs.com/package/live2d-motionsync
Demo: https://liyao1520.github.io/live2d-motionSync/

This is really interesting, I'd love to give it a try.

RaSan147 (Author) commented:

This is great work. Is there an example in this PR using the changes you have? Going to try and integrate it if it's possible.

My goal is to use real time lip syncing with the OpenAI realtime API. Would be pretty cool, but not sure if possible.

There are examples in the middle of the readme; you can use the model with realtime-generated voice output (Google voice, OpenAI, or Edge TTS) to do lipsync. That's the basics; you can even do expressions if you tell the AI what motion and expression to show and parse its output.
https://github.com/RaSan147/VoiceAI-Asuna/blob/main/src/page/script_bot.js
This project of mine uses almost realtime output.
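A minimal sketch of that flow, assuming the TTS backend returns audio bytes and using the fork's speak entry point (ttsUrl and model are assumed to be in scope; exact option names may differ):

// fetch synthesized audio from any TTS backend and hand it to the model as an object URL
const audioBlob = await fetch(ttsUrl).then((r) => r.blob());
const audioUrl = URL.createObjectURL(audioBlob);
model.speak(audioUrl, {
  onFinish: () => URL.revokeObjectURL(audioUrl), // release the blob once playback ends
});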

DominicStewart commented:

@stevelizcano

Here is a fork with lipsync from @RaSan147: https://github.com/RaSan147/pixi-live2d-display

And I made a character chat app including the Realtime API by using this fork.

Please try it: https://github.com/tegnike/aituber-kit

Does this allow you to do lip sync using streamed audio rather than just audio files? The problem with this branch, as I began exploring it, is that it basically requires an audio file. If you want to do something with AI, audio is generated from text-to-speech models, and you'd need to essentially put it in a file and give it to this library. Being able to stream audio data to the library and have the lips move is necessary for most use cases.

RaSan147 (Author) commented:

@stevelizcano
Here is a fork with lipsync from @RaSan147: https://github.com/RaSan147/pixi-live2d-display
And I made a character chat app including the Realtime API by using this fork.
Please try it: https://github.com/tegnike/aituber-kit

Does this allow you to do lip sync using streamed audio rather than just audio files? The problem with this branch, as I began exploring it, is that it basically requires an audio file. If you want to do something with AI, audio is generated from text-to-speech models, and you'd need to essentially put it in a file and give it to this library. Being able to stream audio data to the library and have the lips move is necessary for most use cases.

I'll look into it... do you have any code that can feed the audio stream to a function?
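For illustration only, one way streamed audio could be fed in with standard Web Audio APIs (not something this PR implements; mediaStream would come from, e.g., WebRTC or the Realtime API output):

const ctx = new AudioContext();
const source = ctx.createMediaStreamSource(mediaStream);
const analyser = ctx.createAnalyser();
source.connect(analyser);
// then sample analyser.getFloatTimeDomainData() each frame, as in the amplitude sketch above,
// and write the result to ParamMouthOpenY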
