Add genai notebook for whisper scenario #2406
Conversation
openvino.genai was built from PR openvinotoolkit/openvino.genai#883 and installed this way:
Force-pushed from 52fee7c to a905cb0
eaidova commented on 2024-09-25T15:29:13Z (ReviewNB): I'm not sure the torch version restriction is still relevant. We set it due to a torch 2.4 Windows issue in the optimum inference pipeline; for genai it does not apply.
eaidova commented on 2024-09-25T15:29:13Z (ReviewNB): Please add a check that the file is already downloaded.
sbalandi commented on 2024-09-26T22:43:15Z: done
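The requested check amounts to skipping the download when the file is already on disk. A minimal sketch, assuming a plain HTTP fetch via the standard library (the `download_once` helper name is hypothetical, not the notebook's actual code):

```python
from pathlib import Path
import urllib.request

def download_once(url: str, dest: Path) -> Path:
    """Download url to dest only if dest does not already exist."""
    if not dest.exists():
        # Fetch only on the first run; later runs reuse the cached file.
        urllib.request.urlretrieve(url, dest)
    return dest
```

Re-running the notebook cell then becomes a no-op for files that were already fetched.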
eaidova commented on 2024-09-25T15:29:14Z (ReviewNB): I think it makes sense to demonstrate data reading without hf datasets too (it is not obvious how to prepare input data for this case).
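Reading input without hf datasets can be done with the standard library alone. A sketch, assuming 16-bit PCM wav input (`read_wav` is a hypothetical helper for illustration, not the notebook's code):

```python
import struct
import wave

def read_wav(path: str):
    """Read a 16-bit PCM wav file into floats in [-1, 1] plus the sample rate."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "sketch assumes 16-bit PCM"
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)
        # '<h' = little-endian signed 16-bit; one value per channel per frame.
        samples = struct.unpack(f"<{n_frames * wf.getnchannels()}h", raw)
        rate = wf.getframerate()
    # Normalize to the float range speech pipelines typically expect.
    return [s / 32768.0 for s in samples], rate
```

The returned float list and sample rate are the kind of raw-speech input a Whisper pipeline consumes.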
eaidova commented on 2024-09-25T15:29:15Z (ReviewNB): Line #5. Can we disable the warning or somehow preformat the output? It is hard to understand where the warning text finishes and the answer starts.
sbalandi commented on 2024-09-26T22:58:09Z: fixed
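One way to keep warning text out of the answer is to suppress Python warnings around the call. A minimal sketch (`quiet_call` and `noisy_transcribe` are hypothetical names used for illustration; the notebook's actual fix may differ):

```python
import warnings

def quiet_call(fn, *args, **kwargs):
    # Run fn with Python warnings suppressed so its return value
    # is not interleaved with warning text in the cell output.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return fn(*args, **kwargs)

def noisy_transcribe():
    # Stands in for a library call that emits deprecation warnings.
    warnings.warn("deprecated argument")
    return "hello world"
```

The filter is restored when the `with` block exits, so warnings elsewhere in the notebook are unaffected.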
eaidova commented on 2024-09-25T15:29:16Z (ReviewNB): I'm not sure that weight compression makes sense for whisper. It benefits LLMs because they have large weights stored in matmul weights; even whisper-large has fewer than 1B parameters. Please check the optimum-based notebook to see the suggested quantization approach.
eaidova commented on 2024-09-25T15:29:17Z (ReviewNB): Line #3. device = device_widget(default="CPU", exclude=["AUTO", "GPU"]). Is GPU really not supported?
sbalandi commented on 2024-09-26T22:44:12Z: removed the excluding
eaidova commented on 2024-09-30T16:32:20Z: I believe for now we still need device excluding for NPU (until a static pipeline for NPU is added in GenAI)
sbalandi commented on 2024-09-30T19:14:16Z: added
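The exclude behavior discussed here amounts to filtering the device list before building the widget. A minimal sketch of the idea (`filter_devices` is a hypothetical stand-in for `device_widget`'s exclude logic, not its real implementation):

```python
def filter_devices(available, exclude=("NPU",)):
    # Drop devices whose name starts with an excluded prefix,
    # e.g. keep NPU out until GenAI gains a static NPU pipeline.
    return [d for d in available if not any(d.startswith(e) for e in exclude)]
```

Prefix matching lets a single entry like `"NPU"` also cover enumerated names such as `"NPU.0"`.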
eaidova commented on 2024-09-25T15:29:17Z (ReviewNB): Line #2. import gradio as gr. I think the goal of the gradio helper is to hide all the logic the gradio demo needs; maybe it is better to move this into gradio_helper?
aleksandr-mokrov commented on 2024-09-25T16:34:27Z (ReviewNB): For clarity, it should be noted that the translation will be into English and that you can specify the source language.
changed to reading a wav file
as-suvorov commented on 2024-09-30T10:57:05Z (ReviewNB): Should we provide meaningful data in the output instead of an exception?
eaidova commented on 2024-09-30T16:34:46Z: As there is a significant accuracy drop related to small models, I think it may be better not to demonstrate accuracy validation (possibly it can be improved by changing the smooth_quant_alpha parameter for quantization).
sbalandi commented on 2024-09-30T19:16:46Z: changed smooth_quant_alpha to 0.80 for the encoder, thank you!
as-suvorov commented on 2024-09-30T10:57:06Z (ReviewNB): Debug print?
Force-pushed from 3b5ddad to bb5604b
eaidova commented on 2024-09-30T16:35:00Z (ReviewNB): Line #3. %pip install --pre -U openvino openvino-tokenizers openvino-genai --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly -q is missing in this line.
sbalandi commented on 2024-09-30T19:14:05Z: added
Force-pushed from 52ace1d to ab12094
aleksandr-mokrov commented on 2024-10-01T11:02:01Z (ReviewNB): Line #21. quantized_model_path = Path(f"{model_path}_quantized"). "-quantized" is better, otherwise the path will be whisper-large-v3_quantized.
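The naming issue is easy to see by comparing the two suffix styles (whisper-large-v3 is used here only as an example model directory name):

```python
from pathlib import Path

model_path = Path("whisper-large-v3")

# Underscore suffix mixes naming styles with the hyphenated model name:
underscore_path = Path(f"{model_path}_quantized")  # whisper-large-v3_quantized

# Hyphen suffix matches the model's own naming convention:
hyphen_path = Path(f"{model_path}-quantized")      # whisper-large-v3-quantized
```

The hyphenated form keeps the quantized output directory consistent with how the model directory itself is named.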
Task CVS-147995