Replies: 6 comments
- r-wei
- kdavis
- yv001
- kdavis
- r-wei
- lissyx
-
>>> r-wei
[February 4, 2018, 9:18pm]
Hi, I'm using the deepspeech Python package to run speech-to-text
inference on a wav file. I see output like this:
Loading model from file models/output_graph.pb
Loaded model in 0.249s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 1.428s.
Running inference.
yes
Inference took 9.582s for 5.000s audio file.
Does this mean that the model predicted that my wav file contained the
word 'yes'? Is there an estimated confidence/accuracy score on this
prediction? Were any other files with prediction information created?
Is it possible to get timestamps on the predicted text? For example, the
model predicted that the wav file contained someone saying the word
'yes' starting at 1.000 sec and ending at 1.010 sec.
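(For later readers: newer DeepSpeech releases expose a per-transcript confidence score and per-character timing through Model.sttWithMetadata(), which returns character tokens, each with a start time. The sketch below shows how such tokens can be grouped into word-level timestamps; the Token dataclass is a stand-in for deepspeech's TokenMetadata so the logic can run without a trained model, and the example timings are made up.)

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Token:
    """Stand-in for deepspeech's TokenMetadata (hypothetical mock)."""
    text: str          # a single character, e.g. "y", "e", "s", or " "
    start_time: float  # seconds from the start of the audio

def words_with_timestamps(tokens: List[Token]) -> List[Tuple[str, float, float]]:
    """Group character tokens into (word, start_sec, end_sec) triples."""
    words, current, start, prev_end = [], "", 0.0, 0.0
    for tok in tokens:
        if tok.text == " ":
            if current:  # close off the word we were building
                words.append((current, start, prev_end))
            current = ""
        else:
            if not current:  # first character of a new word
                start = tok.start_time
            current += tok.text
            prev_end = tok.start_time
    if current:  # flush the final word
        words.append((current, start, prev_end))
    return words

# Made-up timings for a file where "yes" starts around 1.0 s:
tokens = [Token("y", 1.00), Token("e", 1.04), Token("s", 1.08)]
print(words_with_timestamps(tokens))  # [('yes', 1.0, 1.08)]
```

In the real API the tokens would come from metadata.transcripts[0].tokens, and that transcript object also carries the confidence score asked about above.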
[This is an archived TTS discussion thread from discourse.mozilla.org/t/what-prediction-information-is-available-from-deepspeech-inference]