Is it possible to transcribe only a section of an audio file? #340

orionflame · 2024-04-10T12:59:12Z

Hi,

Basically, I want to identify the timing of the first word, so I only need to transcribe the first 2 minutes, for example. Is it possible to tell stable-ts to use only that portion of the audio?

Otherwise I have to generate temporary clipped audio files just for stable-ts.

Thanks a lot in advance.

jianfch (Maintainer) · 2024-04-11T06:41:01Z

There is no out-of-the-box way to do this with stable-ts, but you can load just the first 120 seconds of audio and then transcribe that chunk.
Use AudioLoader to avoid loading the entire audio track.

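import stable_whisper
from stable_whisper.audio import AudioLoader

# Buffer only 120 seconds of audio instead of loading the whole track.
audio_loader = AudioLoader('audio.mp3', buffer_size='120s')
audio_chunk = audio_loader.next_chunk(0)  # read from the start of the audio
audio_loader.terminate()  # release the loader once the chunk has been read

assert audio_chunk is not None and audio_chunk.shape[-1] > 0, 'empty audio chunk'
model = stable_whisper.load_model('base')
result = model.transcribe(audio_chunk)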
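
Since the goal in the question is the timing of the first word, here is a minimal sketch of how that could be read from the result. It assumes the result has been converted to a Whisper-style dict (for example with `result.to_dict()`), containing a `segments` list whose entries carry a `words` list with `word` and `start` keys. The helper name `first_word_start` and the sample data are hypothetical and only illustrate the lookup; they are not part of stable-ts.

```python
from typing import Optional


def first_word_start(result: dict) -> Optional[float]:
    """Return the start time (in seconds) of the first non-empty word, or None."""
    for segment in result.get("segments", []):
        # A segment may have no word-level timings; treat that as an empty list.
        for word in segment.get("words") or []:
            if word.get("word", "").strip():
                return float(word["start"])
    return None


# Hypothetical sample shaped like a Whisper-style result dict (assumed layout).
sample = {
    "segments": [
        {
            "start": 3.2,
            "end": 5.0,
            "words": [
                {"word": " Hello", "start": 3.42, "end": 3.80},
                {"word": " world", "start": 3.80, "end": 4.30},
            ],
        },
    ]
}

print(first_word_start(sample))
```

Because the chunk above is read from the start of the audio, the returned time should line up with the original file; a chunk taken from a later offset would need that offset added back.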
