Skip to content

Using Baidu ASR auto-generating subtitles for any video file. 使用百度短语音识别技术为视频或音频生成字幕。

License

Notifications You must be signed in to change notification settings

RyouMon/autosub-baidu

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Autosub-baidu

Auto-generated subtitles for any video

Autosub 是用于自动语音识别和字幕生成的实用程序。

但是这个项目已经不再维护了,而且在中国使用有些困难。 于是我克隆了这个项目的代码,在内部把Google Web Speech API替换为了百度短语音识别API

项目结构也做出了一些修改,使其即可以在命令行使用,也可以在其他代码中导入使用(主要目的是方便在代码中使用)。语言也改为了仅支持中文。

运行参数有些不一样,要想顺利使用这个新项目你必须提供API_KEYSECRET_KEYAPP_ID。这里贴出了一些链接,可以跟随这些链接去自行探索:

  1. 短语音识别标准版产品首页
  2. 百度AI开放平台语音技术新手入门 (通过阅读可以了解如何获得API_KEYSECRET_KEYAPP_ID

Autosub is a utility for automatic speech recognition and subtitle generation.

But this project is no longer maintained, and it is somewhat difficult to use in China. So I cloned the code of this project and replaced Google Web Speech API with Baidu Short Speech Recognition API internally.

Now Autosub-baidu only supports Chinese.

安装 Installation

  1. 安装 ffmpeg.
  2. 运行 pip install autosub-baidu.

用法 Usage

从命令行使用(command line):

python -m autosubb -h
usage: __main__.py [-h] [-C CONCURRENCY] [-o OUTPUT] [-F FORMAT] [-L LANG] [-K API_KEY] [-S SECRET_KEY] [-A APP_ID]
                   [--list-formats] [--list-languages]
                   [source_path]

positional arguments:
  source_path           Path to the video or audio file to subtitle

optional arguments:
  -h, --help            show this help message and exit
  -C CONCURRENCY, --concurrency CONCURRENCY
                        Number of concurrent API requests to make
  -o OUTPUT, --output OUTPUT
                        Output path for subtitles (by default, subtitles are saved in the same directory and name as
                        the source path)
  -F FORMAT, --format FORMAT
                        Destination subtitle format
  -L LANG, --lang LANG  Language spoken in source file. default 1537.
  -K API_KEY, --api-key API_KEY
                        The Baidu Cloud API key to be used.
  -S SECRET_KEY, --secret-key SECRET_KEY
                        The Baidu Cloud Secret Key to be used.
  -A APP_ID, --app-id APP_ID
                        The Baidu Cloud AppID to be used.
  --list-formats        List all available subtitle formats
  --list-languages      List all available source/destination languages

从代码使用:

import sys

from autosubb import generate_subtitles

APP_ID = '2*****0'
API_KEY = 'W********************o'
SECRET_KEY = 'X******************************5'

filename = 'audio.mp3'


def main():
    subtitles = generate_subtitles(
        source_path=filename,
        app_id=APP_ID,
        api_key=API_KEY,
        secret_key=SECRET_KEY,
        concurrency=2,
        dev_pid='80001',
    )
    print(subtitles)


if __name__ == '__main__':
    sys.exit(main())
[((2.3040000000000003, 4.608000000000002), '从前有一对仙人夫妻。'),
 ((5.376000000000003, 7.424000000000005), '他们常常到山顶上下棋。'),
...
 ((56.83200000000004, 59.392000000000046), '猴子的眼睛一直盯着这盘水蜜桃。'),
 ((60.41600000000005, 63.48800000000005), '战胜对手,其实就是战胜对手的弱点。')]

从代码中生成SRT文件

generate_subtitles函数中指定subtitle_file_format参数:

import sys

from autosubb import generate_subtitles

APP_ID = '2*****0'
API_KEY = 'W********************o'
SECRET_KEY = 'X******************************5'

filename = 'audio.mp3'


def main():
    subtitles = generate_subtitles(
        source_path=filename,
        app_id=APP_ID,
        api_key=API_KEY,
        secret_key=SECRET_KEY,
        concurrency=2,
        dev_pid='80001',
        subtitle_file_format='srt',
    )

    with open('audio.srt', 'wb') as out:
        out.write(subtitles.encode('utf-8'))


if __name__ == '__main__':
    sys.exit(main())

如果你想拿到列表形式的字幕(做一些事情),并在稍后保存为SRT文件, 你可以使用srt_formatter函数:

import sys

from autosubb import generate_subtitles
from autosubb import formatters

APP_ID = '2*****0'
API_KEY = 'W********************o'
SECRET_KEY = 'X******************************5'

filename = 'audio.mp3'


def main():
    subtitles = generate_subtitles(
        source_path=filename,
        app_id=APP_ID,
        api_key=API_KEY,
        secret_key=SECRET_KEY,
        concurrency=2,
        dev_pid='80001',
    )

    print(subtitles)  # do something

    with open('audio.srt', 'wb') as out:
        content = formatters.srt_formatter(subtitles)
        out.write(content.encode('utf-8'))


if __name__ == '__main__':
    sys.exit(main())

License

MIT

About

Using Baidu ASR auto-generating subtitles for any video file. 使用百度短语音识别技术为视频或音频生成字幕。

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages