3881-Hours-Mandarin-Spontaneous-Speech-Data

Description

This dataset contains 3,881 hours of Mandarin spontaneous speech covering multiple subjects. All speech audio was manually transcribed into text; speaker identity, gender, and other attributes are also annotated. The dataset can be used for voiceprint recognition model training, corpus construction for machine translation, algorithm research, etc.

For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1024?source=Github

Specifications

Format

16 kHz, 16-bit, WAV, mono channel
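
As a quick sanity check, the format stated above can be verified with Python's standard `wave` module. This is a minimal sketch, not part of the dataset tooling; the function name `matches_spec` is hypothetical, and it assumes the files are uncompressed PCM WAV that `wave` can read.

```python
import wave

# Expected values taken from the Format section above.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1  # mono


def matches_spec(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono.

    `matches_spec` is a hypothetical helper name used for illustration.
    """
    with wave.open(str(path), "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

For example, `matches_spec("01_06097_0001.wav")` could be run on one of the sample files in this repository.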

Content Category

Interview; Sports; Variety; Course; Entertainment; Service; etc.

Annotation

Transcription text, speaker identification, and gender

Language

Mandarin

Accuracy

Sentence Accuracy Rate (SAR) of no less than 95%
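
This README does not define how SAR is computed. The sketch below assumes one common reading: the fraction of sentences whose transcription matches a reference exactly. The function name `sentence_accuracy_rate` and the exact-match criterion are assumptions for illustration, not the vendor's documented method.

```python
def sentence_accuracy_rate(references, transcriptions):
    """Fraction of sentences transcribed exactly right.

    Assumption: a sentence counts as correct only if it matches the
    reference exactly after trimming surrounding whitespace.
    """
    if len(references) != len(transcriptions):
        raise ValueError("references and transcriptions must have equal length")
    if not references:
        raise ValueError("at least one sentence is required")
    correct = sum(
        1
        for ref, hyp in zip(references, transcriptions)
        if ref.strip() == hyp.strip()
    )
    return correct / len(references)
```

Under this assumption, a SAR of no less than 95% means at most 5 sentences in every 100 contain any transcription error.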

Application scenarios

Speech recognition, video caption generation, and video content review

Licensing Information

Commercial License

Sample files

01_06097_0001.txt
01_06097_0001.wav
07_07849_0071.txt
07_07849_0071.wav
07_07849_0131.txt
07_07849_0131.wav
07_07849_0170.txt
07_07849_0170.wav
07_07849_0286.txt
07_07849_0286.wav
README.md
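
The sample files listed above come in pairs that share a base name: a `.wav` recording and a `.txt` file that presumably holds its transcription. A minimal sketch for collecting those pairs follows; the function name `pair_audio_with_transcripts` is hypothetical, and the sketch does not read the transcript contents because their encoding and layout are not documented here.

```python
from pathlib import Path


def pair_audio_with_transcripts(folder):
    """Return sorted (wav_path, txt_path) tuples for files sharing a base name.

    WAV files without a matching .txt file are skipped.
    """
    folder = Path(folder)
    pairs = []
    for wav_path in sorted(folder.glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if txt_path.exists():
            pairs.append((wav_path, txt_path))
    return pairs
```

Calling `pair_audio_with_transcripts(".")` from the repository root would be expected to yield five pairs for the sample files listed above.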

Nexdata-AI/3881-Hours-Mandarin-Spontaneous-Speech-Data
