The 163 Hours - Russian Child's Spontaneous Speech Data is a collection of speech clips covering multiple topics. All of the audio was manually transcribed into text; speaker identity, gender, and other attributes are also annotated. This dataset can be used for voiceprint recognition model training, corpus construction for machine translation, and algorithm research.
For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1308?source=Github
Format: 16kHz, 16bit, mono channel
Age: children aged 12 and under
Content category: interview, self-media, variety show, etc.
Language: Russian
Annotation: transcription text, speaker identification, gender
Application scenarios: video caption generation and video content review
Accuracy: Word Accuracy Rate (SAR) of no less than 98%
License: Commercial License
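
As a quick sanity check on the audio format listed above (16kHz, 16bit, mono channel), the following minimal sketch inspects a clip's header using Python's standard `wave` module. It assumes the clips are delivered as uncompressed WAV files, which this description does not state; the function name `matches_spec` is hypothetical and used only for illustration.

```python
import wave

# Expected values taken from the format listed above.
EXPECTED_RATE_HZ = 16000          # 16kHz
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16bit
EXPECTED_CHANNELS = 1             # mono channel


def matches_spec(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono.

    Assumption: the file is an uncompressed WAV; other containers
    would need a different reader.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

Calling `matches_spec("clip.wav")` on each file before training is one way to catch clips that were resampled or re-encoded after delivery.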