Segments a .mp3 file into several smaller audio clips using an accompanying .srt closed captioning file.

AlanLiu96/srt-parse

 
 

srt-parse

Segments an audio file into several smaller audio clips using an accompanying .srt closed captioning file.

Usage

usage: srt-parse [-h] [--output-dir OUTPUT_DIR]
             [--audio-out-file-pattern AUDIO_OUT_FILE_PATTERN]
             [--text-out-file-pattern TEXT_OUT_FILE_PATTERN]
             [--output-type {txt,csv}] [--csv-seperator CSV_SEPERATOR]
             [--csv-filename CSV_FILENAME]
             [--update-increment UPDATE_INCREMENT]
             [--in-encoding IN_ENCODING] [--out-encoding OUT_ENCODING]
             audio_input srt_input

Segment audio files according to a provided .srt closed caption file

positional arguments:
  audio_input           Location of audio file to be processed
  srt_input             Location of .srt file to be processed

optional arguments:
  -h, --help            show this help message and exit
  --output-dir OUTPUT_DIR
                        Directory for processed files to be saved to
  --audio-out-file-pattern AUDIO_OUT_FILE_PATTERN
                        A python-style f-string for saving audio files
  --text-out-file-pattern TEXT_OUT_FILE_PATTERN
                        A python-style f-string for saving text files
  --output-type {txt,csv}
                        Output filetype
  --csv-seperator CSV_SEPERATOR
                        Character sequence used to seperate values in csv
  --csv-filename CSV_FILENAME
                        Name of file to write as csv
  --update-increment UPDATE_INCREMENT
                        Print progress after every specified amount of
                        segments.
  --in-encoding IN_ENCODING
                        Encoding used to read the .srt file
  --out-encoding OUT_ENCODING
                        Encoding to use when writing text data to file

Example

Using srt-parse:

python3 srt-parse.py foo.mp3 foo.srt

This will produce the following files in the output directory (by default .\out\):

0-audio.mp3
1-audio.mp3
2-audio.mp3
3-audio.mp3
...
out.csv

One audio file is produced per subtitle in the .srt file, and out.csv maps each audio file to its transcript.
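As a rough illustration of the per-subtitle segmentation, the sketch below parses .srt text into cues with millisecond offsets and pairs each cue with an output filename. It is not taken from srt-parse.py: the names `Cue`, `parse_srt`, and `audio_rows` are hypothetical, the two-column row layout is an assumption about out.csv, and the actual audio slicing is left out.

```python
import re
from typing import List, NamedTuple, Tuple


class Cue(NamedTuple):
    index: int      # subtitle number as written in the .srt file
    start_ms: int   # start offset in milliseconds
    end_ms: int     # end offset in milliseconds
    text: str       # transcript text for this subtitle


# Matches a timing line such as "00:00:03,830 --> 00:00:12,910".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content: str) -> List[Cue]:
    """Parse .srt text into cues; blocks that do not look like cues are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.replace("\r\n", "\n").strip()):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        if len(lines) < 2 or not lines[0].isdigit():
            continue
        match = _TIMING.match(lines[1])
        if match is None:
            continue
        parts = match.groups()
        cues.append(
            Cue(int(lines[0]), _to_ms(*parts[:4]), _to_ms(*parts[4:]), " ".join(lines[2:]))
        )
    return cues


def audio_rows(cues: List[Cue]) -> List[Tuple[str, str]]:
    """Pair each cue with a filename numbered from 0 (assumed row layout)."""
    return [(f"{i}-audio.mp3", cue.text) for i, cue in enumerate(cues)]


SAMPLE = """1
00:00:03,830 --> 00:00:12,910
I'll say to you it's a hot wonderful night

3
00:00:10,840 --> 00:00:14,889
I want to thank the people who came up to me
"""

cues = parse_srt(SAMPLE)
rows = audio_rows(cues)
```

Each cue's `start_ms` and `end_ms` give the bounds of one audio clip, and `rows` holds the filename and transcript pairs that a CSV writer could emit.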

Notes

YouTube subtitles can have a duration that spans two lines of text, so the time ranges of consecutive entries may overlap.

Example:

1
00:00:03,830 --> 00:00:12,910 // this duration spans both written lines below
I'll say to you it's a hot wonderful night

3
00:00:10,840 --> 00:00:14,889
I want to thank the people who came up to me
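One possible way to handle such overlapping entries before segmenting is to end each entry where the next one begins. The sketch below is only an illustration and does not describe what srt-parse does; `clamp_overlaps` is a hypothetical helper, and entries are assumed to be `(start_ms, end_ms, text)` tuples.

```python
from typing import List, Tuple

Entry = Tuple[int, int, str]  # (start_ms, end_ms, text), an assumed representation


def clamp_overlaps(entries: List[Entry]) -> List[Entry]:
    """Shorten each entry so it ends no later than the start of the next one."""
    ordered = sorted(entries, key=lambda entry: entry[0])
    clamped = []
    for i, (start, end, text) in enumerate(ordered):
        if i + 1 < len(ordered):
            # The next entry's start caps this entry's end.
            end = min(end, ordered[i + 1][0])
        clamped.append((start, end, text))
    return clamped


# The two entries from the example above, converted to milliseconds.
entries = [
    (3830, 12910, "I'll say to you it's a hot wonderful night"),
    (10840, 14889, "I want to thank the people who came up to me"),
]
clamped = clamp_overlaps(entries)
```

With the example entries, the first clip would then end at 00:00:10,840 instead of 00:00:12,910, so the two clips no longer share audio.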
