Fix ffmpeg subtitle overlay #81

Open
wants to merge 1 commit into
base: main

Conversation

Mashinow

This update fixes the ffmpeg subtitle overlay, so you can add subtitles to your videos right away. I tested it only on Windows 10 and can't guarantee that it works on other systems.

what-the-diff bot commented May 20, 2024

PR Summary

  • Improved File Handling in 'cli.py'
    The changes introduce a more robust way to get the filename without its extension. Previously the base filename was extracted, but not in a consistent way. The code now uses a dedicated function that ignores the file extension.

  • Streamline Video Stream Variable Naming
    We've improved code consistency by renaming the video stream variable from 'video' to 'stream'. All code lines that previously referred to 'video' now correctly refer to 'stream'.

  • Advanced Subtitles Addition to Video Stream
    Modifications have been made to the process of adding subtitles, making it easier and more efficient. By reshaping the code, it's now more direct and intuitive to add subtitles to the video stream.

  • Updated Video Saving Process
    The previous way of saving the subtitled video has been replaced. The video is now saved by combining ffmpeg.output() and ffmpeg.run() with the specific options required. This change makes the video saving process more efficient.
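
As a rough illustration of the file handling bullet, the sketch below shows one way to drop the extension from a file name and derive an output path from it. It is not taken from this PR's diff: `filename_stem`, `subtitled_output_path`, and the `output_dir` parameter are hypothetical names, and the `.mp4` suffix is an assumption based on the output path that appears in the ffmpeg log later in this thread.

```python
import os


def filename_stem(path: str) -> str:
    """Return the file name without its directory and without its last extension."""
    # basename() drops the directory part; splitext() splits off only the
    # final extension, so "talk.part1.webm" becomes "talk.part1".
    return os.path.splitext(os.path.basename(path))[0]


def subtitled_output_path(input_path: str, output_dir: str = ".") -> str:
    """Build the path of the subtitled video inside output_dir (hypothetical helper)."""
    # Assumption: the subtitled video is always written as an .mp4 file.
    return os.path.join(output_dir, filename_stem(input_path) + ".mp4")


if __name__ == "__main__":
    print(subtitled_output_path("videos/talk.part1.webm", "subtitled"))
```

Only the last extension is removed, so names that contain extra dots or brackets keep everything before the final suffix.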

brokeDude2901 commented Nov 6, 2024

Working now on Windows 11.
Edit: I had forgotten to run pip uninstall auto-subtitle and then pip install git+https://github.com/Mashinow/auto-subtitle.git@f91096d3ade2fb18f866ab7133e97558028e794c

brokeDude2901 commented Nov 6, 2024

Edit: everything is fine and the audio works (the default Windows 11 media player won't play the audio, but Edge plays it normally).

PS C:\Users\vieta> python -m auto_subtitle.cli "C:\Users\vieta\George Carlin: They own you [MuO-2jkXP8Q].webm"
C:\Users\vieta\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\whisper\__init__.py:150: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(fp, map_location=device)
Extracting audio from George Carlin: They own you [MuO-2jkXP8Q]...
Generating subtitles for George Carlin: They own you [MuO-2jkXP8Q]... This might take a while.
Detected language: English
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17037/17037 [00:30<00:00, 567.18frames/s]
Adding subtitles to George Carlin: They own you [MuO-2jkXP8Q].webm...
ffmpeg version 7.0.2-full_build-www.gyan.dev Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 13.2.0 (Rev5, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      59.  8.100 / 59.  8.100
  libavcodec     61.  3.100 / 61.  3.100
  libavformat    61.  1.100 / 61.  1.100
  libavdevice    61.  1.100 / 61.  1.100
  libavfilter    10.  1.100 / 10.  1.100
  libswscale      8.  1.100 /  8.  1.100
  libswresample   5.  1.100 /  5.  1.100
  libpostproc    58.  1.100 / 58.  1.100
[Parsed_subtitles_0 @ 000002405a091380] libass API version: 0x1703000
[Parsed_subtitles_0 @ 000002405a091380] libass source: commit: 0.17.3-6-gc5bb87e2f5d6c18763b4614817c206a4f4d2332a
[Parsed_subtitles_0 @ 000002405a091380] Shaper: FriBidi 1.0.15 (SIMPLE) HarfBuzz-ng 9.0.0 (COMPLEX)
[Parsed_subtitles_0 @ 000002405a091380] Using font provider directwrite (with GDI)
Input #0, matroska,webm, from 'C:\Users\vieta\George Carlin: They own you [MuO-2jkXP8Q].webm':
  Metadata:
    ENCODER         : Lavf61.1.100
  Duration: 00:02:50.41, start: 0.000000, bitrate: 586 kb/s
  Stream #0:0(eng): Video: vp9 (Profile 0), yuv420p(tv, bt709), 1280x720, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn (default)
      Metadata:
        DURATION        : 00:02:50.366000000
  Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
      Metadata:
        DURATION        : 00:02:50.408000000
Stream mapping:
  Stream #0:0 (vp9) -> subtitles:default
  Stream #0:1 -> #0:0 (copy)
  subtitles:default -> Stream #0:1 (libx264)
Press [q] to stop, [?] for help
[Parsed_subtitles_0 @ 00000240588a6180] libass API version: 0x1703000
[Parsed_subtitles_0 @ 00000240588a6180] libass source: commit: 0.17.3-6-gc5bb87e2f5d6c18763b4614817c206a4f4d2332a
[Parsed_subtitles_0 @ 00000240588a6180] Shaper: FriBidi 1.0.15 (SIMPLE) HarfBuzz-ng 9.0.0 (COMPLEX)
[Parsed_subtitles_0 @ 00000240588a6180] Using font provider directwrite (with GDI)
[Parsed_subtitles_0 @ 00000240588a6180] fontselect: (Arial, 400, 0) -> ArialMT, 0, ArialMT
[libx264 @ 000002405a115800] using SAR=1/1
[libx264 @ 000002405a115800] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000002405a115800] profile High, level 3.1, 4:2:0, 8-bit
[libx264 @ 000002405a115800] 264 - core 164 r3191 4613ac3 - H.264/MPEG-4 AVC codec - Copyleft 2003-2024 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
[mp4 @ 000002405a723c80] track 0: codec frame size is not set
Output #0, mp4, to '.\George Carlin: They own you [MuO-2jkXP8Q].mp4':
  Metadata:
    encoder         : Lavf61.1.100
  Stream #0:0(eng): Audio: opus (Opus / 0x7375704F), 48000 Hz, stereo, fltp (default)
      Metadata:
        DURATION        : 00:02:50.408000000
  Stream #0:1: Video: h264 (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 30 fps, 15360 tbn
      Metadata:
        encoder         : Lavc61.3.100 libx264
      Side data:
        cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
[out#0/mp4 @ 000002405a723b80] video:17907KiB audio:2624KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.846858%
frame= 5111 fps=205 q=-1.0 Lsize=   20704KiB time=00:02:50.30 bitrate= 995.9kbits/s speed=6.82x
[libx264 @ 000002405a115800] frame I:22    Avg QP:17.15  size: 31675
[libx264 @ 000002405a115800] frame P:1334  Avg QP:20.64  size:  8391
[libx264 @ 000002405a115800] frame B:3755  Avg QP:22.82  size:  1716
[libx264 @ 000002405a115800] consecutive B-frames:  1.1%  1.4%  4.0% 93.4%
[libx264 @ 000002405a115800] mb I  I16..4: 14.7% 73.9% 11.4%
[libx264 @ 000002405a115800] mb P  I16..4:  3.8% 11.6%  0.5%  P16..4: 36.0%  7.3%  2.4%  0.0%  0.0%    skip:38.4%
[libx264 @ 000002405a115800] mb B  I16..4:  0.2%  0.3%  0.0%  B16..8: 29.5%  1.0%  0.1%  direct: 0.6%  skip:68.3%  L0:50.8% L1:48.1% BI: 1.1%
[libx264 @ 000002405a115800] 8x8 transform intra:71.9% inter:90.7%
[libx264 @ 000002405a115800] coded y,uvDC,uvAC intra: 22.7% 35.3% 9.0% inter: 3.6% 6.1% 0.0%
[libx264 @ 000002405a115800] i16 v,h,dc,p: 43% 34% 10% 13%
[libx264 @ 000002405a115800] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29% 18% 39%  2%  3%  3%  3%  3%  2%
[libx264 @ 000002405a115800] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 35% 27% 14%  3%  5%  5%  5%  4%  3%
[libx264 @ 000002405a115800] i8c dc,h,v,p: 56% 21% 19%  4%
[libx264 @ 000002405a115800] Weighted P-Frames: Y:0.2% UV:0.1%
[libx264 @ 000002405a115800] ref P L0: 65.5% 10.3% 19.1%  5.1%  0.0%
[libx264 @ 000002405a115800] ref B L0: 91.3%  7.4%  1.3%
[libx264 @ 000002405a115800] ref B L1: 96.9%  3.1%
[libx264 @ 000002405a115800] kb/s:860.99
Saved subtitled video to C:\Users\vieta\George Carlin: They own you [MuO-2jkXP8Q].mp4.
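
The stream mapping in the log above shows the Opus audio track being copied unchanged into the MP4 container, which may be why some players stay silent while others play it. The following is a hedged sketch, not part of this PR, of how an ffmpeg command line that re-encodes the audio could be assembled. `build_burn_in_command` and its parameters are hypothetical names. The flags used (`-y`, `-i`, `-vf`, `-c:v`, `-c:a`) are standard ffmpeg options, and choosing AAC as the audio codec is an assumption about what the target player supports.

```python
from typing import List

# Characters with special meaning in ffmpeg filter syntax. Escaping them is
# out of scope for this sketch, so such paths are rejected instead.
_UNSAFE_FILTER_CHARS = set(":\\',;[]")


def build_burn_in_command(
    video_path: str,
    srt_path: str,
    output_path: str,
    audio_codec: str = "aac",
) -> List[str]:
    """Build (but do not run) an ffmpeg argv list that burns subtitles into a video."""
    if any(ch in _UNSAFE_FILTER_CHARS for ch in srt_path):
        raise ValueError("srt_path needs filter escaping, which this sketch does not do")
    return [
        "ffmpeg",
        "-y",                            # overwrite the output file if it exists
        "-i", video_path,                # input video
        "-vf", f"subtitles={srt_path}",  # render the subtitles onto the video frames
        "-c:v", "libx264",               # video must be re-encoded after filtering
        "-c:a", audio_codec,             # "copy" would keep the original audio track
        output_path,
    ]


if __name__ == "__main__":
    print(" ".join(build_burn_in_command("input.webm", "subs.srt", "output.mp4")))
```

The list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg is on the PATH. Passing `audio_codec="copy"` reproduces the audio handling seen in the log.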
