fix: Solve CUDA AV1 decoding #448

hugo-ijw · 2025-01-09T13:29:40Z

fixes #443
Hello there, love the library!
It looks like there is an FFmpeg bug when selecting an AV1 decoder while specifying an HW decoder; a SW decoder is always chosen (cf. FFmpeg/FFmpeg@ad67ea9).

I also added small details to the contributing doc and an AV1 sample video for tests (I do not know if the GPU used in your CI/CD is compatible with CUDA AV1 decoding, let's see).

facebook-github-bot · 2025-01-09T13:29:48Z

Hi @hugo-ijw!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

facebook-github-bot · 2025-01-09T14:09:12Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

scotts · 2025-01-09T18:09:38Z

src/torchcodec/decoders/_core/CudaDevice.cpp

@@ -256,4 +256,36 @@ void convertAVFrameToDecodedOutputOnCuda(
          << " took: " << duration.count() << "us" << std::endl;
 }

+// inspired by https://github.com/FFmpeg/FFmpeg/commit/ad67ea9
+void forceCudaCodec(


Let's change the semantics of this function so that it returns the codec, if found. Then the caller is responsible for doing the assignment. That would make this signature:

std::optional<AVCodecPtr> findCudaCodec(const torch::Device& device, const AVCodecID& codecId);

Then, inside the function, when we find the right codec, we just return it. If we loop through all available codecs and never find it, we return std::nullopt.
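
As a rough illustration of those semantics, here is a minimal sketch that compiles without FFmpeg. The `CodecId`, `Codec`, and `CodecPtr` types and the `registry` parameter are hypothetical stand-ins for `AVCodecID`, `AVCodec`, `AVCodecPtr`, and FFmpeg's codec list. This is not the code from this PR, which takes a `torch::Device` and walks FFmpeg's registered codecs instead.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-ins so the sketch builds without FFmpeg headers.
enum class CodecId { H264, AV1 };

struct Codec {
  CodecId id;
  bool isDecoder;
  bool supportsCuda;
};

using CodecPtr = const Codec*;

// Returns the first CUDA-capable decoder for codecId, if one is registered.
// The caller decides what to do with the result; nothing is assigned here.
std::optional<CodecPtr> findCudaCodec(
    const std::vector<Codec>& registry,
    CodecId codecId) {
  for (const Codec& candidate : registry) {
    if (candidate.id == codecId && candidate.isDecoder &&
        candidate.supportsCuda) {
      return &candidate;
    }
  }
  // Looped through every codec without finding a match.
  return std::nullopt;
}
```

Returning an optional keeps the "not found" case explicit and leaves the fallback policy with the caller.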

scotts · 2025-01-09T18:11:29Z

src/torchcodec/decoders/_core/CudaDevice.cpp

+    const torch::Device& device,
+    AVCodecPtr* codec,
+    const AVCodecID& codecId) {
+  if (device.type() != torch::kCUDA) {


Replace this check with throwErrorIfNonCudaDevice(device). That's the convention used by the other functions in this file, and it also enforces that calling this function in a non-CUDA context is an error.
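
To make the difference concrete, here is a hedged sketch of a guard that throws instead of silently returning. `Device` and `DeviceType` are hypothetical stand-ins for `torch::Device`, and the body and error message are assumptions, not the actual helper in CudaDevice.cpp.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for torch::Device and its device type.
enum class DeviceType { CPU, CUDA };

struct Device {
  DeviceType type;
};

// Assumed shape of the guard: calling a CUDA-only function with any other
// device is treated as a programming error rather than a silent no-op.
void throwErrorIfNonCudaDevice(const Device& device) {
  if (device.type != DeviceType::CUDA) {
    throw std::runtime_error("Expected a CUDA device.");
  }
}
```

With an early `return`, a caller that passes a CPU device by mistake never notices; with the guard, the mistake surfaces at the call site.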

scotts · 2025-01-09T18:12:49Z

src/torchcodec/decoders/_core/CudaDevice.cpp

+    return;
+  }
+
+  const AVCodec* c;


Let's move this definition into the while loop condition itself. With the change in semantics, we no longer need to reference the AVCodec outside of a single loop iteration, so it's fine (and actually good) that its scope is limited to the while loop only.

scotts · 2025-01-09T18:15:33Z

src/torchcodec/decoders/_core/VideoDecoder.cpp

+
+  if (options.device.type() == torch::kCUDA) {
+    forceCudaCodec(
+        options.device, &codec, streamInfo.stream->codecpar->codec_id);


With the change in semantics, we can make this call:

codec = findCudaCodec(options.device, streamInfo.stream->codecpar->codec_id).value_or(codec);
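
The fallback behavior of `value_or` can be sketched in isolation as follows. `Codec`, `CodecPtr`, and `selectCodec` are hypothetical names used only for illustration; the sketch assumes the lookup has already produced an optional.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins for AVCodec and AVCodecPtr.
struct Codec {
  std::string name;
};

using CodecPtr = const Codec*;

// Keeps the default codec unless the CUDA lookup produced one.
CodecPtr selectCodec(CodecPtr defaultCodec, std::optional<CodecPtr> cudaCodec) {
  return cudaCodec.value_or(defaultCodec);
}
```

If the lookup returns `std::nullopt`, the previously selected codec is left unchanged, so the assignment is safe even when no CUDA decoder exists for the stream.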

scotts · 2025-01-09T18:18:05Z

test/decoders/test_video_decoder.py

+        # We don't parametrize with CUDA because the current GPUs on CI do not
+        # support AV1:
+        decoder = VideoDecoder(AV1_VIDEO.path, device="cpu")
+        ref_frame11 = AV1_VIDEO.get_frame_data_by_index(10)


The convention we're following in these tests is that the variable name number matches the index number, so this should be ref_frame10, ref_frame_info10 and decoded_frame10.

scotts · 2025-01-09T19:02:22Z

@hugo-ijw, thank you for digging into this and fixing it! I made a bunch of minor comments on code structure and style. But my main concern right now is that, based on the comment you made, it appears we don't have AV1 support on our GPUs in CI. Can you re-enable the CUDA test in your code so we can dig into what exactly is failing? Since CUDA support for AV1 is the main purpose of this PR, I want to spend some time seeing if we can actually test that scenario.

Unfortunately, it also looks like the CUDA tests may be failing because of some outdated actions. I'll dig into that myself.

scotts · 2025-01-09T19:48:32Z

PR #450 should solve the CUDA test problems.

hugo-ijw · 2025-01-10T08:52:26Z

Thanks for the comments, I'll try to work on them this morning.
Regarding the GPU tests, I believe it's the same case as h265 (#350): the GPU used in the CI/CD does not support AV1 decoding:
https://github.com/pytorch/torchcodec/actions/runs/12690884678/job/35383538150

scotts · 2025-01-10T17:06:48Z

@hugo-ijw, I dug into the problem with our runners a little bit in #451. That is, I tried enabling the H265 test on a different kind of runner, and notably, it gets the same error message you're fixing: https://github.com/pytorch/torchcodec/actions/runs/12713611963/job/35442310631?pr=451#step:16:364. But it doesn't get the error message about the decoder not being available. I'm going to push to your branch to see if that fixes the problem.

scotts · 2025-01-10T20:42:58Z

src/torchcodec/decoders/_core/CudaDevice.cpp

+// we have to do this because of an FFmpeg bug where hardware decoding is not
+// appropriately set, so we just go off and find the matching codec for the CUDA
+// device
+std::optional<AVCodecPtr> forceCudaCodec(


Nit: I think findCudaCodec is a more appropriate name now.

scotts · 2025-01-10T20:43:29Z

src/torchcodec/decoders/_core/CudaDevice.cpp

+    const AVCodecID& codecId) {
+  throwErrorIfNonCudaDevice(device);
+
+  AVCodecPtr c;


Nit: we can move this declaration into the while loop condition.

scotts · 2025-01-10T20:49:08Z

@hugo-ijw, good news! By changing the kinds of runners we use, I was able to get your new test to run on GPUs in our CI. Things look great now; the FFmpeg version 5 failures are a known problem. I'm okay merging as-is, but if you made the small style changes I'd greatly appreciate it.

scotts · 2025-01-13T11:30:06Z

Uh-oh, looks like I was wrong about the while loop declaration! I'll push a fix so we can merge.

fix: Solve CUDA av1 decoding

d92149c

facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 9, 2025

hugo-ijw added 2 commits January 9, 2025 16:51

test: Remove av1 cuda test

b4825a7

fix: Use AVCodecPtr in forceCudaCodec

7e6bc92

scotts reviewed Jan 9, 2025

hugo-ijw added 2 commits January 10, 2025 10:37

fix: Change findCudaCodec signature

a29287c

test: Follow frame index convention in AV1 test

5e82e48

scotts added 4 commits January 10, 2025 10:07

Merge branch 'main' of github.com:pytorch/torchcodec into fix/cuda-av1

70c1985

Make CUDA Linux test runner linux.g5.4xlarge.nvidia.gpu

a9fa4bb

Re-enable AV1 test on CUDA

d49fde5

Make sure device is general in test

25437b0

scotts reviewed Jan 10, 2025

scotts approved these changes Jan 10, 2025

hugo-ijw added 2 commits January 13, 2025 08:55

fix: Move AVCodecPtr declaration to while loop in forceCudaCodec

c64b833

fix: Rename forceCudaCodec to findCudaCodec

9a9b50f

Undo my incorrect suggestion

e453891

scotts merged commit d8dde5c into pytorch:main Jan 13, 2025
48 of 51 checks passed

scotts mentioned this pull request Jan 15, 2025

Enable h265 test on CUDA #451

Draft
