crf-search using worst sample only #202

Open

alexheretic opened this issue May 10, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@alexheretic
Owner

Add a mode where we use the worst sample's VMAF after sample-encode instead of an average across all samples.

If we assume the sample with the worst VMAF score at one crf would also be the worst sample at other crf values, we could speed up a crf-search by encoding all N samples only once, then re-encoding only that worst sample at the other crf values.

Perhaps a --sample-aggregate=worst arg for sample-encode/crf-search/auto-encode.
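In sketch form (illustrative Rust only; `SampleAggregate` and `aggregate` are placeholder names, not actual ab-av1 internals):

```rust
// Illustrative sketch only, not actual ab-av1 internals: how a
// --sample-aggregate option could reduce per-sample VMAF scores
// to the single score used by the crf-search comparison.
enum SampleAggregate {
    Mean,  // current behaviour: mean across all samples
    Worst, // proposed: the single lowest sample score
}

fn aggregate(sample_vmafs: &[f32], mode: SampleAggregate) -> f32 {
    match mode {
        SampleAggregate::Mean => {
            sample_vmafs.iter().sum::<f32>() / sample_vmafs.len() as f32
        }
        SampleAggregate::Worst => {
            sample_vmafs.iter().copied().fold(f32::INFINITY, f32::min)
        }
    }
}

fn main() {
    let scores = [95.2, 93.8, 96.1];
    println!("{}", aggregate(&scores, SampleAggregate::Worst)); // 93.8
}
```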

@alexheretic alexheretic added the enhancement New feature or request label May 10, 2024
@allrobot

Good suggestion. You could try a PR, or wait for the author to add this feature.

@veryprogram

Even better would be an auto-encode mode where the CRF is recalculated scene-by-scene by default, with an option for frame-by-frame, so that no part of the video drops below VMAF 95, but also no part is allocated more bitrate than absolutely necessary to achieve VMAF 95.
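Roughly what I have in mind, sketched (all the types and helpers here are imaginary, nothing that exists in ab-av1 or Av1an):

```rust
// Hypothetical sketch of per-scene quality targeting: find, per scene,
// the highest crf that still meets the VMAF target.
// Scene, detect_scenes, encode_scene and vmaf_of are imaginary helpers.
struct Scene;

fn detect_scenes(_input: &str) -> Vec<Scene> { vec![] }     // imaginary
fn encode_scene(_s: &Scene, _crf: u8) -> Vec<u8> { vec![] } // imaginary
fn vmaf_of(_encoded: &[u8]) -> f32 { 0.0 }                  // imaginary

fn per_scene_crfs(input: &str, target_vmaf: f32) -> Vec<u8> {
    detect_scenes(input)
        .iter()
        .map(|scene| {
            // Binary search, assuming VMAF falls as crf rises.
            let (mut lo, mut hi, mut best) = (10u8, 55u8, 10u8);
            while lo <= hi {
                let crf = (lo + hi) / 2;
                if vmaf_of(&encode_scene(scene, crf)) >= target_vmaf {
                    best = crf;   // target met: try a higher (cheaper) crf
                    lo = crf + 1;
                } else {
                    hi = crf - 1; // target missed: need a lower crf
                }
            }
            best
        })
        .collect()
}

fn main() {
    let _ = per_scene_crfs("input.mkv", 95.0);
}
```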

But I'm not sure whether the resulting video chunks could then be concatenated without a second transcode. My knowledge doesn't go that far. 🤔

@alexheretic
Owner Author

What you describe sounds more like what Av1an does. So check out that project.

ab-av1 is aimed at being fast by sampling to find a suitable crf and then purely relying on the encoder to provide consistent quality. This issue is about an option to make the sampling "crf-search" even faster at the cost of a more pessimistic estimate of the full resultant VMAF.

@WhitePeter
Contributor

Just passing by while evaluating ab-av1, which is an interesting (accidental) find, BTW. Thanks a lot for the effort!

TL;DR

I would like to offer my two cents and suggest a different approach than the OP's. While it is correct not to bother checking any samples other than the worst from a previous iteration, I think one could use the opportunity to reshuffle the deck: keep that worst sample and (randomly) find replacements for the others. It is not unlikely that there are even worse ones!

Reasoning

Seeing how VMAF came about (10s clips), I think one should take the threshold as the lower bound for anywhere in the video. This means that averaging over multiple samples should not be done. I would not want a final encode with that one scene which is 10 or more points below the set threshold just because there were enough samples with metrics far enough above it. That might just be the fly in the ointment which spoils my video experience; I would rather waste bits elsewhere. FWIW, my initial tests suggest that the final encode tends to be below the desired threshold by a significant margin, but, as I said, my stint so far has been very brief.

Anyway, since I want to find the lower bound, it would be nice if ab-av1 hunted more aggressively for bad samples in subsequent iterations.

@alexheretic
Copy link
Owner Author

The current sampling is trying to predict the avg overall VMAF, so it is appropriate to take samples and average them. This should be close to the final resultant VMAF and closer as you take more samples. So if you believe your input to be quite volatile you should take more samples. I don't think it is possible to tell automatically without computationally expensive analysis.

You can already configure a full encode pass during crf-search, though this is quite slow. You could also combine that with harmonic mean VMAF (we should perhaps also try to support that better with sample aggregation too).
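For reference, the harmonic mean punishes low scores much harder than the arithmetic mean. A minimal sketch with made-up numbers:

```rust
// Harmonic mean: n / sum(1/x_i). A single low score drags it down
// far more than it drags down the arithmetic mean.
fn harmonic_mean(scores: &[f64]) -> f64 {
    scores.len() as f64 / scores.iter().map(|s| 1.0 / s).sum::<f64>()
}

fn main() {
    let scores = [97.0, 96.0, 70.0]; // made-up sample VMAFs, one bad
    let mean = scores.iter().sum::<f64>() / scores.len() as f64;
    println!("mean = {mean:.2}, harmonic = {:.2}", harmonic_mean(&scores));
    // mean = 87.67, harmonic = 85.69
}
```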

So this feature is just an additional option, more about speeding up the search at the cost of a pessimistic result. If you wanted to search for more & worse samples, perhaps you could configure a lower --sample-every setting and combine it with this proposed --sample-aggregate=worst. The first crf would analyse more 20s samples, then only the worst would count, and the other crf values would encode only that single sample (sketched below).

I don't clearly see a better general strategy than encoding evenly distanced samples though.
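In sketch form, the proposed flow might look like this (placeholder types and helpers, not actual internals; a real crf-search also picks crf values adaptively rather than from a fixed list):

```rust
// Illustrative sketch of the proposed flow, not actual ab-av1 code.
// Sample, encode_sample and vmaf_of are placeholders.
struct Sample;

fn encode_sample(_s: &Sample, _crf: u8) -> Vec<u8> { vec![] } // placeholder
fn vmaf_of(_encoded: &[u8]) -> f32 { 0.0 }                    // placeholder

/// First crf: encode every sample and find the worst one.
/// Later crf values: re-encode only that worst sample.
fn crf_search_worst(samples: &[Sample], crfs: &[u8]) -> Vec<(u8, f32)> {
    let scores: Vec<f32> = samples
        .iter()
        .map(|s| vmaf_of(&encode_sample(s, crfs[0])))
        .collect();
    let worst_idx = scores
        .iter()
        .enumerate()
        .min_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap();
    let mut results = vec![(crfs[0], scores[worst_idx])];
    for &crf in &crfs[1..] {
        results.push((crf, vmaf_of(&encode_sample(&samples[worst_idx], crf))));
    }
    results
}

fn main() {
    let samples = vec![Sample, Sample, Sample];
    let _ = crf_search_worst(&samples, &[28, 24, 26]);
}
```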

@WhitePeter
Contributor

WhitePeter commented Jul 19, 2024

Yes and no. Speed would stay the same with my suggestion, but the estimate would be more realistic since the approach is even more pessimistic: we have not found the worst sample yet, so keep looking.

Maybe have a look at the issue I just opened. It contains some more reasoning about VMAF. In short: I think the idea that it can be used on an entire movie or TV episode is flawed. The inherent averaging will hide bad scenes where, locally, VMAF might be far lower (by more than 10 points) than the overall average of the whole piece.
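A toy illustration with made-up numbers:

```rust
fn main() {
    // Made-up per-sample VMAF scores: one bad scene among good ones.
    let scores = [97.0_f32, 98.0, 96.0, 84.0];
    let mean = scores.iter().sum::<f32>() / scores.len() as f32;
    let worst = scores.iter().copied().fold(f32::INFINITY, f32::min);
    println!("mean = {mean:.2}, worst = {worst:.1}");
    // mean = 93.75 clears a 90 target comfortably, while the bad
    // scene sits almost 10 points below that average.
}
```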

I am also looking for a suitable CRF, but I want to avoid "killer scenes" where I might have to suffer bad compression artifacts. Those really spoil it for me. I want to upgrade my toolchain from x264, which basically uses a one-CRF-fits-all approach, to svt-av1-psy, and came across ab-av1, thinking it might give me some more dynamic decisions. BTW, I actually downgraded from x265 because it produced rare killer artifacts I just cannot tolerate, and I only realize when watching, after having spent that precious CPU time.

Anyway, I just stumbled in and saw an opportunity. If you disagree, that is perfectly fine with me. The original idea is a good one. I just thought that there is too much arbitrariness in selecting a sample that was chosen at a fixed interval and just happens to be the worst in a very small subset of the whole. A slight change in --sample-every might find the worst sample at a totally different location, even at the other end. But I won't press the issue any further. Consider these my last words on this matter. ;-)

But I maintain that averaging over multiple samples should not be done in any case.

@WhitePeter
Contributor

I am terribly sorry, but I just realized that I replied to the wrong comment: #202 (comment). I must have scrolled wrong. I do appreciate the actual reply to my comment very much and will digest your suggestions now.
