Thumbnail/screenshots with transcoding job #103

nipakke · 2024-11-02T03:20:26Z

A cool feature would be the ability to generate thumbnails/screenshots in transcoding jobs.
With fluent-ffmpeg it's as easy as calling the "screenshots" function on the fluent-ffmpeg command; more info is in the docs.

Suggested input format for configuring screenshots in a transcode job:

{
	// the usual transcode job data
	"screenshots": [
		{
			"timestamps": [30.5, "50%", "01:10.123"],
			"size": "?x1080" // any size string that fluent-ffmpeg accepts, see its docs
		},
		{
			"timestamps": ["50%"]
			// No size means the same size as the input video
		},
		// Alternative: give a count instead of timestamps, as in fluent-ffmpeg
		{
			// Will take screens at 20%, 40%, 60% and 80% of the video
			"count": 4,
			"size": "1280x720"
		}
	]
}

This example would generate 8 screenshots with different sizes taken at different times in the video.
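
A minimal sketch of how the timestamps in this proposal could be resolved to seconds once the video duration is known. The names `ScreenshotSpec`, `toSeconds` and `resolveSpec` are hypothetical and only mirror the JSON above; the `count` spacing follows the 20%/40%/60%/80% behaviour described for fluent-ffmpeg.

```typescript
// Hypothetical types mirroring the proposed JSON; not an existing API.
type Timestamp = number | string;

interface ScreenshotSpec {
  timestamps?: Timestamp[];
  count?: number;
  size?: string;
}

// Convert one timestamp (seconds, "NN%", or "[[hh:]mm:]ss[.xxx]") to seconds.
function toSeconds(value: Timestamp, duration: number): number {
  if (typeof value === "number") return value;
  if (value.endsWith("%")) {
    return (parseFloat(value.slice(0, -1)) / 100) * duration;
  }
  // Timecode: fold the colon-separated parts from left to right in base 60.
  return value.split(":").reduce((acc, part) => acc * 60 + parseFloat(part), 0);
}

// Expand one spec into absolute positions in seconds.
function resolveSpec(spec: ScreenshotSpec, duration: number): number[] {
  if (spec.timestamps) {
    return spec.timestamps.map((t) => toSeconds(t, duration));
  }
  const count = spec.count ?? 0;
  // Evenly spaced at (i + 1) / (count + 1) of the duration,
  // so count = 4 yields 20%, 40%, 60% and 80%.
  return Array.from({ length: count }, (_, i) => ((i + 1) / (count + 1)) * duration);
}
```

For the three entries in the example, this resolves 3 + 1 + 4 = 8 positions in total.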

Some things to consider:

- It needs a separate fluent-ffmpeg call, because as far as I understand ffmpeg can't produce screenshots and videos together while transcoding.
- Varying screenshot sizes will likely need multiple calls.
- It won't work with input streams (I don't think that's a problem if the original input is downloaded anyway).
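
To illustrate the point about multiple calls, here is a rough sketch that plans one call per distinct size. It assumes, as suggested above, that a single call can only produce one output size. `ScreenshotCall` and `planScreenshotCalls` are hypothetical names; the fields only echo the proposed input format and nothing here invokes ffmpeg.

```typescript
// Hypothetical types mirroring the proposed JSON; not an existing API.
type Timestamp = number | string;

interface ScreenshotSpec {
  timestamps?: Timestamp[] | undefined;
  count?: number | undefined;
  size?: string | undefined;
}

// One planned screenshot call: a single size with timestamps or a count.
interface ScreenshotCall {
  size?: string | undefined;
  timestamps?: Timestamp[] | undefined;
  count?: number | undefined;
}

function planScreenshotCalls(specs: ScreenshotSpec[]): ScreenshotCall[] {
  const bySize = new Map<string, Timestamp[]>();
  const calls: ScreenshotCall[] = [];

  for (const spec of specs) {
    if (spec.timestamps === undefined) {
      // Count-based specs cannot be merged with explicit timestamps.
      calls.push({ size: spec.size, count: spec.count });
      continue;
    }
    // The empty string stands for "same size as the input video".
    const key = spec.size ?? "";
    const merged = bySize.get(key) ?? [];
    merged.push(...spec.timestamps);
    bySize.set(key, merged);
  }

  // Specs sharing a size collapse into a single call.
  bySize.forEach((timestamps, key) => {
    calls.push({ size: key === "" ? undefined : key, timestamps });
  });
  return calls;
}
```

The example input above yields three planned calls, one per entry, since each entry uses a different size.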

matvp91 · 2024-11-02T07:55:17Z

I started an implementation resembling your proposal a couple of months ago, early on in the project, but I decided to drop it for the reasons I'll mention below. The goal is still to have thumbnails available in players. This could be a bit lengthy :)

A fundamental of Superstreamer is to rely on specification as much as possible, HLS in particular (I wrote some more info here: https://superstreamer.xyz/guide/what-is-superstreamer.html#core-standards).
There's no real spec around sprite generation for thumbnails; everyone does it their own way (e.g. JW Player relies on WebVTT - https://docs.jwplayer.com/players/docs/android-add-preview-thumbnails#preview-thumbnail-overview, YouTube ships a bunch of JSON with thumbnail references, AVPlayer relies on I-Frame only playlists, ...). We would be introducing yet another way.
Since we're into HLS CMAF only, I'd like to explore the idea of I-Frame only playlists. I think they're a perfect fit; let me elaborate:
- During transcode, we take the frame rate and segment size into account to insert a full frame at the start of each segment. E.g. if segmentSize is 4 at 25 fps, each segment holds 100 frames: one full frame (I-frame) followed by 99 partial frames (B/P-frames).
- During package, we create an I-Frame only playlist that tells players "there's a full frame available at x byterange".
- We never mux video and audio in the same container, so grabbing a frame means no audio frames are sent over the wire for no reason.
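
The frame arithmetic in the first point can be sketched as follows. This is only an illustration: `keyframeInterval` and `gopArgs` are hypothetical helpers, and the `-g`, `-keyint_min` and `-sc_threshold` options are one common way to request a fixed keyframe interval from ffmpeg with libx264, not necessarily what Superstreamer does.

```typescript
// Frames per segment = segment duration (seconds) * frame rate (fps).
function keyframeInterval(segmentSize: number, fps: number): number {
  return Math.round(segmentSize * fps);
}

// Assumed ffmpeg arguments for a fixed GOP so that every segment
// starts with a full frame (hypothetical helper).
function gopArgs(segmentSize: number, fps: number): string[] {
  const gop = String(keyframeInterval(segmentSize, fps));
  return [
    "-g", gop,            // maximum distance between keyframes
    "-keyint_min", gop,   // minimum distance between keyframes
    "-sc_threshold", "0", // no extra keyframes on scene changes
  ];
}

// segmentSize 4 at 25 fps: 100 frames, 1 full frame + 99 partial frames.
const framesPerSegment = keyframeInterval(4, 25);
const partialFrames = framesPerSegment - 1;
```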

#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:4
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-I-FRAMES-ONLY
#EXT-X-MAP:URI="init.mp4"
#EXTINF:3.440,
#EXT-X-BYTERANGE:26666@84
1.m4s
#EXTINF:2.240,
#EXT-X-BYTERANGE:93739@84
2.m4s
#EXTINF:1.320,
#EXT-X-BYTERANGE:19291@84
3.m4s
#EXT-X-ENDLIST
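
As a rough sketch, a playlist like the one above could be assembled during packaging from a list of I-frame byte ranges. `IFrameEntry` and `buildIFramePlaylist` are hypothetical names, and how the lengths and offsets are obtained from the segments is left out.

```typescript
// Hypothetical description of one I-frame inside a segment.
interface IFrameEntry {
  uri: string;      // segment file that holds the frame
  duration: number; // seconds covered until the next I-frame
  length: number;   // byte length of the I-frame
  offset: number;   // byte offset of the I-frame within the segment
}

function buildIFramePlaylist(
  mapUri: string,
  targetDuration: number,
  entries: IFrameEntry[],
): string {
  const lines: string[] = [
    "#EXTM3U",
    "#EXT-X-VERSION:6",
    `#EXT-X-TARGETDURATION:${targetDuration}`,
    "#EXT-X-PLAYLIST-TYPE:VOD",
    "#EXT-X-I-FRAMES-ONLY",
    `#EXT-X-MAP:URI="${mapUri}"`,
  ];
  for (const entry of entries) {
    lines.push(`#EXTINF:${entry.duration.toFixed(3)},`);
    // Byte range syntax is <length>@<offset>.
    lines.push(`#EXT-X-BYTERANGE:${entry.length}@${entry.offset}`);
    lines.push(entry.uri);
  }
  lines.push("#EXT-X-ENDLIST");
  return lines.join("\n");
}
```

Called with `init.mp4`, a target duration of 4 and the three entries shown above, it reproduces that playlist line for line.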

The person behind HLS.js told me they have I-Frame only playlist support planned for 2025, and since our player is basically an extended version of HLS.js, this will fit right in.

Let me know what you think about this.

Edit: on second thought, having a couple of frames extracted sounds like a good thing for use cases beyond player thumbnails, such as poster images, ...

nipakke · 2024-11-02T19:10:50Z

Thank you for your detailed response!
Exactly, the potential use case is thumbnails where there is no player and only a thumbnail image is shown, like on a YouTube channel page with multiple videos in a grid, each with its thumbnail.

Also, to easily identify the output files, the input could have a key or id field that is returned in the output data. An example can be found here. The key field is always returned unchanged; this might require a different input format where screenshots are defined one by one, so each can be given a unique key.
This unique key could also be applied to the output video streams for easy identification, but I think I will open another issue for that proposal.

matvp91 · 2024-11-03T13:19:44Z

Yes, sounds good. I'd see this as a separate job that pushes thumbnails on S3, the same way segments are pushed with the transcode job.

Today the transcode job orchestrates a package job when the packageAfter flag is set to true, but I doubt that yet another job orchestrated by transcode fits in. The bigger plan is to build a media pipeline "sort of" job that orchestrates what happens underneath (that work is planned soonish). I'll pick up your feature when the latter is done.

matvp91 · 2024-11-16T22:18:23Z

@nipakke letting you know that I started the groundwork for thumbnail generation in #116. This should land shortly and will be integrated into the new pipeline API as well (which functions as an orchestrator for transcode, package, thumbnail generation, ...).
