Attach a video file and use natural language to describe what you want to do to it.
Important
This only works in Chrome Dev+ with the built-in AI features enabled.
How to enable built-in AI
- Install Chrome Dev: Ensure you have version 127. [Download Chrome Dev](https://google.com/chrome/dev/).
- Check that you’re on 127.0.6512.0 or above
- Enable two flags:
  - chrome://flags/#optimization-guide-on-device-model - set to Enabled BypassPerfRequirement
  - chrome://flags/#prompt-api-for-gemini-nano - set to Enabled
- Relaunch Chrome
- Navigate to chrome://components
- Check that Optimization Guide On Device Model is downloading, or force the download if it is not. It might take a few minutes for this component to even appear.
- Open DevTools and type `(await ai.languageModel.capabilities()).available` in the console. It should return "readily" when everything is set up.
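The same check can be done from application code before showing the UI. This is a minimal sketch, not the project's actual code: `isModelReady` and the `PromptApi` type are hypothetical names, and the API shape is assumed only from the DevTools expression above (the Prompt API is experimental and may change).

```typescript
// Hypothetical minimal type: only the part of the experimental Prompt API
// used in the DevTools check above. The real API surface may differ.
interface PromptApi {
  languageModel: {
    capabilities(): Promise<{ available: string }>;
  };
}

// Returns true only when the on-device model reports "readily".
// Returns false if the API is missing or the call fails.
async function isModelReady(ai: PromptApi | undefined): Promise<boolean> {
  if (!ai || !ai.languageModel) {
    return false;
  }
  try {
    const caps = await ai.languageModel.capabilities();
    return caps.available === "readily";
  } catch {
    return false;
  }
}
```

In the browser this could be called as `isModelReady((globalThis as any).ai)`. Passing the object in, rather than reading the global inside the function, keeps the check easy to exercise without Chrome.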
npm install
npm run dev
Using the Prompt API, we take a natural language query and map it to the closest matching key in a Map that holds the actual ffmpeg commands.
E.g. "Turn into gif" -> Gemini Nano -> "Convert video to GIF"
Looking up that string returns:
{
"Convert video to GIF": [
"-i",
"{{input}}",
"-vf",
"fps=10,scale=320:-1:flags=lanczos",
"-c:v",
"gif",
"-y",
"{{name}}.gif",
],
}
The input and output file names, which are stored when you attach a file, are then interpolated into this command to produce the actual arguments passed to ffmpeg.wasm.
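The lookup and interpolation steps can be sketched roughly as follows. This is an illustration rather than the project's implementation: `COMMANDS`, `buildArgs`, and the file name `clip.mp4` are hypothetical, and the sketch assumes `{{name}}` stands for the input file name without its extension.

```typescript
// Hypothetical command table, keyed by the label the model returns.
const COMMANDS = new Map<string, string[]>([
  [
    "Convert video to GIF",
    [
      "-i", "{{input}}",
      "-vf", "fps=10,scale=320:-1:flags=lanczos",
      "-c:v", "gif",
      "-y", "{{name}}.gif",
    ],
  ],
]);

// Looks up the template for a label and fills in the placeholders.
function buildArgs(label: string, inputFile: string): string[] {
  const template = COMMANDS.get(label);
  if (template === undefined) {
    throw new Error(`No ffmpeg command registered for "${label}"`);
  }
  // Assumption: {{name}} is the input file name minus its extension.
  const dot = inputFile.lastIndexOf(".");
  const name = dot > 0 ? inputFile.slice(0, dot) : inputFile;
  // Function replacers avoid special handling of "$" in file names.
  return template.map((arg) =>
    arg
      .replace(/\{\{input\}\}/g, () => inputFile)
      .replace(/\{\{name\}\}/g, () => name)
  );
}

const args = buildArgs("Convert video to GIF", "clip.mp4");
console.log(args.join(" "));
```

The resulting array is in the form ffmpeg.wasm expects for its arguments, so it could be handed over once the input file has been written to the virtual file system.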
Right now this tool only does simple operations; you can see exactly what it can do in the AI class.
Working on large video files is pretty slow. There is a multi-threaded version of ffmpeg.wasm, and it can also support WORKERFS mounting instead of the default MEMFS, but I haven't explored that yet.