ryanseddon/FFprompt

FFprompt

Attach a video file and use natural language to describe what you want to do to it.

Important

This only works in Chrome Dev+ with the built-in AI features enabled.

How to enable built-in AI
  1. Install Chrome Dev: Ensure you have version 127 or later. [Download Chrome Dev](https://google.com/chrome/dev/).
  2. Check that you’re on 127.0.6512.0 or above
  3. Enable two flags:
    • chrome://flags/#optimization-guide-on-device-model - BypassPerfRequirement
    • chrome://flags/#prompt-api-for-gemini-nano - Enabled
  4. Relaunch Chrome
  5. Navigate to chrome://components
  6. Check that Optimization Guide On Device Model is downloading, or force the download if it is not. It might take a few minutes for this component to even appear.
  7. Open dev tools and run (await ai.languageModel.capabilities()).available. It should return "readily" once everything is set up.
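
The check in step 7 can also be wrapped in a small function, which is handy if a page wants to show a helpful message instead of failing. This is only a minimal sketch: `builtInAIStatus` and the "unsupported" return value are hypothetical names introduced here, and the only call assumed to exist is the `ai.languageModel.capabilities()` call from step 7.

```javascript
// Hypothetical helper (not part of FFprompt): reports whether the built-in
// model is usable. `ai` defaults to the global that Chrome exposes once the
// flags above are enabled.
async function builtInAIStatus(ai = globalThis.ai) {
  // Without the flags (or in other browsers) the global is missing entirely.
  // "unsupported" is a label made up for this sketch.
  if (!ai || !ai.languageModel) {
    return "unsupported";
  }
  // Same call as in step 7; "readily" means the model is ready to use.
  const { available } = await ai.languageModel.capabilities();
  return available;
}
```

In dev tools, `builtInAIStatus().then(console.log)` should print "readily" when everything is set up. Any other value suggests the model is not ready yet.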

Screenshot of FFprompt UI

Run locally

npm install
npm run dev

How does it work

Using the Prompt API, we take a natural language query and map it to the closest match in a Map that holds the actual ffmpeg command, e.g.

"Turn into gif" -> Gemini Nano -> "Convert video to GIF"

Looking up that string in the Map returns:

{
  "Convert video to GIF": [
    "-i",
    "{{input}}",
    "-vf",
    "fps=10,scale=320:-1:flags=lanczos",
    "-c:v",
    "gif",
    "-y",
    "{{name}}.gif",
  ],
}

The {{input}} and {{name}} placeholders are then replaced with the input and output names stored when you attached the file, giving the actual file names to pass into ffmpeg.wasm.
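
The lookup and interpolation steps could be sketched roughly as follows. This is not the project's actual code: `commands`, `buildArgs` and `argsForQuery` are hypothetical names, the Map holds only the single entry shown above, and it assumes that {{name}} is the input file name without its extension and that `session` is any object with an async `prompt(text)` method.

```javascript
// Label -> ffmpeg argument template. Only the entry from the example above.
const commands = new Map([
  [
    "Convert video to GIF",
    ["-i", "{{input}}", "-vf", "fps=10,scale=320:-1:flags=lanczos",
     "-c:v", "gif", "-y", "{{name}}.gif"],
  ],
]);

// Hypothetical helper: fills the placeholders for an attached file.
function buildArgs(label, inputFileName) {
  const template = commands.get(label.trim());
  if (!template) {
    // The model may return a string that is not a key in the Map.
    throw new Error(`No ffmpeg command for label: ${label}`);
  }
  // Assumption: {{name}} is the file name with its extension removed.
  const dot = inputFileName.lastIndexOf(".");
  const name = dot > 0 ? inputFileName.slice(0, dot) : inputFileName;
  // Function replacers avoid special handling of "$" in file names.
  return template.map((arg) =>
    arg
      .replaceAll("{{input}}", () => inputFileName)
      .replaceAll("{{name}}", () => name)
  );
}

// Assumption: `session` exposes an async prompt(text) method that returns
// the closest label, e.g. "Turn into gif" -> "Convert video to GIF".
async function argsForQuery(session, query, inputFileName) {
  const label = await session.prompt(query);
  return buildArgs(label, inputFileName);
}
```

For a file called clip.mp4, `buildArgs("Convert video to GIF", "clip.mp4")` gives the template with "clip.mp4" as the input and "clip.gif" as the output. Throwing on an unknown label keeps an unexpected model response from reaching ffmpeg.wasm.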

Limitations

Right now this tool only does simple operations; you can see exactly what it can do in the AI class.

Working on large video files is pretty slow. There is a multi-threaded version of ffmpeg.wasm, and it can also support WORKERFS mounting instead of the default MEMFS, but I haven't explored that yet.