FFmpeg 8.0 adds Whisper support

11 days ago

FFmpeg 8.0 introduces a new audio filter named 'whisper' that transcribes audio using the Whisper model.
The filter requires the whisper.cpp library and can be enabled with './configure --enable-whisper'.
Key options include the model path, language (default 'auto'), queue size (default 3 seconds), GPU usage (enabled by default), and output format (text, srt, or json).
A VAD (Voice Activity Detection) model is also supported to improve transcription quality, with parameters such as the detection threshold and the minimum speech and silence durations.
Examples are provided for transcribing to an SRT file, sending JSON output to an HTTP service, and transcribing microphone input with VAD.
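
As a rough illustration of how the options above fit together, the following Python sketch assembles a 'whisper' filter string and an ffmpeg command line for the SRT case. The option names (model, language, queue, use_gpu, destination, format, vad_model, vad_threshold) are recalled from the FFmpeg filter documentation and are not stated in this brief, so treat them as assumptions and confirm them with `ffmpeg -h filter=whisper` on a build configured with --enable-whisper. The file names and the helper functions whisper_filter and transcribe_command are hypothetical. The script only builds and prints the command; it does not run ffmpeg.

```python
import shlex

# Characters that have special meaning in FFmpeg filtergraph syntax. This
# sketch refuses values containing them rather than attempting to escape.
_SPECIAL = set(":,;'\\[]")


def whisper_filter(model, language="auto", queue=3, use_gpu=True,
                   destination=None, fmt="text",
                   vad_model=None, vad_threshold=None):
    """Build a 'whisper' filter spec string.

    Option names are assumptions taken from the FFmpeg filter docs;
    defaults mirror the ones listed in the brief above.
    """
    opts = {
        "model": model,
        "language": language,
        "queue": queue,            # seconds of audio buffered per chunk
        "use_gpu": int(use_gpu),   # FFmpeg boolean options accept 0/1
        "format": fmt,             # text, srt, or json
    }
    if destination is not None:
        opts["destination"] = destination
    if vad_model is not None:
        opts["vad_model"] = vad_model
        if vad_threshold is not None:
            opts["vad_threshold"] = vad_threshold

    parts = []
    for key, value in opts.items():
        text = str(value)
        if _SPECIAL & set(text):
            raise ValueError(f"{key}={text!r} needs filtergraph escaping")
        parts.append(f"{key}={text}")
    return "whisper=" + ":".join(parts)


def transcribe_command(input_path, filter_spec):
    """Return an ffmpeg argv that runs the filter and discards the audio.

    -vn drops any video stream; '-f null -' throws away the filtered
    audio, since the transcript is written by the filter itself.
    """
    return ["ffmpeg", "-i", input_path, "-vn",
            "-af", filter_spec, "-f", "null", "-"]


if __name__ == "__main__":
    # Hypothetical model and file names, for illustration only.
    spec = whisper_filter("ggml-base.en.bin", language="en",
                          destination="output.srt", fmt="srt")
    print(shlex.join(transcribe_command("input.mp4", spec)))
```

Passing the argument list to subprocess.run would execute the transcription. Paths containing colons or other filtergraph special characters are rejected here on purpose, because correct escaping depends on how the command is invoked.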
