Mistralai/Voxtral-Mini-3B-2507 · Hugging Face
10 months ago
- #Multilingual
- #AI
- #Speech Recognition
- Voxtral Mini 1.0 (3B) is an enhanced version of Ministral 3B with advanced audio input capabilities.
- It excels in speech transcription, translation, and audio understanding.
- Key features include dedicated transcription mode, long-form context (32k tokens), built-in Q&A and summarization, multilingual support, and function-calling from voice.
- Benchmark results show its performance in audio and text tasks.
- Usage is supported with frameworks like vLLM, with specific installation and setup instructions.
- Examples provided for audio instruct and transcription capabilities using Python snippets.
- Voxtral-Mini-3B-2507 requires ~9.5 GB of GPU RAM for operation.