GitHub - jamiepine/voicebox: The open-source voice synthesis studio

3 hours ago
  • #local-first
  • #voice-synthesis
  • #open-source
  • Voicebox is an open-source, local-first voice cloning studio that operates entirely on your machine, offering an alternative to services like ElevenLabs.
  • It supports cloning voices from short audio samples, generating speech in 23 languages using 5 TTS engines, and applying post-processing audio effects.
  • Features include complete privacy (data stays local), expressive speech tags (e.g., [laugh], [sigh]), unlimited text length with auto-chunking, a multi-voice timeline editor for stories, and a REST API for integration.
  • Runs on macOS, Windows, and Linux (with Docker), and supports several hardware backends (MLX, CUDA, ROCm, DirectML, CPU).
  • Includes 8 audio effects (pitch shift, reverb, delay, etc.), generation versioning with provenance tracking, non-blocking generation queues, voice profile management, and in-app recording/transcription.
  • Built with Tauri (Rust), React, and FastAPI; uses models such as Qwen3-TTS, LuxTTS, Chatterbox, and TADA.
  • Future plans include real-time streaming, voice design from text, more models, plugin architecture, and a mobile companion app.
  • Open to contributions on GitHub, with detailed setup instructions (commands are run with 'just'); released under the MIT license.
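
The summary mentions unlimited text length with auto-chunking but does not say how Voicebox splits the text. As a rough illustration of the general idea only, here is a minimal Python sketch that packs whole sentences into chunks under a character limit, so each chunk could be sent to a TTS engine separately. The function name `chunk_text`, the `max_chars` parameter, and the sentence-boundary heuristic are all assumptions made for this example; they are not taken from the Voicebox codebase.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Hypothetical helper for illustration; not Voicebox's actual chunker.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the last space
        # that fits, or mid-word if it contains no space at all.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars
            piece, sentence = sentence[:cut], sentence[cut:].lstrip()
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece)
        # Greedily pack sentences into the current chunk while they fit.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("Hello there. This is a test! Is it working? Yes.", 30):
        print(repr(chunk))
```

Splitting at sentence boundaries is a common choice because it keeps natural pauses at the seams between generated audio segments; a real implementation would probably also need to keep expressive tags such as [laugh] attached to the sentence they belong to.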