Neutts-air – open-source, on-device TTS
4 days ago
- #TTS
- #VoiceCloning
- #OnDeviceAI
- NeuTTS Air is billed as the world’s first super-realistic, on-device TTS speech language model with instant voice cloning.
- Built on a 0.5B-parameter LLM backbone, it offers natural-sounding speech, real-time performance, and built-in security.
- Features include best-in-class realism, optimization for on-device deployment, and instant voice cloning from just 3 seconds of audio.
- Supported languages include English. A neural audio codec (NeuCodec) provides high-quality audio at low bitrates.
- Available in GGUF format (the model file format of the GGML ecosystem) for efficient on-device inference, with real-time generation on mid-range devices.
- Installation requires espeak and Python dependencies, with optional support for GGUF models and the ONNX decoder.
- NeuTTS Air requires a reference audio sample and a text string, and synthesizes the text as speech in the style of the reference.
- Tips for best performance include using GGUF model backbones, pre-encoding references, and using the ONNX codec decoder.
- Generated audio files include a perceptual watermark for security and ethical use.
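
The "pre-encoding references" tip can be illustrated with a small caching pattern: encode each reference clip once, store the result on disk, and reuse it for every later synthesis call. The sketch below is a minimal illustration under stated assumptions, not NeuTTS Air's actual API. `ReferenceCache` and `file_digest` are hypothetical names, and `encode_fn` is a stand-in for whatever function the library exposes to turn a reference audio file into codec codes.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable


def file_digest(path: Path) -> str:
    """Return a SHA-256 hex digest of the file contents, used as the cache key."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class ReferenceCache:
    """Encode each reference clip once and reuse the stored result afterwards."""

    def __init__(self, encode_fn: Callable[[Path], Any], cache_dir: Path) -> None:
        # encode_fn is a hypothetical stand-in for the model's reference encoder;
        # it is an assumption of this sketch, not a documented library call.
        self.encode_fn = encode_fn
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get(self, audio_path: Path) -> Any:
        """Return the encoded reference, running the encoder only on a cache miss."""
        cached = self.cache_dir / f"{file_digest(audio_path)}.pkl"
        if cached.exists():
            # Cache hit: skip the encoding step entirely.
            # Only load pickles from a cache directory you control.
            with cached.open("rb") as fh:
                return pickle.load(fh)
        # Cache miss: encode once and persist the result for later runs.
        codes = self.encode_fn(audio_path)
        with cached.open("wb") as fh:
            pickle.dump(codes, fh)
        return codes
```

Because the cache key is a hash of the file contents rather than the file name, replacing a reference clip with new audio triggers a fresh encode, while renaming or copying the same clip does not.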