KaniTTS
by KaniTTS Project / community (uses LiquidAI LFM2 backbone + NanoCodec)
Ultra-fast, expressive open-source TTS model optimized for edge and real-time use
See KaniTTS repo and docs
Summary
KaniTTS is an open-source text-to-speech system that combines an LFM2-based token generator with an efficient NanoCodec vocoder to deliver low-latency, high-quality speech. It targets real-time applications and edge deployments.
Features
- Two-stage architecture: LFM2-350M backbone for token generation + NanoCodec vocoder
- 22 kHz audio output, with the codec compressing speech to a bitrate of roughly 0.6 kbps
- Low resource footprint (≈2 GB VRAM) and fast runtime (community reports cite ≈1 s to generate 15 s of audio)
- Multilingual support (English + Arabic, Chinese, German, Korean, Spanish)
- Apache 2.0 license for broad use
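To put the latency and bitrate figures above in perspective, the following is a minimal back-of-the-envelope sketch. It does not call KaniTTS; the function names are made up for illustration, and the uncompressed baseline (16-bit mono PCM at a nominal 22,000 Hz) is an assumption, not something stated by the project.

```python
# Rough arithmetic on the figures quoted in the feature list.
# All helper names are hypothetical; nothing here uses the KaniTTS API.

def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Time spent synthesizing per second of audio; below 1.0 is faster than real time."""
    return synthesis_seconds / audio_seconds


def codec_bytes(audio_seconds: float, bitrate_kbps: float) -> float:
    """Size in bytes of an audio clip encoded at a constant bitrate."""
    return audio_seconds * bitrate_kbps * 1000 / 8


def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1) -> float:
    """Bitrate of uncompressed PCM (assumed baseline: 16-bit mono)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Community-reported figure: about 1 s to generate 15 s of audio.
rtf = real_time_factor(1.0, 15.0)

# 15 s of speech at roughly 0.6 kbps.
size = codec_bytes(15.0, 0.6)

# Compression relative to the assumed 16-bit mono PCM baseline.
ratio = pcm_bitrate_kbps(22_000) / 0.6

print(f"real-time factor: {rtf:.3f}")
print(f"encoded size for 15 s: {size:.0f} bytes")
print(f"compression vs. 16-bit PCM: about {ratio:.0f}x")
```

Under these assumptions the real-time factor comes out near 0.07, and a 15-second clip occupies only about 1.1 KB in codec tokens. Actual numbers depend on hardware, fork, and decoding settings.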
Superpowers
Real-time, production-grade TTS on consumer hardware; useful for voice assistants, live content, gaming, and agentic audio use cases.
Known limitations & notes
- New project with active improvements; benchmarks vary across forks and hardware
- Output quality depends on the training data and on how the model has been fine-tuned for a specific voice
Sources / notes:
- Community repositories and project docs.