KaniTTS

by KaniTTS Project / community (uses LiquidAI LFM2 backbone + NanoCodec)

Ultra-fast, expressive open-source TTS model optimized for edge and real-time use

See KaniTTS repo and docs

Summary

KaniTTS is an open-source text-to-speech system that combines an LFM2-based token generator with an efficient NanoCodec vocoder to deliver low-latency, high-quality speech. It targets real-time applications and edge deployments.

Features

  • Two-stage architecture: LFM2-350M backbone for token generation + NanoCodec vocoder
  • 22 kHz audio output; the codec compresses speech to roughly 0.6 kbps
  • Low resource footprint (≈2 GB VRAM) and fast runtime (≈1 s to generate 15 s of audio in community reports, a real-time factor of about 0.07)
  • Multilingual support (English, plus Arabic, Chinese, German, Korean, and Spanish)
  • Apache 2.0 license for broad use
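
The speed and compression figures above can be put in perspective with a little arithmetic. The sketch below is illustrative only and is not part of KaniTTS: the helper names are hypothetical, "22 kHz" is assumed to mean 22,050 Hz, and the uncompressed baseline is assumed to be 16-bit mono PCM. The timing input is the community-reported figure, not a measured benchmark.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of audio produced.

    Values below 1.0 mean the model generates audio faster than it plays back.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def compressed_bytes(bitrate_kbps: float, audio_seconds: float) -> float:
    """Size of the codec token stream in bytes (1 kbps = 1000 bits per second)."""
    return bitrate_kbps * 1000 * audio_seconds / 8


# Community-reported figure: about 1 s of compute for 15 s of audio.
rtf = real_time_factor(1.0, 15.0)
speedup = 1.0 / rtf

# Codec stream at ~0.6 kbps versus raw PCM for the same 15 s clip.
token_bytes = compressed_bytes(0.6, 15.0)
SAMPLE_RATE_HZ = 22_050   # assumption: "22 kHz" read as 22.05 kHz
BYTES_PER_SAMPLE = 2      # assumption: 16-bit mono PCM baseline
pcm_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * 15

print(f"RTF: {rtf:.3f} (about {speedup:.0f}x faster than real time)")
print(f"Codec stream: {token_bytes:.0f} bytes; raw PCM: {pcm_bytes} bytes")
print(f"Compression ratio: about {pcm_bytes / token_bytes:.0f}:1")
```

Under these assumptions, the backbone only has to emit on the order of a kilobyte of codec tokens for a 15-second utterance, which is one plausible reason the two-stage design suits low-latency use.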

Superpowers

Real-time, production-grade speech synthesis on consumer hardware, which makes it useful for voice assistants, live content, gaming, and agentic audio use cases.

Known limitations & notes

  • New project under active development; reported benchmarks vary across forks and hardware
  • Output quality depends on the training dataset and on fine-tuning for specific voices

Sources / notes:

  • Community repositories and project docs.