Voxtral

by Mistral AI

Open-weights text-to-speech model — multilingual, locally runnable ElevenLabs alternative

See https://huggingface.co/mistralai/Voxtral-4B-TTS-2603

Features

4B parameter TTS model — compact enough to run locally on consumer hardware
9+ language support — multilingual voice generation with natural-sounding output
Open weights — fully open; self-host for privacy and offline use
Voice agent compatible — designed for integration into voice agent pipelines
Hugging Face deployment — standard HF model format; straightforward local install

Superpowers

Voxtral is Mistral’s answer to ElevenLabs: a high-quality, multilingual TTS model released as open weights. Its key advantages over proprietary TTS services (ElevenLabs, OpenAI TTS) are cost and privacy: it runs locally with no per-character fees, and no data leaves your infrastructure. At 4B parameters, it can run on a modern GPU without relying on cloud inference. It is particularly valuable for developers building voice agents or content pipelines who want a production-quality voice layer without ongoing API costs or vendor lock-in.
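To make the "4B parameters on a modern GPU" point concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at common numeric precisions. The precisions listed are general conventions, not a statement about which formats Voxtral actually ships in, and the estimate ignores activations, audio buffers, and framework overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


PARAMS = 4e9  # "4B parameters", taken from the model description

# Bytes per parameter for common precisions.
# Assumption: generic conventions, not confirmed formats for this model.
PRECISIONS = {"fp32": 4, "bf16/fp16": 2, "int8": 1, "int4": 0.5}

for name, nbytes in PRECISIONS.items():
    print(f"{name:>10}: {weight_memory_gib(PARAMS, nbytes):.1f} GiB")
```

At 16-bit precision this works out to roughly 7.5 GiB of weights, which is why a 4B model is plausible on a single consumer GPU, provided there is headroom for runtime overhead.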

Pricing

Open weights — free to self-host
API access through Mistral La Plateforme (check current pricing)
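
The trade-off between self-hosting and a paid API can be estimated with simple arithmetic. The sketch below computes the monthly character volume at which a fixed self-hosting cost equals a per-character API bill. Both prices are hypothetical placeholders, not Mistral's or any other vendor's actual rates; substitute current figures before drawing conclusions.

```python
def api_cost(chars: int, price_per_million_chars: float) -> float:
    """Cost of synthesizing `chars` characters through a metered API."""
    return chars / 1_000_000 * price_per_million_chars


def breakeven_chars(fixed_monthly_cost: float, price_per_million_chars: float) -> float:
    """Monthly character volume at which self-hosting matches the API bill."""
    return fixed_monthly_cost / price_per_million_chars * 1_000_000


# Hypothetical placeholder values, NOT real pricing.
API_PRICE_PER_MILLION = 20.0   # currency units per 1M characters
SELF_HOST_MONTHLY = 100.0      # GPU amortization + power, per month

volume = breakeven_chars(SELF_HOST_MONTHLY, API_PRICE_PER_MILLION)
print(f"Break-even volume: {volume:,.0f} characters/month")
print(f"API cost at that volume: {api_cost(int(volume), API_PRICE_PER_MILLION):.2f}")
```

Below the break-even volume a metered API is cheaper; above it, self-hosting wins on cost, in addition to the privacy benefits.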

ThirdBrAIn.tech
