VoxCPM

Open Source, Free Trial

Tokenizer-free open-source text-to-speech with voice cloning across 30 languages

VoxCPM is an open-source text-to-speech model from OpenBMB that generates speech directly from continuous representations, skipping the discrete token step most TTS systems use. It supports 30 languages without language tags, voice cloning from a short reference clip, and voice design from natural-language descriptions.

The model is a 2B-parameter diffusion autoregressive architecture that produces 48 kHz audio and runs faster than real time: its real-time factor (RTF, synthesis time divided by audio duration) is around 0.3 on an RTX 4090, with roughly 8 GB of VRAM needed. It is released under Apache 2.0, so it can be self-hosted for personal or commercial use without per-character fees. A hosted browser playground is also available for trying it without local setup.
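To make the RTF figure concrete, here is a minimal sketch of the arithmetic. It assumes the usual definition of real-time factor (synthesis time divided by the duration of the generated audio) and uses the approximate 0.3 value quoted above. The function names are hypothetical and are not part of VoxCPM; actual timings will vary with hardware, text length, and settings.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return RTF = time spent synthesizing / duration of audio produced.

    Values below 1.0 mean the model generates audio faster than it plays back.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def estimated_synthesis_seconds(audio_seconds: float, rtf: float = 0.3) -> float:
    """Estimate the compute time for a clip of the given length.

    The default rtf of 0.3 is the approximate RTX 4090 figure from the listing,
    used here as an assumption rather than a guaranteed value.
    """
    return audio_seconds * rtf


if __name__ == "__main__":
    clip_length = 60.0  # one minute of generated speech
    estimate = estimated_synthesis_seconds(clip_length)
    print(f"Estimated synthesis time for {clip_length:.0f} s of audio: {estimate:.1f} s")
```

Under these assumptions, a one-minute clip would take on the order of 18 seconds to generate, which is why an RTF below 1.0 is described as real-time capable.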

Pricing: Free (open source); hosted playground uses one-time credit packs
