Audio Pricing Comparison

20 providers compared by pricing model, free tiers, hosting options, and headquarters. Last updated July 2026.

18 with free tiers · 3 open source · 3 self-hostable · 4 European

Provider	Pricing Model	Starting Price	Free Tier	Hosting	Open Source	HQ
AssemblyAI	—	—	✓	—	—	🇺🇸 United States
Cartesia	—	—	✓	—	—	🇺🇸 United States
Cekura	—	—	✓	—	—	🇺🇸 United States
Deepgram	Pay-per-use	$0.0077/min	✓	Cloud + Self-hosted	—	🇺🇸 United States
Eleven Labs	Freemium	$5/mo	✓	Cloud	—	🇺🇸 United States
Fish Audio	—	—	✓	—	✓	🇺🇸 United States
Gladia	Pay-per-use	$0.61/hr	✓	Cloud	—	🇫🇷 France
Hume AI	—	—	✓	—	—	🇺🇸 United States
LMNT	—	—	✓	—	—	🇺🇸 United States
LemonFox	Subscription	$5/mo	✓	Cloud	—	🇩🇪 Germany
LiveKit Agents	—	—	✓	—	✓	—
MusicGPT	—	—	✓	—	—	—
OpenAI	Pay-per-use	$0.05/1M tokens	✓	Cloud	—	🇺🇸 United States
Resemble AI	Pay-per-use	~$0.01/sec	✓	Cloud + Self-hosted	—	🇺🇸 United States
Rime AI	Freemium	Free	✓	—	—	🇺🇸 United States
Samtal	—	—	—	Cloud	—	🇸🇪 Sweden
SpeechifyAI	—	$10/mo (1M chars included)	✓	Cloud	—	🇺🇸 United States
Speechmatics	Freemium	$0.24/hr	✓	Cloud + Self-hosted	—	🇬🇧 United Kingdom
Suno	—	—	—	—	—	🇺🇸 United States
VoxCPM	—	—	✓	—	✓	—

ℹ️ Pricing units vary by provider: the starting prices above are quoted per minute, per hour, per second, per token, or per month, so convert them to a common unit before comparing. Verify current rates on each provider's website.
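As a rough illustration, the time-based starting prices in the table can be converted to a single per-hour figure. This is a minimal sketch: the helper name `to_hourly` and the unit labels are assumptions made for the example, and the providers sell different products (speech-to-text versus voice generation), so the resulting numbers are not a like-for-like ranking.

```python
# Multipliers that turn a per-unit price into a per-hour price.
UNITS_PER_HOUR = {"sec": 3600, "min": 60, "hr": 1}


def to_hourly(price: float, unit: str) -> float:
    """Convert a price quoted per `unit` of audio into a price per hour."""
    return price * UNITS_PER_HOUR[unit]


# Starting prices copied from the table above (Resemble AI's is approximate).
starting_prices = {
    "Deepgram": (0.0077, "min"),
    "Gladia": (0.61, "hr"),
    "Resemble AI": (0.01, "sec"),
    "Speechmatics": (0.24, "hr"),
}

# Round to 4 decimals to hide floating-point noise.
hourly = {
    name: round(to_hourly(price, unit), 4)
    for name, (price, unit) in starting_prices.items()
}

# Print from cheapest to most expensive per hour of audio.
for name, rate in sorted(hourly.items(), key=lambda item: item[1]):
    print(f"{name}: ${rate:.3f}/hr")
```

Token-based and monthly subscription prices are left out on purpose, because converting them to hours needs usage assumptions the table does not provide.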

Providers with free tiers

These audio providers offer free credits, free tiers, or open-source self-hosting options to get started without upfront costs.

AssemblyAI

Speech-to-text APIs with audio intelligence, speaker diarization, and real-ti...

Cartesia

Real-time voice AI with ultra-low latency text-to-speech and voice cloning in...

Cekura

Testing and monitoring platform for AI voice and chat agents

Deepgram

Build Voice AI into your apps.

From: $0.0077/min

Eleven Labs

Natural Text to Speech & AI Voice Generator.

From: $5/mo

Fish Audio

Open-source text-to-speech and voice cloning with low latency in 13+ languages

Gladia

Fast speech-to-text API with real-time transcription and speaker diarization

From: $0.61/hr

Hume AI

Empathic voice AI that detects and responds to human emotion in real-time

LMNT

Low-latency text-to-speech API built for real-time conversational AI

LemonFox

Affordable speech-to-text and text-to-speech API with 100+ language support

From: $5/mo

LiveKit Agents

Open-source framework for building real-time voice and multimodal AI agents o...

MusicGPT

AI audio API for generating songs, speech, and sound, with stem splitting, vo...

OpenAI

API access to GPT, o-series reasoning, DALL-E, and Whisper models

From: $0.05/1M tokens

Resemble AI

Generative Voice AI built for Enterprise.

From: ~$0.01/sec

Rime AI

Text-to-speech API with 200+ voices, sub-200ms latency, and on-premise deploy...

From: Free

SpeechifyAI

Text-to-speech API with low-latency streaming, voice cloning, and 30+ locales

From: $10/mo (1M chars included)

Speechmatics

Enterprise speech-to-text API supporting 55+ languages with high accuracy

From: $0.24/hr

VoxCPM

Tokenizer-free open-source text-to-speech with voice cloning across 30 languages

Frequently asked questions

Which audio providers offer a free tier?

18 of the 20 audio providers listed offer a free tier or free credits. Examples: AssemblyAI, Cartesia, Cekura, Deepgram, Eleven Labs. See the Free Tier column in the table above for the full list.

Which audio providers are open source?

3 open-source options are listed. Examples: Fish Audio, LiveKit Agents, VoxCPM. Most can be self-hosted alongside or instead of any managed offering.

Are there European audio providers?

Yes. 4 providers in this category are headquartered in Europe: Gladia, LemonFox, Samtal, and Speechmatics. The European providers page has the full cross-category list with hosting regions.

Which audio providers can be self-hosted?

3 of the 20 listed support self-hosting, either as the primary deployment model or alongside a managed cloud offering: Deepgram, Resemble AI, and Speechmatics.

How to choose an inference API provider

The right provider depends on workload type, latency requirements, and budget. Most providers use pay-per-token pricing for LLMs and per-second GPU billing for custom models. Token-based pricing varies by model, so the cheapest provider for one model may not be cheapest for another.

Free tiers are useful for prototyping but often come with rate limits. For production, compare per-token costs for your specific model, cold start latency, rate limits, and whether the provider supports the models you need.
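To make the prototype-versus-production trade-off concrete, here is a minimal sketch of a monthly cost estimate for usage-based pricing. The function `monthly_cost` and the 10-hour free allowance are hypothetical, chosen only for illustration; the hourly rates are the starting prices from the table, and actual free-tier terms should be checked with each provider.

```python
def monthly_cost(hours_used: float, rate_per_hour: float, free_hours: float = 0.0) -> float:
    """Estimate monthly spend: only usage beyond the free allowance is billed."""
    billable_hours = max(hours_used - free_hours, 0.0)
    return round(billable_hours * rate_per_hour, 2)


# 100 hours of audio at Speechmatics' listed starting price, no free allowance.
speechmatics = monthly_cost(100, 0.24)

# 100 hours at Gladia's listed starting price with a HYPOTHETICAL 10 free hours.
gladia = monthly_cost(100, 0.61, free_hours=10)

# A small prototype that stays inside the hypothetical free allowance costs nothing.
prototype = monthly_cost(5, 0.61, free_hours=10)

print(speechmatics, gladia, prototype)
```

Running the same calculation at the expected production volume, not the prototype volume, shows whether a generous free tier still matters once usage grows.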

Teams with data residency requirements should check hosting options and provider headquarters. European providers like Gladia, LemonFox, and Samtal keep data within EU jurisdiction. See the full European AI Infrastructure directory. Self-hostable options like Deepgram and Resemble AI give full control over data location.
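The shortlist described above can be sketched as a simple filter over the Hosting and HQ columns of the table. The `Provider` class and field names are assumptions made for this example, and only providers with a listed hosting option are included. Headquarters location alone does not confirm where data is processed, so treat the output as a starting point for further checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    hq: str            # headquarters country as listed in the table
    self_hosted: bool  # True when Hosting is "Cloud + Self-hosted"


# Rows copied from the comparison table (providers with a listed hosting option).
providers = [
    Provider("Deepgram", "United States", True),
    Provider("Eleven Labs", "United States", False),
    Provider("Gladia", "France", False),
    Provider("LemonFox", "Germany", False),
    Provider("OpenAI", "United States", False),
    Provider("Resemble AI", "United States", True),
    Provider("Samtal", "Sweden", False),
    Provider("Speechmatics", "United Kingdom", True),
]

# EU member states that appear in the table; the United Kingdom is not in the EU.
EU_COUNTRIES = {"France", "Germany", "Sweden"}

# Keep providers that are EU-headquartered or can be deployed on your own infrastructure.
shortlist = [p.name for p in providers if p.hq in EU_COUNTRIES or p.self_hosted]

print(shortlist)
```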

For a deeper analysis, read AI Inference API Providers Compared on the blog. Pricing changes frequently, so verify current rates on each provider's website.
