Fish Audio
Open-source text-to-speech and voice cloning with low latency in 13+ languages
Fish Audio provides AI-powered text-to-speech and voice cloning. Its FishAudio S1 model, ranked #1 on TTS-Arena2, generates natural, emotionally rich speech from as little as 10-30 seconds of reference audio. It supports 13+ languages with latency under 150 ms. The core Fish Speech model is open source, with 25k+ GitHub stars. API pricing is pay-as-you-go at approximately $15 per 1M UTF-8 bytes of input text. The free tier includes 8,000 credits per month for personal use.
Pricing: Free / monthly subscriptions
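Because API usage is billed per UTF-8 byte rather than per character, non-ASCII text costs more per character than plain English. The sketch below is a rough estimator, not part of any Fish Audio SDK: the function names are hypothetical, and the rate is the approximate $15 per 1M bytes quoted above, which may not match current pricing.

```python
# Rough cost estimator for byte-based TTS pricing.
# Assumption: ~$15 per 1,000,000 UTF-8 bytes, as quoted in the listing above.
RATE_USD_PER_MILLION_BYTES = 15.0


def utf8_byte_count(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def estimate_cost_usd(text: str, rate: float = RATE_USD_PER_MILLION_BYTES) -> float:
    """Estimate the cost in USD of synthesizing `text` at the given rate."""
    return utf8_byte_count(text) * rate / 1_000_000


if __name__ == "__main__":
    for sample in ("Hello, world!", "你好，世界！"):
        print(sample, utf8_byte_count(sample), f"${estimate_cost_usd(sample):.6f}")
```

ASCII characters take one byte each, while most CJK characters take three, so a Chinese script of the same character count would cost roughly three times as much as an English one under this assumed rate.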
Fish Audio Alternatives
Eleven Labs
Natural Text to Speech & AI Voice Generator.
LemonFox
Affordable speech-to-text and text-to-speech API with 100+ language support
OpenAI
API access to GPT, o-series reasoning, DALL-E, and Whisper models