Geodd
Managed AI inference endpoints and GPU infrastructure with OpenAI-compatible API
Geodd is an AI inference platform offering serverless endpoints, dedicated inference, and GPU clusters for production workloads. The API is OpenAI SDK-compatible, so switching providers requires changing one line of code. Geodd applies inference optimizations at the model and runtime layers (custom CUDA kernels, disaggregated prefill/decode, KV cache routing, FP8/FP4 quantization) to increase throughput without hardware upgrades. The platform claims 25-50% more throughput on existing GPU fleets and 2-3x faster generation via adaptive speculative decoding. Primary region is North America East (500+ GPUs), with EU and APAC regions coming.
Pricing: Per token usage
Geodd Alternatives
Explore 76 products in the Inference APIs category. View all Geodd alternatives.
Lyceum
European GPU cloud for serverless inference, training, and on-demand GPU clusters
vLLM
High-throughput LLM inference engine with PagedAttention for efficient GPU memory usage
Work on Geodd? Feature it at the top of Inference APIs.
Is your product missing?