Drop-in compatible with the OpenAI SDK
import OpenAI from "openai"

const openai = new OpenAI({
  baseURL: "https://api.hicap.ai/v1",
  defaultHeaders: {
    "api-key": process.env.HICAP_API_KEY
  }
})

const response = await openai.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello!" }]
})

No new SDK to install, no code to rewrite. If your tool supports OpenAI-compatible endpoints, it already supports Hicap.
Sign up for free and grab your API key from the dashboard.
Point your OpenAI SDK, CLI tool, or extension to api.hicap.ai/v1.
Every request is routed through reserved capacity — same models, lower cost.
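For tools that do not use the OpenAI SDK, the same request can be assembled by hand. The sketch below is illustrative only: `buildChatRequest` is a hypothetical helper, and it assumes Hicap serves the standard OpenAI-compatible `/chat/completions` path under the base URL and accepts the `api-key` header shown in the SDK example.

```typescript
// Assumption: the endpoint path follows the OpenAI-compatible convention
// (base URL + /chat/completions) and auth uses the "api-key" header.
const HICAP_BASE_URL = "https://api.hicap.ai/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options without sending anything.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): PreparedRequest {
  return {
    url: `${HICAP_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "api-key": apiKey },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (makes a real network call, so it is left commented out):
// const { url, init } = buildChatRequest(
//   process.env.HICAP_API_KEY ?? "",
//   "gpt-5.4",
//   [{ role: "user", content: "Hello!" }]
// );
// const res = await fetch(url, init);
// console.log((await res.json()).choices[0].message.content);
```

Keeping request construction separate from sending makes it easy to check the URL and headers before any tokens are spent.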
We pool reserved GPU capacity across multiple cloud providers into a unified inference grid.
You get the speed of provisioned throughput with built-in redundancy and cost savings.
A unified network spanning OpenAI, Anthropic, and Google. Route requests to reserved capacity across multiple providers from a single API.
Save up to 25% vs pay-as-you-go pricing through bulk reserved GPU capacity.
Provisioned throughput delivers consistent performance for your workloads. No cold starts, no throttling.
Use the latest models, including GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro. View catalog →
Run ElevenLabs TTS and STT through the same API. No separate integration or billing path. Explore voice →
Track token usage, costs, and latency across all models. See exactly where your AI budget goes.
Works with curl, OpenAI SDK, or any OpenAI-compatible tool. Just change the base URL.
Your requests are load-balanced across multiple providers for redundancy and high availability.
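If you want a client-side tally alongside the usage tracking described above, OpenAI-style chat completion responses carry a `usage` object with `prompt_tokens` and `completion_tokens`. The sketch below sums those per model. `tallyUsage` and its types are hypothetical names, and it assumes Hicap passes the `usage` field through unchanged.

```typescript
// Shape of the usage object in OpenAI-style chat completion responses.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Only the fields this sketch needs; real responses contain more.
interface CompletionLike {
  model: string;
  usage?: Usage;
}

interface ModelTotals {
  requests: number;
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical helper: sums token counts per model.
// Responses without a usage object are skipped, not counted as zero.
function tallyUsage(responses: CompletionLike[]): Record<string, ModelTotals> {
  const totals: Record<string, ModelTotals> = {};
  for (const r of responses) {
    if (!r.usage) continue;
    const t = totals[r.model] ?? { requests: 0, promptTokens: 0, completionTokens: 0 };
    t.requests += 1;
    t.promptTokens += r.usage.prompt_tokens;
    t.completionTokens += r.usage.completion_tokens;
    totals[r.model] = t;
  }
  return totals;
}
```

Collect each `response` returned by `openai.chat.completions.create` into an array and pass it to `tallyUsage` to see which models account for most of your tokens.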
Get dedicated GPU throughput, volume discounts, and priority support for your team. We'll tailor a plan to your usage.