import OpenAI from 'openai'

const HICAP_API_KEY = process.env.HICAP_API_KEY

const client = new OpenAI({
  baseURL: 'https://api.hicap.ai/v2',
  apiKey: HICAP_API_KEY,
  defaultHeaders: { 'api-key': HICAP_API_KEY },
})
const response = await client.chat.completions.create({
  model: 'gpt-5.2',
  messages: [
    { role: 'user', content: 'Hello, GPT!' },
  ],
  // Same API. 25% less cost.
})
- Daily coding with AI assistance, refactoring, and debugging
- Full-time AI pair programming across multiple projects
- Team of 5 developers using mixed models for client work
Call the standard Chat Completions endpoint and set model to whatever you want to run.
curl https://api.hicap.ai/v2/openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "api-key: $HICAP_API_KEY" \
  -d '{"model":"gpt-5","messages":[{"role":"user","content":"Hello"}]}'

We buy reserved GPU capacity in bulk, then let you tap into it on-demand.
You get the speed of provisioned throughput at a fraction of the cost.
Access the same models at a fraction of pay-as-you-go pricing through reserved GPU capacity.
Provisioned throughput means your requests skip the queue. No cold starts, no throttling.
GPT-5.2, Claude 4 Sonnet, Gemini 3.0 Flash—switch between providers with one line of code.
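One way to make that one-line switch concrete is a small routing helper: the request payload stays in the OpenAI Chat Completions shape and only the model string changes. This is a sketch, not part of the hicap SDK; the model identifiers below are taken from the list above, and the exact strings the gateway accepts may differ.

```typescript
// Hypothetical tier-to-model map; identifiers assumed from the page's
// model list, verify the exact strings against the gateway's docs.
const MODELS = {
  fast: 'gemini-3.0-flash',
  balanced: 'claude-4-sonnet',
  best: 'gpt-5.2',
} as const;

// Build the same OpenAI-style payload for any provider; swapping
// providers is just swapping the model field.
function buildRequest(tier: keyof typeof MODELS, prompt: string) {
  return {
    model: MODELS[tier],
    messages: [{ role: 'user' as const, content: prompt }],
  };
}
```

A call then looks like `client.chat.completions.create(buildRequest('fast', 'Hello'))`, with no other code changes per provider.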
Track token usage, costs, and latency across all models. See exactly where your AI budget goes.
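Assuming the gateway passes through the standard OpenAI `usage` object on each response (`prompt_tokens`, `completion_tokens`, `total_tokens`), per-request cost tracking can be sketched like this. The per-million-token rates passed in are placeholders, not hicap's actual pricing.

```typescript
// Standard OpenAI Chat Completions usage shape.
type Usage = {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
};

// Estimate a request's cost in dollars from per-million-token rates.
// inPerM/outPerM are placeholder prices, not hicap's real rates.
function estimateCost(usage: Usage, inPerM: number, outPerM: number): number {
  return (
    (usage.prompt_tokens / 1e6) * inPerM +
    (usage.completion_tokens / 1e6) * outPerM
  );
}
```

Summing `estimateCost(response.usage, ...)` per model is enough for a rough budget breakdown client-side, alongside whatever the dashboard reports.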
Works with existing OpenAI, Anthropic, and Google SDKs. Just change the base URL.
99.9% uptime SLA. Your requests are load-balanced across multiple providers for redundancy.
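The load balancing above is server-side; for belt-and-suspenders resilience, a client can also fall back across models itself. This is a generic pattern sketch, not a hicap feature: try each call in order and return the first success.

```typescript
// Generic client-side fallback (not part of the hicap API): run each
// thunk in order, return the first success, rethrow the last error
// if every candidate fails.
async function withFallback<T>(calls: Array<() => Promise<T>>): Promise<T> {
  let lastError: unknown;
  for (const call of calls) {
    try {
      return await call();
    } catch (err) {
      lastError = err; // remember the failure, try the next candidate
    }
  }
  throw lastError;
}

// e.g. withFallback([
//   () => client.chat.completions.create({ model: 'gpt-5.2', messages }),
//   () => client.chat.completions.create({ model: 'claude-4-sonnet', messages }),
// ])
```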