How Hicap Works

Same models you already use, routed through reserved GPU capacity so you pay less. Here's how it works.

STEP 01

Swap Your Base URL

Hicap is a drop-in replacement for the OpenAI API. Point your existing SDK, CLI tool, or extension at our endpoint and you're done.

  • No new SDK — works with the OpenAI client you already use
  • Access OpenAI, Anthropic, and Google models through one endpoint
  • Integrate in under five minutes with minimal code changes
Code Example
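import OpenAI from "openai";

// Point the standard OpenAI client at the Hicap endpoint.
const openai = new OpenAI({
  baseURL: "https://api.hicap.ai/v1",
  // The client throws at construction if it has no key (it otherwise falls back
  // to OPENAI_API_KEY). Passing the Hicap key here is an assumption.
  apiKey: process.env.HICAP_API_KEY,
  // Send the Hicap key in the "api-key" header.
  defaultHeaders: {
    "api-key": process.env.HICAP_API_KEY
  }
});

// Same call shape as before; only the base URL and key changed.
const response = await openai.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello!" }]
});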
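
Because every model sits behind one endpoint, switching providers should come down to changing the model string. The sketch below illustrates that idea without the SDK. It assumes Hicap exposes the OpenAI-compatible /chat/completions route under the base URL shown above and accepts the same "api-key" header; buildChatRequest is a hypothetical helper, and the model IDs are the ones listed in the usage dashboard on this page.

```typescript
// Assumption: OpenAI-compatible route under the Hicap base URL.
const HICAP_BASE_URL = "https://api.hicap.ai/v1";

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds one chat completion request for any model.
function buildChatRequest(model: string, prompt: string, apiKey: string): ChatRequest {
  return {
    url: `${HICAP_BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "api-key": apiKey, // same header the SDK example sends
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Only the model string differs between providers.
const models = ["gpt-5.4", "claude-opus-4.6", "gemini-3.0-flash"];
const requests = models.map((m) => buildChatRequest(m, "Hello!", "YOUR_HICAP_API_KEY"));
```

Each entry in requests can be handed to any HTTP client, for example fetch(req.url, { method: req.method, headers: req.headers, body: req.body }).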
STEP 02

We Route to Reserved GPUs

Your requests go first to GPU capacity we reserve across multiple providers. Traffic beyond your reserved throughput overflows to on-demand capacity, so requests keep flowing when you exceed your reservation.

  • Dedicated capacity means consistent, predictable performance
  • Automatic load balancing across providers
Request Flow

Your App → Hicap Gateway → Reserved or On-Demand

Reserved (dedicated to your app)
  • GPT-5.4: 3,400 TPM, 112% utilized (overflow)
  • Claude Opus 4.6: 2,100 TPM, 108% utilized (overflow)
  • Gemini 3.0 Flash: 5,000 TPM, 115% utilized (overflow)

On-Demand (overflow and additional models)
  • GPT-5.4: ACTIVE
  • Claude Opus 4.6: ACTIVE
  • Gemini 3.0 Flash: ACTIVE
  • GPT-4.1: STANDBY
  • Claude Sonnet 4.5: STANDBY
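
To make the overflow numbers concrete, here is a minimal sketch of the arithmetic behind a reserved pool running above 100% utilization. It is an illustration only, not Hicap's actual routing logic: splitTraffic is a hypothetical function, TPM is assumed to mean tokens per minute, and the 3,808 TPM demand figure is derived from the GPT-5.4 example (3,400 TPM reserved at 112% utilization).

```typescript
interface TrafficSplit {
  reservedTpm: number;    // served from reserved capacity
  overflowTpm: number;    // spills over to on-demand capacity
  utilizationPct: number; // demand as a percentage of the reservation
}

// Hypothetical helper: fill the reservation first, send the rest on-demand.
function splitTraffic(demandTpm: number, reservedCapTpm: number): TrafficSplit {
  if (reservedCapTpm <= 0) {
    throw new Error("reservedCapTpm must be positive");
  }
  const reservedTpm = Math.min(demandTpm, reservedCapTpm);
  return {
    reservedTpm,
    overflowTpm: demandTpm - reservedTpm,
    utilizationPct: Math.round((demandTpm / reservedCapTpm) * 100),
  };
}

// GPT-5.4 example: 3,400 TPM reserved, 3,808 TPM of demand.
const split = splitTraffic(3808, 3400);
// split is { reservedTpm: 3400, overflowTpm: 408, utilizationPct: 112 }
```

Under these assumptions, the 12% above the reservation (408 TPM) is the portion that would be served on-demand.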
STEP 03

See Where Every Dollar Goes

Get full visibility into token usage, costs, and model performance across dev tools and production apps — all in one dashboard.

  • Track usage by model, application, and team
  • Compare dev tooling vs production spend at a glance
  • Spot your most expensive models and optimize
Usage Insights

Coding Tools

Dev tooling with BYOK configuration.

$246.45

28.6M tokens

Tool    Model               Tier        Tokens   Cost
Codex   codex-mini-latest   On-Demand   18.2M    $118.30
Cline   claude-opus-4.5     On-Demand   10.4M    $128.15

Your App

Production workload with reserved capacity.

$1,704.00

10,500 TPM reserved + 66.4M on-demand tokens

Component    Model               Tier        Usage       Cost
Planner      gpt-5.4             Reserved    3,400 TPM   $720.00
Reasoning    claude-opus-4.6     Reserved    2,100 TPM   $480.00
Summarizer   gemini-3.0-flash    Reserved    5,000 TPM   $240.00
Tool Use     gpt-4.1             On-Demand   52M         $156.00
Classifier   claude-sonnet-4.5   On-Demand   14.4M       $108.00
Last used Mar 17, 4:30 PM
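
The dashboard totals above are plain sums over the per-model rows. The sketch below reproduces them from the figures on this page and picks out the most expensive line item. The UsageRow shape and helper names are hypothetical and do not reflect Hicap's data model or API; costs are kept in cents to avoid floating-point drift.

```typescript
type Tier = "Reserved" | "On-Demand";

interface UsageRow {
  group: string; // dashboard section
  name: string;
  model: string;
  tier: Tier;
  costCents: number;
}

// Rows copied from the Usage Insights example above.
const usageRows: UsageRow[] = [
  { group: "Coding Tools", name: "Codex", model: "codex-mini-latest", tier: "On-Demand", costCents: 11830 },
  { group: "Coding Tools", name: "Cline", model: "claude-opus-4.5", tier: "On-Demand", costCents: 12815 },
  { group: "Your App", name: "Planner", model: "gpt-5.4", tier: "Reserved", costCents: 72000 },
  { group: "Your App", name: "Reasoning", model: "claude-opus-4.6", tier: "Reserved", costCents: 48000 },
  { group: "Your App", name: "Summarizer", model: "gemini-3.0-flash", tier: "Reserved", costCents: 24000 },
  { group: "Your App", name: "Tool Use", model: "gpt-4.1", tier: "On-Demand", costCents: 15600 },
  { group: "Your App", name: "Classifier", model: "claude-sonnet-4.5", tier: "On-Demand", costCents: 10800 },
];

// Sum cost per dashboard section.
function totalsByGroup(rows: UsageRow[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const row of rows) {
    totals[row.group] = (totals[row.group] ?? 0) + row.costCents;
  }
  return totals;
}

// Find the single most expensive line item.
function mostExpensive(rows: UsageRow[]): UsageRow {
  if (rows.length === 0) throw new Error("no usage rows");
  return rows.reduce((max, row) => (row.costCents > max.costCents ? row : max));
}

const groupTotals = totalsByGroup(usageRows);
const topRow = mostExpensive(usageRows);
// groupTotals["Coding Tools"] is 24645 cents ($246.45)
// groupTotals["Your App"] is 170400 cents ($1,704.00)
// topRow.name is "Planner"
```

Grouping by model or tier instead of section is a matter of changing the key used in totalsByGroup.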

Why Choose Hicap?

Production-ready AI infrastructure with enterprise features built in.

Enterprise Security

We never store your prompts or completions. Your data passes through our gateway and is never retained.

Multi-Provider Flexibility

Access OpenAI, Anthropic, Google, and more through a single API endpoint. Switch models without changing providers.

Real-Time Analytics

Monitor usage, costs, and performance metrics in real time with detailed dashboards.

Flexible Pricing

Pay as you go with no commitments, or lock in reserved throughput for even deeper savings.

Ready to start saving?

Create an account, swap your base URL, and start paying less for the same models. Setup takes under five minutes.