Voice

Voice Models on the Same Hicap Workflow

Run ElevenLabs text-to-speech and speech-to-text through Hicap without adding a separate auth flow, endpoint surface, or billing path. Keep the same Hicap base URL and ship voice next to the rest of your AI stack.

Voice quickstart

Same base URL. Same auth header. Voice added cleanly.

The integration model stays simple: route requests through Hicap, keep using your Hicap key, and target the ElevenLabs-compatible voice endpoints you need.

1. Point requests at https://api.hicap.ai/v1.

2. Send your Hicap key in the api-key header.

3. Use ElevenLabs model IDs for TTS and STT requests.
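Put together, those three steps describe one request shape. A minimal Python sketch of assembling it (the helper name is illustrative, not an official SDK; sending is a plain HTTP POST):

```python
# Sketch: assemble the pieces of an ElevenLabs-style TTS request routed
# through Hicap. The helper is illustrative, not an official client.
HICAP_BASE_URL = "https://api.hicap.ai/v1"

def tts_request(voice_id: str, text: str, api_key: str,
                model_id: str = "eleven_v3"):
    """Return (url, headers, payload) for a text-to-speech call."""
    url = f"{HICAP_BASE_URL}/text-to-speech/{voice_id}"
    headers = {"Content-Type": "application/json", "api-key": api_key}
    payload = {"text": text, "model_id": model_id}
    return url, headers, payload

# POSTing `payload` as JSON to `url` with `headers` returns the audio bytes,
# e.g. with requests: requests.post(url, headers=headers, json=payload).content
```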

Text to speech: Eleven v3, expressive generation across 70+ languages
Long-form voice: Multilingual v2, stable multilingual output across 29 languages
Speech to text: Scribe v2, high-accuracy transcription with diarization support
Platform

What Stays the Same

The point of the voice route is consolidation, not a separate setup track. Teams already using Hicap should not have to think about voice as a second platform.

Same Hicap base URL

Keep requests on https://api.hicap.ai/v1 and authenticate with the same api-key header you already use for chat and other model traffic.

ElevenLabs request shape

Use the ElevenLabs-style voice paths and model IDs while routing traffic through Hicap instead of wiring up a separate voice integration.

One platform for AI + voice

Keep billing, access, and operational routing in one place whether your app is generating text, audio, or transcripts.

Text to Speech

Choose the Right Voice Model

Hicap currently exposes ElevenLabs' main speech generation models so teams can cover both expressive voice work and steadier long-form narration from one route.

Eleven v3

Expressive speech synthesis

Best fit when voice tone, character, and performance matter. ElevenLabs positions Eleven v3 as its most emotionally rich text-to-speech model.

70+ supported languages
Built for dynamic, expressive delivery
Multi-speaker dialogue support
Up to 5,000 characters per request
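The 5,000-character cap means long scripts need to be split before generation. A sketch of one way to chunk text at sentence boundaries so each request stays under the limit (the helper name and splitting heuristic are illustrative):

```python
# Sketch: split long text into chunks under Eleven v3's 5,000-character
# per-request limit, preferring to cut at sentence boundaries.
def chunk_text(text: str, limit: int = 5000) -> list[str]:
    chunks = []
    while len(text) > limit:
        # Find the last sentence-ending punctuation inside the window.
        cut = max(text.rfind(end, 0, limit) for end in (". ", "! ", "? "))
        if cut == -1:
            cut = limit      # no boundary found: hard cut at the limit
        else:
            cut += 1         # keep the punctuation with its chunk
        chunks.append(text[:cut].strip())
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be sent as its own text-to-speech request and the resulting audio segments concatenated.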

Eleven Multilingual v2

Stable long-form generation

A steadier option for narration, explainers, and multilingual production where consistency over longer passages matters more than theatrical range.

29 supported languages
Natural long-form generation
Consistent multilingual delivery
Up to 10,000 characters per request
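The two generation models suggest a simple selection rule based on the limits and strengths described above. A sketch (the helper name is illustrative; the model IDs are the ElevenLabs identifiers used elsewhere on this page):

```python
# Sketch: pick a TTS model_id from the constraints described above.
# Eleven v3 caps requests at 5,000 characters and favors expressive delivery;
# Multilingual v2 allows 10,000 and favors steadier long-form output.
def pick_tts_model(text: str, expressive: bool = False) -> str:
    if expressive and len(text) <= 5000:
        return "eleven_v3"
    if len(text) <= 10000:
        return "eleven_multilingual_v2"
    raise ValueError("Text exceeds the per-request limit; chunk it first.")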
Speech to Text

Transcription Models for Production Audio

Both Scribe models are available through Hicap, covering everything from broad language support to newer transcription features like speaker diarization and transcript cleanup.

Scribe v1

Broad language coverage

A straightforward speech-to-text option for turning recorded audio into searchable text across a wide language set.

90+ supported languages
Word-level timestamps
Audio and video transcription
Available through the same Hicap gateway

Scribe v2

Higher-accuracy transcription

The more capable transcription option for production workflows that need better recognition, speaker separation, and cleaner transcripts.

Keyterm prompting up to 1000 terms
Speaker diarization up to 32 speakers
Dynamic audio tagging
Optional transcript cleanup
Quickstart

Same Hicap URL. Same api-key header.

These examples keep the ElevenLabs endpoint shapes and model IDs while moving authentication and routing onto Hicap.

Text to Speech

Generate audio through Hicap

bash
curl --request POST \
  --url "https://api.hicap.ai/v1/text-to-speech/JBFqnCBsd6RMkjVDRZzb" \
  --header "Content-Type: application/json" \
  --header "api-key: $HICAP_API_KEY" \
  --data '{
    "text": "The first move is what sets everything in motion.",
    "model_id": "eleven_v3"
  }' \
  --output speech.mp3

Replace JBFqnCBsd6RMkjVDRZzb with the ElevenLabs voice ID you want to use. If you need a different response format, follow the ElevenLabs-compatible request options while keeping the Hicap base URL and auth header.

Speech to Text

Transcribe files through Hicap

bash
curl --request POST \
  --url "https://api.hicap.ai/v1/speech-to-text" \
  --header "api-key: $HICAP_API_KEY" \
  --form "file=@./meeting.mp3" \
  --form "model_id=scribe_v2"

Send audio or video as multipart form data and switch the model_id between scribe_v1 and scribe_v2 based on the transcription quality and feature set you need.
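In code, that switch can be a single flag. A sketch (the helper name is illustrative; the model IDs are the ones used in the quickstart):

```python
# Sketch: choose a Scribe model_id from the feature needs described above.
# Scribe v2 adds diarization, keyterm prompting, and transcript cleanup;
# Scribe v1 covers broad-language transcription.
def pick_stt_model(need_diarization: bool = False,
                   need_cleanup: bool = False) -> str:
    return "scribe_v2" if (need_diarization or need_cleanup) else "scribe_v1"
```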

Current Hicap voice coverage includes Eleven v3 and Eleven Multilingual v2 for generation, plus Scribe v1 and Scribe v2 for transcription. That keeps the voice surface focused and predictable while the rest of the Hicap model catalog remains available through the same account.

Ready to add voice to your stack?

Bring speech generation and transcription into the same Hicap workflow your team already understands.