The Starfish engine only works with Starfish-compatible voices, and not all HeyGen voices support it. Use GET /v3/voices?engine=starfish to list compatible voices before calling this endpoint.

Quick Example

curl -X POST "https://api.heygen.com/v3/voices/speech" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello from HeyGen!",
    "voice_id": "1bd001e7e50f421d891986aad5c8bbd2"
  }'
Response
{
  "data": {
    "audio_url": "https://files.heygen.ai/audio/req_xyz789.mp3",
    "duration": 2.4,
    "request_id": "req_xyz789",
    "word_timestamps": [
      { "word": "Hello", "start": 0.0, "end": 0.45 },
      { "word": "from", "start": 0.45, "end": 0.72 },
      { "word": "HeyGen!", "start": 0.72, "end": 1.35 }
    ]
  }
}
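
For callers who are not using curl, the same request can be sketched in Python with only the standard library. The function names `build_speech_request` and `synthesize` are hypothetical helpers, not part of any HeyGen SDK. The URL, headers, and body fields are taken from the Quick Example above, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request

SPEECH_URL = "https://api.heygen.com/v3/voices/speech"


def build_speech_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the Quick Example."""
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        SPEECH_URL,
        data=body,
        method="POST",
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
    )


def synthesize(text: str, voice_id: str, api_key: str) -> dict:
    """Send the request and return the "data" object from the JSON response."""
    request = build_speech_request(text, voice_id, api_key)
    # The timeout value is an assumption; tune it for your workload.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["data"]
```

Calling `synthesize("Hello from HeyGen!", voice_id, api_key)` with a valid key should return a dictionary shaped like the `data` object in the response above.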

Finding a Compatible Voice

Before calling this endpoint, find a Starfish-compatible voice_id:
curl -X GET "https://api.heygen.com/v3/voices?engine=starfish&language=English&gender=female" \
  -H "X-Api-Key: $HEYGEN_API_KEY"
See Browse Voices for full filtering and pagination details.
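
As a rough sketch, the query string for the voice lookup can be assembled programmatically so that filter values are URL-encoded correctly. `starfish_voices_url` is a hypothetical helper. It uses only the `engine`, `language`, and `gender` filters shown in the curl command above and makes no assumption about the shape of the response, which is described in Browse Voices.

```python
from urllib.parse import urlencode

VOICES_URL = "https://api.heygen.com/v3/voices"


def starfish_voices_url(language=None, gender=None):
    """Return the GET URL that lists Starfish-compatible voices."""
    params = {"engine": "starfish"}
    # Optional filters are only added when the caller supplies them.
    if language:
        params["language"] = language
    if gender:
        params["gender"] = gender
    return f"{VOICES_URL}?{urlencode(params)}"
```

Send the resulting URL with the same `X-Api-Key` header used for the speech endpoint.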

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| text | string | Yes | | Text to synthesize (1–5,000 characters). |
| voice_id | string | Yes | | A Starfish-compatible voice ID. |
| input_type | string | No | "text" | "text" for plain text or "ssml" for SSML markup. |
| speed | number | No | 1.0 | Speed multiplier (0.5–2.0). |
| language | string | No | auto-detected | Base language code (e.g. "en", "pt", "zh"). Auto-detected when omitted. |
| locale | string | No | | BCP-47 locale tag (e.g. "en-US", "pt-BR"). Overrides language when set. |
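
The constraints in the table can be checked on the client before a request is sent. The sketch below is one possible approach, not an official validator. `build_speech_payload` is a hypothetical helper, the length check counts Python code points (the API may count characters differently), and dropping `language` when `locale` is present is a client-side choice that mirrors the documented override.

```python
def build_speech_payload(text, voice_id, *, input_type=None, speed=None,
                         language=None, locale=None):
    """Validate arguments against the documented limits and build the JSON body."""
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be 1 to 5,000 characters")
    if not voice_id:
        raise ValueError("voice_id is required")
    if input_type is not None and input_type not in ("text", "ssml"):
        raise ValueError('input_type must be "text" or "ssml"')
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload = {"text": text, "voice_id": voice_id}
    # Optional fields are omitted when unset so the server applies its defaults.
    if input_type is not None:
        payload["input_type"] = input_type
    if speed is not None:
        payload["speed"] = speed
    if locale is not None:
        # locale overrides language, so language is not sent alongside it.
        payload["locale"] = locale
    elif language is not None:
        payload["language"] = language
    return payload
```

For example, `build_speech_payload("Olá!", voice_id, speed=1.2, locale="pt-BR")` yields a body with `text`, `voice_id`, `speed`, and `locale` only.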

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| audio_url | string | URL of the generated audio file. |
| duration | number | Duration of the audio in seconds. |
| request_id | string or null | Unique identifier for this generation request. |
| word_timestamps | array or null | Word-level timing data; each entry has word, start, and end in seconds. |
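
One common use of `word_timestamps` is building caption cues. The sketch below groups consecutive words into short cues and treats a null value as "no timing data". `group_words` and its `max_words` parameter are hypothetical and shown only to illustrate how the fields fit together.

```python
def group_words(word_timestamps, max_words=3):
    """Group word entries into caption cues of at most max_words words each."""
    if not word_timestamps:  # handles both null (None) and an empty array
        return []
    cues = []
    for i in range(0, len(word_timestamps), max_words):
        chunk = word_timestamps[i:i + max_words]
        cues.append({
            "text": " ".join(entry["word"] for entry in chunk),
            "start": chunk[0]["start"],  # start of the first word in the cue
            "end": chunk[-1]["end"],     # end of the last word in the cue
        })
    return cues
```

Passing the `word_timestamps` array from the Quick Example response with `max_words=2` produces two cues: "Hello from" (0.0 to 0.72) and "HeyGen!" (0.72 to 1.35).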

SSML Support

For finer control over pronunciation, pauses, and emphasis, set input_type to "ssml". Check support_pause on the voice object from GET /v3/voices to confirm the voice supports SSML break tags.
curl -X POST "https://api.heygen.com/v3/voices/speech" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<speak>Welcome to HeyGen. <break time=\"500ms\"/> Let us get started.</speak>",
    "voice_id": "1bd001e7e50f421d891986aad5c8bbd2",
    "input_type": "ssml"
  }'
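
When the SSML string is assembled from user-supplied text, characters such as `&` and `<` need escaping so the markup stays well-formed XML. The following sketch shows one way to do that. `build_ssml` is a hypothetical helper, and it assumes that `support_pause` on the voice object is a boolean and that break tags should simply be left out when it is false.

```python
from xml.sax.saxutils import escape


def build_ssml(segments, pause_ms=500, support_pause=True):
    """Join text segments into a <speak> document, with a break between segments."""
    # Escape &, <, and > so user text cannot break the markup.
    parts = [escape(segment) for segment in segments]
    if support_pause:
        separator = f' <break time="{pause_ms}ms"/> '
    else:
        # Assumption: fall back to a plain space when the voice lacks break support.
        separator = " "
    return "<speak>" + separator.join(parts) + "</speak>"
```

Pass the result as `text` with `input_type` set to "ssml". A JSON encoder such as `json.dumps` takes care of escaping the double quotes that the curl example escapes by hand.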