Text to Speech - HeyGen Documentation

Full schema: POST /v3/voices/speech.

Starfish only works with Starfish-compatible voices. Not all HeyGen voices support this engine. Use GET /v3/voices?engine=starfish — see Browse Voices — to get a list of compatible voices before calling this endpoint.

Free usage: authenticate the CLI with OAuth (heygen auth login --oauth) and calls to this endpoint draw from a free 600-second monthly text-to-speech quota instead of API credits. See Free usage for agents.

Quick Example

curl -X POST "https://api.heygen.com/v3/voices/speech" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello from HeyGen!",
    "voice_id": "1bd001e7e50f421d891986aad5c8bbd2"
  }'

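import os

import requests

# Read the API key from the environment, as the curl example does.
HEYGEN_API_KEY = os.environ["HEYGEN_API_KEY"]

resp = requests.post(
    "https://api.heygen.com/v3/voices/speech",
    headers={"X-Api-Key": HEYGEN_API_KEY},
    json={
        "text": "Hello from HeyGen!",
        "voice_id": "1bd001e7e50f421d891986aad5c8bbd2",
    },
)
resp.raise_for_status()  # Fail early on an HTTP error status.
data = resp.json()["data"]
print(data["audio_url"], data["duration"])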

const resp = await fetch("https://api.heygen.com/v3/voices/speech", {
  method: "POST",
  headers: {
    "X-Api-Key": process.env.HEYGEN_API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    text: "Hello from HeyGen!",
    voice_id: "1bd001e7e50f421d891986aad5c8bbd2",
  }),
});
const { data } = await resp.json();
console.log(data.audio_url, data.duration);

Response

{
  "data": {
    "audio_url": "https://files.heygen.ai/audio/req_xyz789.mp3",
    "duration": 2.4,
    "request_id": "req_xyz789",
    "word_timestamps": [
      { "word": "Hello", "start": 0.0, "end": 0.45 },
      { "word": "from", "start": 0.45, "end": 0.72 },
      { "word": "HeyGen!", "start": 0.72, "end": 1.35 }
    ]
  }
}

Finding a Compatible Voice

Before calling this endpoint, find a Starfish-compatible voice_id:

curl -X GET "https://api.heygen.com/v3/voices?engine=starfish&language=English&gender=female" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

See Browse Voices for full filtering and pagination details.
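The same query can be assembled in Python. This is a minimal sketch that only builds the request URL from the filters shown in the curl command above; the helper name `build_voices_url` is hypothetical, and the shape of the response is not assumed here (see Browse Voices for that).

```python
from urllib.parse import urlencode

VOICES_URL = "https://api.heygen.com/v3/voices"


def build_voices_url(language=None, gender=None):
    """Build a GET /v3/voices URL restricted to Starfish-compatible voices.

    Hypothetical helper: it only uses the query parameters shown in the
    curl example (engine, language, gender).
    """
    params = {"engine": "starfish"}
    if language:
        params["language"] = language
    if gender:
        params["gender"] = gender
    return f"{VOICES_URL}?{urlencode(params)}"


print(build_voices_url(language="English", gender="female"))
# -> https://api.heygen.com/v3/voices?engine=starfish&language=English&gender=female
```

The resulting URL can be passed to any HTTP client together with the `X-Api-Key` header.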

Parameters

Parameter	Type	Required	Default	Description
`text`	string	Yes	—	Text to synthesize (1–5,000 characters).
`voice_id`	string	Yes	—	A Starfish-compatible voice ID.
`input_type`	string	No	`"text"`	`"text"` for plain text or `"ssml"` for SSML markup.
`speed`	number	No	`1.0`	Speed multiplier (0.5–2.0).
`language`	string	No	auto-detected	Base language code (e.g. `"en"`, `"pt"`, `"zh"`). Auto-detected when omitted.
`locale`	string	No	—	BCP-47 locale tag (e.g. `"en-US"`, `"pt-BR"`). Overrides `language` when set.
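
Checking the documented limits on the client side can catch mistakes before a request is sent. The sketch below builds a request body from the parameters in the table; `build_speech_payload` is a hypothetical helper, and it assumes that both ranges are inclusive and that the character limit matches Python's `len()`.

```python
def build_speech_payload(text, voice_id, input_type="text", speed=1.0,
                         language=None, locale=None):
    """Build a JSON body for POST /v3/voices/speech (hypothetical helper).

    Assumes the documented ranges are inclusive and that the character
    limit is counted the same way as Python's len().
    """
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be 1 to 5,000 characters")
    if input_type not in ("text", "ssml"):
        raise ValueError('input_type must be "text" or "ssml"')
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload = {
        "text": text,
        "voice_id": voice_id,
        "input_type": input_type,
        "speed": speed,
    }
    # Optional fields are sent only when the caller sets them.
    if language is not None:
        payload["language"] = language
    if locale is not None:
        payload["locale"] = locale
    return payload


payload = build_speech_payload(
    "Hello from HeyGen!",
    "1bd001e7e50f421d891986aad5c8bbd2",
    speed=1.25,
    locale="en-US",
)
print(payload)
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the Quick Example.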

Response Fields

Field	Type	Description
`audio_url`	string	URL of the generated audio file.
`duration`	number	Duration of the audio in seconds.
`request_id`	string or null	Unique identifier for this generation request.
`word_timestamps`	array or null	Word-level timing data — each entry has `word`, `start`, and `end` in seconds.
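
One possible use of `word_timestamps` is to group words into caption cues. The sketch below is an illustration only: `caption_cues` and its `max_words` parameter are hypothetical, and the sample data is copied from the Response example above. Because the field can be null, the helper treats a missing or null value as an empty list.

```python
def caption_cues(data, max_words=3):
    """Group word-level timestamps into caption cues (hypothetical helper).

    `data` is the "data" object of the response. A null or missing
    word_timestamps field yields an empty list.
    """
    words = data.get("word_timestamps") or []
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        cues.append({
            "start": group[0]["start"],   # start of the first word in the cue
            "end": group[-1]["end"],      # end of the last word in the cue
            "text": " ".join(w["word"] for w in group),
        })
    return cues


sample = {
    "audio_url": "https://files.heygen.ai/audio/req_xyz789.mp3",
    "duration": 2.4,
    "request_id": "req_xyz789",
    "word_timestamps": [
        {"word": "Hello", "start": 0.0, "end": 0.45},
        {"word": "from", "start": 0.45, "end": 0.72},
        {"word": "HeyGen!", "start": 0.72, "end": 1.35},
    ],
}

for cue in caption_cues(sample, max_words=2):
    print(cue["start"], cue["end"], cue["text"])
```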

SSML Support

For finer control over pronunciation, pauses, and emphasis, set input_type to "ssml". To confirm that the selected voice supports SSML <break> tags, check the support_pause field on the voice object returned by GET /v3/voices. SSML also works with POST /v3/videos when generating videos from scripts, so you can control pacing and pauses directly within the narration.

curl -X POST "https://api.heygen.com/v3/voices/speech" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<speak>Welcome to HeyGen. <break time=\"0.5s\"/> Let us get started.</speak>",
    "voice_id": "1bd001e7e50f421d891986aad5c8bbd2",
    "input_type": "ssml"
  }'
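
When the script is assembled in code, the SSML string can be built from plain text segments. This is a rough sketch, not HeyGen's own tooling: `build_ssml` is a hypothetical helper, it assumes `support_pause` is a boolean read from the voice object, and it assumes that text should be XML-escaped as in standard SSML.

```python
from xml.sax.saxutils import escape


def build_ssml(segments, pause_seconds=0.5, supports_pause=True):
    """Join text segments into a <speak> document (hypothetical helper).

    Inserts a <break> between segments only when the voice supports it.
    `supports_pause` is assumed to mirror the voice's support_pause field.
    """
    # Escape &, < and > so that plain text cannot break the markup.
    escaped = [escape(segment) for segment in segments]
    if supports_pause:
        separator = f' <break time="{pause_seconds}s"/> '
    else:
        separator = " "
    return "<speak>" + separator.join(escaped) + "</speak>"


ssml = build_ssml(["Welcome to HeyGen.", "Let us get started."])
print(ssml)
# -> <speak>Welcome to HeyGen. <break time="0.5s"/> Let us get started.</speak>
```

Send the result as `text` with `input_type` set to `"ssml"`, as in the curl example above.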
