Synthesize speech audio from text using a specified voice. Returns a URL to the generated audio file along with duration and optional word-level timestamps.
HeyGen API key. Obtain from your HeyGen dashboard.
Request body for POST /v1/audio/text_to_speech.
Text to synthesize (1-5000 characters).
Voice ID to use. Discover available voices via GET /v1/audio/voices.
Type of the input: 'text' for plain text, 'ssml' for SSML markup. Defaults to 'text'.
Speed multiplier (0.5-2.0).
Base language code (e.g. 'en', 'pt', 'zh'). Optional; auto-detected from text when omitted.
BCP-47 locale tag (e.g. 'en-US', 'pt-BR'). When set, language is inferred from locale.
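The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the JSON field names (`text`, `voice_id`, `input_type`, `speed`, `language`, `locale`) and the helper `build_tts_payload` are assumptions for illustration, since this page lists the descriptions without the field names. Verify them against the actual schema before use.

```python
from typing import Any, Dict, Optional

# Input types named in the field description above.
ALLOWED_INPUT_TYPES = {"text", "ssml"}


def build_tts_payload(
    text: str,
    voice_id: str,
    input_type: str = "text",
    speed: Optional[float] = None,
    language: Optional[str] = None,
    locale: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body dict (hypothetical field names) and enforce the documented limits."""
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be 1-5000 characters")
    if not voice_id:
        raise ValueError("voice_id is required")
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError("input_type must be 'text' or 'ssml'")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload: Dict[str, Any] = {
        "text": text,
        "voice_id": voice_id,
        "input_type": input_type,
    }
    if speed is not None:
        payload["speed"] = speed
    # Design choice of this sketch: when a locale is given, the language is
    # left out, because the service infers language from the locale.
    if locale is not None:
        payload["locale"] = locale
    elif language is not None:
        payload["language"] = language
    return payload
```

For example, `build_tts_payload("Hello", "my-voice", speed=1.25, locale="en-US")` would produce a dict that can be serialized as the JSON body of POST /v1/audio/text_to_speech, sent together with the API key.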
Successful response
Response payload for POST /v1/audio/text_to_speech.
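The response is described as carrying an audio URL, a duration, and optional word-level timestamps. A rough sketch of reading those three values from a decoded JSON body follows. The key names `audio_url`, `duration`, and `word_timestamps` are hypothetical, as is the flat (non-nested) layout; the real payload may use other names or wrap the fields in an envelope.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class TtsResult:
    audio_url: str                                   # URL of the generated audio file
    duration: float                                  # duration as reported by the service
    word_timestamps: Optional[List[Dict[str, Any]]] = None  # absent unless returned


def parse_tts_response(body: Dict[str, Any]) -> TtsResult:
    """Extract the documented values from a decoded response (assumed key names)."""
    return TtsResult(
        audio_url=body["audio_url"],
        duration=float(body["duration"]),
        # Timestamps are optional, so a missing key maps to None.
        word_timestamps=body.get("word_timestamps"),
    )
```

Treating the timestamps as `None` when absent keeps callers from having to distinguish between a missing key and an empty list.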