Create speech audio from text

curl --request POST \
  --url https://api.heygen.com/v3/voices/speech \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "text": "<string>",
  "voice_id": "<string>",
  "input_type": "text",
  "speed": 1,
  "language": "<string>",
  "locale": "<string>"
}
'

Authorizations

x-api-key

string

header

required

HeyGen API key. Obtain from your HeyGen dashboard.
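As a minimal sketch of how the key travels with the request, the Python snippet below builds (but does not send) a POST request carrying the `x-api-key` header, using only the standard library. The helper name `build_speech_request` and the `HEYGEN_API_KEY` environment variable are assumptions made for illustration; the URL is the one shown in the curl example on this page.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example on this page.
SPEECH_URL = "https://api.heygen.com/v3/voices/speech"


def build_speech_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying the API key in the x-api-key header.

    Hypothetical helper: it only constructs the request and sends nothing.
    """
    if not api_key:
        raise ValueError("x-api-key is required")
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # HEYGEN_API_KEY is an assumed variable name; use whatever your setup provides.
    key = os.environ.get("HEYGEN_API_KEY", "<api-key>")
    req = build_speech_request(key, {"text": "Hello", "voice_id": "<string>"})
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; reading the key from the environment keeps it out of source control.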

Body

application/json

Request body for POST /v3/voices/speech.

text

string

required

Text to synthesize (1-5000 characters).

Required string length: 1 - 5000

voice_id

string

required

Voice ID to use. Discover available voices via GET /v1/audio/voices.

input_type

string

default: text

Type of the input: 'text' for plain text, 'ssml' for SSML markup. Defaults to 'text'.
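To illustrate the 'ssml' option, the sketch below builds a request body whose `text` field holds SSML and checks first that the markup is well-formed XML. The helper name `ssml_payload` is hypothetical, and `<speak>` and `<break>` are standard SSML elements; this page does not say which elements the endpoint honors, so treat the sample markup as an assumption.

```python
import xml.etree.ElementTree as ET


def ssml_payload(ssml: str, voice_id: str) -> dict:
    """Return a request body that marks the text as SSML.

    Hypothetical helper; it only checks that the markup is well-formed XML.
    """
    ET.fromstring(ssml)  # raises ET.ParseError on malformed markup
    return {"text": ssml, "voice_id": voice_id, "input_type": "ssml"}


if __name__ == "__main__":
    # Whether the endpoint supports these particular elements is an assumption.
    body = ssml_payload('<speak>Hello <break time="500ms"/> world.</speak>', "<voice-id>")
    print(body["input_type"])
```

Checking well-formedness locally catches unclosed tags before a request is spent on them; it does not validate the markup against the SSML schema.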

speed

number

default: 1

Speed multiplier (0.5-2.0).

Required range: 0.5 <= x <= 2

language

string | null

Base language code (e.g. 'en', 'pt', 'zh'). Optional; auto-detected from the text when omitted.

locale

string | null

BCP-47 locale tag (e.g. 'en-US', 'pt-BR'). When set, language is inferred from locale.
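The field constraints above can be checked on the client before a request is made. The following is a minimal sketch, not an official client: `build_speech_body` is a hypothetical helper, and it assumes the 1-5000 limit can be approximated with Python's `len`, which counts Unicode code points. How the service itself counts characters is not stated here.

```python
from typing import Optional


def build_speech_body(
    text: str,
    voice_id: str,
    input_type: str = "text",
    speed: float = 1.0,
    language: Optional[str] = None,
    locale: Optional[str] = None,
) -> dict:
    """Validate fields against the documented limits and return the JSON body."""
    if not 1 <= len(text) <= 5000:
        raise ValueError("text must be 1-5000 characters")
    if not voice_id:
        raise ValueError("voice_id is required")
    if input_type not in ("text", "ssml"):
        raise ValueError("input_type must be 'text' or 'ssml'")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    body = {"text": text, "voice_id": voice_id, "input_type": input_type, "speed": speed}
    # language and locale are nullable; this sketch leaves them out when unset.
    if language is not None:
        body["language"] = language
    if locale is not None:
        body["locale"] = locale
    return body


if __name__ == "__main__":
    print(build_speech_body("Olá, mundo", "<voice-id>", locale="pt-BR"))
```

Omitting `language` when only `locale` is given mirrors the description above, where the language is inferred from the locale.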

Response

Successful response

data

TextToSpeechResponseData · object

Response payload for POST /v3/voices/speech.
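A minimal sketch of reading the response, assuming only what is documented here: the successful response is JSON with a top-level `data` object. The child attributes of `data` are not listed on this page, so the helper `extract_data` (a hypothetical name) returns the object as-is, and the key `example_field` in the sample is a placeholder rather than a real attribute.

```python
import json


def extract_data(raw_body: str) -> dict:
    """Parse a response body and return its top-level 'data' object."""
    parsed = json.loads(raw_body)
    if not isinstance(parsed, dict):
        raise ValueError("response is not a JSON object")
    data = parsed.get("data")
    if not isinstance(data, dict):
        raise ValueError("response has no 'data' object")
    return data


if __name__ == "__main__":
    # 'example_field' is a placeholder; the real child attributes are not shown here.
    sample = '{"data": {"example_field": "value"}}'
    print(extract_data(sample))
```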
