Overview

Video Agent is the fastest way to create videos programmatically. Describe what you want in plain text, and the agent handles avatar selection, scripting, scene composition, and production — all in a single API call.

How It Works

Send a prompt

POST a text description to /v3/video-agents. Optionally attach files, pick an avatar, or apply a style.

Agent produces your video

The agent writes a script, selects visuals, and renders the video asynchronously. You receive a session_id immediately, and a video_id once generation begins.

Retrieve the result

Poll GET /v3/videos/{video_id} until status is completed, then download via video_url. Alternatively, set a callback_url to be notified automatically instead of polling.
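
The polling step can be sketched as a small loop. This is a minimal illustration, not an official client: fetch_video is a hypothetical callable that performs GET /v3/videos/{video_id} and returns the decoded video object, and the "failed" status name is an assumption (this page only names "completed").

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    fetch_video: Callable[[str], Dict[str, Any]],
    video_id: str,
    interval_s: float = 15.0,
    max_polls: int = 240,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the video reaches a terminal status, then return its state."""
    for _ in range(max_polls):
        video = fetch_video(video_id)
        status = video.get("status")
        if status == "completed":
            return video  # the caller can now read video["video_url"]
        if status == "failed":  # ASSUMPTION: name of the failure status
            raise RuntimeError(
                f"video {video_id} failed: "
                f"{video.get('failure_code')} {video.get('failure_message')}"
            )
        sleep(interval_s)
    raise TimeoutError(f"video {video_id} not completed after {max_polls} polls")
```

Passing fetch_video and sleep as parameters keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.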

Quick Start

curl -X POST "https://api.heygen.com/v3/video-agents" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Create a 30-second product walkthrough for a new project management app"
  }'

Response

{
  "data": {
    "session_id": "sess_abc123",
    "status": "generating",
    "video_id": null,
    "created_at": 1711382400
  }
}

video_id is null on creation and is populated once the agent begins rendering. Poll GET /v3/video-agents/{session_id} to track progress and retrieve the video_id.
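
For callers using Python rather than curl, the same request can be sketched with the standard library. The helper names below are hypothetical. read_session matches the creation response shown above; using it on the GET /v3/video-agents/{session_id} response assumes that endpoint returns the same data envelope.

```python
import json
import urllib.request
from typing import Optional, Tuple

API_BASE = "https://api.heygen.com"


def build_create_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v3/video-agents request."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v3/video-agents",
        data=body,
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def read_session(response_body: str) -> Tuple[str, Optional[str]]:
    """Return (session_id, video_id); video_id is None until rendering begins."""
    data = json.loads(response_body)["data"]
    return data["session_id"], data.get("video_id")
```

Send the request with urllib.request.urlopen(req) and pass the decoded body to read_session. While video_id is None, keep polling the session endpoint.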

Two Modes of Operation

Video Agent supports two workflows depending on how much control you need:

Mode	How to use	Best for
Generate (`mode: "generate"`)	`POST /v3/video-agents` — default	Fire-and-forget. Send a prompt, get a video. The agent auto-proceeds through the storyboard.
Chat (`mode: "chat"`)	`POST /v3/video-agents` with `"mode": "chat"`	Multi-turn interaction. The agent may pause for decisions (e.g. picking a voice), supports revisions and follow-up videos.

Both modes support the same file inputs, avatar/voice overrides, and style options.

Chat mode example

{
  "prompt": "Create a product walkthrough for our new app",
  "mode": "chat"
}

Use POST /v3/video-agents/{session_id} to send follow-up messages, answer the agent’s questions, or request revisions in a chat session.
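
A follow-up message in a chat session might be built as below. This is only a sketch: the page does not show the follow-up request body, so reusing the "prompt" field from the creation request is an assumption, and build_follow_up_request is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request


def build_follow_up_request(
    session_id: str, text: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a follow-up message for an existing chat session."""
    # ASSUMPTION: the follow-up body uses the same "prompt" field as creation.
    body = json.dumps({"prompt": text}).encode("utf-8")
    # Percent-encode the ID so it is always a single, safe path segment.
    path_id = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(
        f"https://api.heygen.com/v3/video-agents/{path_id}",
        data=body,
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```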

Processing Time

Video generation is asynchronous. Processing times depend on video length, complexity, and your plan tier.

Factor	Effect on processing time
Standard plans	5x–10x the final video length (e.g. a 1-min video takes ~5–10 min)
Enterprise plans	Faster processing with priority queue access
Multi-scene	Each scene adds to total processing time
Peak hours	Processing may take longer during high-traffic periods

If a video has been processing for more than 24 hours, something is likely wrong. Contact HeyGen Support with your video_id.

Best practices:

Use callback_url instead of polling to reduce unnecessary API calls
Set reasonable poll intervals (10–30 seconds) if polling
Display a progress indicator to end users based on the 5x–10x benchmark
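
The 5x–10x benchmark for standard plans translates into a simple estimate that a progress indicator can use. The function names are hypothetical, and measuring progress against the upper (10x) bound is a conservative design choice rather than anything the API prescribes.

```python
from typing import Tuple


def estimate_processing_window(video_seconds: float) -> Tuple[float, float]:
    """Return (low, high) processing estimates in seconds: 5x and 10x the length."""
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    return 5 * video_seconds, 10 * video_seconds


def progress_fraction(elapsed_seconds: float, video_seconds: float) -> float:
    """Rough progress for a UI indicator, capped at 0.99 until status is completed."""
    _, high = estimate_processing_window(video_seconds)
    return min(max(elapsed_seconds, 0.0) / high, 0.99)
```

For a 60-second video, estimate_processing_window(60) gives a window of 300 to 600 seconds, matching the "~5–10 min" example in the table.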

Choosing the Right Video API

Feature	Video Agent	Direct Video (`v3`)
Endpoint	`POST /v3/video-agents`	`POST /v3/videos`
Input	Natural language prompt	Structured JSON
Avatar selection	Agent chooses (or you override)	You specify
Script writing	Agent writes it	You write it
Best for	Quick prototypes, simple videos	Programmatic pipelines
Control level	Low (prompt-driven)	High (explicit)

Start with Video Agent. If you need precise control over script, avatar, or timing, use POST /v3/videos directly.

Key Concepts

Session: Every Video Agent request creates a session (session_id). Sessions track the agent’s work: prompt, storyboard, generated assets, and final video. Retrieve session state via GET /v3/video-agents/{session_id}.

Video ID: The video_id is populated once rendering begins. Poll GET /v3/videos/{video_id} for status and the final download URL.

Styles: Curated visual templates that control scene composition, pacing, and aesthetics. Browse them via GET /v3/video-agents/styles and pass a style_id to your request.

File attachments: Images, videos, audio, and PDFs you provide as context. The agent uses these as visual references or content sources. Pass them via the files array as url, asset_id, or base64 inputs.

Incognito mode: Set incognito_mode: true to disable memory injection and extraction for a session.
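
The optional fields described above can be combined into one request body. This is a hedged sketch: build_agent_payload is a hypothetical helper, and the shape of each files entry (an object with a single "url" key) is an assumption, since this page names the input types but not their exact structure. The prompt length check mirrors the 1-10000 character limit quoted in the error example.

```python
from typing import Any, Dict, List, Optional


def build_agent_payload(
    prompt: str,
    style_id: Optional[str] = None,
    file_urls: Optional[List[str]] = None,
    incognito: bool = False,
) -> Dict[str, Any]:
    """Assemble a POST /v3/video-agents body, omitting options that are unset."""
    if not 1 <= len(prompt) <= 10000:
        raise ValueError("prompt must be 1-10000 characters")
    payload: Dict[str, Any] = {"prompt": prompt}
    if style_id is not None:
        payload["style_id"] = style_id
    if file_urls:
        # ASSUMPTION: each entry is an object keyed by its input type.
        payload["files"] = [{"url": url} for url in file_urls]
    if incognito:
        payload["incognito_mode"] = True
    return payload
```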

Error Handling

All Video Agent endpoints return errors in a consistent format:

{
  "error": {
    "code": "invalid_parameter",
    "message": "'prompt' is required and must be 1-10000 characters.",
    "param": "prompt",
    "doc_url": null
  }
}

Status	Meaning
`400`	Invalid request parameters. Check the `param` field for which field caused the error.
`401`	Authentication failed. Verify your API key or Bearer token.
`429`	Rate limit exceeded. Retry after the seconds specified in the `Retry-After` header.

For video-specific failures (e.g. rendering errors), check failure_code and failure_message on the video status response.
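
Client code can turn the error format and status table above into two small helpers. Both names are hypothetical, and the 30-second fallback used when Retry-After is missing or unparseable is an assumption, not documented behavior.

```python
import json
from typing import Mapping, Optional


def describe_error(body: str) -> str:
    """Format an error response body as 'code: message (param: name)'."""
    err = json.loads(body)["error"]
    param = err.get("param")
    suffix = f" (param: {param})" if param else ""
    return f"{err['code']}: {err['message']}{suffix}"


def retry_delay(
    status: int, headers: Mapping[str, str], default_s: float = 30.0
) -> Optional[float]:
    """Seconds to wait before retrying, or None if the request should not be retried."""
    if status != 429:
        return None  # 400 and 401 need a corrected request, not a retry
    try:
        return float(headers.get("Retry-After"))
    except (TypeError, ValueError):
        return default_s  # ASSUMPTION: fallback when the header is unusable
```

Note that headers is treated as a plain mapping here, so the lookup is case-sensitive. Most HTTP libraries expose a case-insensitive headers object that can be passed in directly.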
