Video Agent is the fastest way to create videos programmatically. Describe what you want in plain text, and the agent handles avatar selection, scripting, scene composition, and production, all in a single API call.
How It Works
Send a prompt
Send a text description to POST /v3/video-agents. Optionally attach files, pick an avatar, or apply a style.
Agent produces your video
The agent writes a script, selects visuals, and renders the video asynchronously. You receive a session_id immediately, and a video_id once generation begins.
Quick Start
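A minimal Python sketch of the create call, using only the standard library. The host name is a placeholder, and the Bearer authorization header and the prompt field name are assumptions made for illustration; this page confirms only the POST /v3/video-agents path.

```python
import json
import urllib.request

# Placeholder host: replace with the API host for your account (assumption).
BASE_URL = "https://api.example.com"


def build_create_request(prompt: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v3/video-agents request."""
    # "prompt" is an assumed field name for the plain-text description.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v3/video-agents",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; an API key header may be required instead.
            "Authorization": f"Bearer {token}",
        },
    )


def create_video(prompt: str, token: str) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(build_create_request(prompt, token)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any credits are spent.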
Response
video_id is null on creation and is populated once the agent begins rendering. Poll GET /v3/video-agents/{session_id} to track progress and retrieve the video_id.
Two Modes of Operation
Video Agent supports two workflows depending on how much control you need:
| Mode | How to use | Best for |
|---|---|---|
| Generate (mode: "generate") | POST /v3/video-agents (default) | Fire-and-forget. Send a prompt, get a video. The agent auto-proceeds through the storyboard. |
| Chat (mode: "chat") | POST /v3/video-agents with "mode": "chat" | Multi-turn interaction. The agent may pause for decisions (e.g. picking a voice) and supports revisions and follow-up videos. |
Chat mode example
Use POST /v3/video-agents/{session_id} to send follow-up messages, answer the agent’s questions, or request revisions in a chat session.
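As a rough illustration, the two requests in a chat session could be assembled as below. The "mode": "chat" value and both paths come from this page; the prompt and message field names are hypothetical and should be checked against the endpoint reference.

```python
from urllib.parse import quote


def chat_session_body(prompt: str) -> dict:
    """Body for POST /v3/video-agents that opens a chat session."""
    # "prompt" is an assumed field name; "mode": "chat" is from the docs.
    return {"prompt": prompt, "mode": "chat"}


def follow_up_path(session_id: str) -> str:
    """Path for sending a follow-up message to an existing session."""
    return f"/v3/video-agents/{quote(session_id, safe='')}"


def follow_up_body(message: str) -> dict:
    """Body for a follow-up turn; "message" is a hypothetical field name."""
    return {"message": message}
```

A typical flow would open the session with chat_session_body, read the returned session_id, then post follow_up_body payloads to follow_up_path(session_id) whenever the agent pauses for a decision.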
Processing Time
Video generation is asynchronous. Processing times depend on video length, complexity, and your plan tier.
| Factor | Typical Range |
|---|---|
| Standard plans | 5x–10x the final video length (e.g. a 1-min video takes ~5–10 min) |
| Enterprise plans | Faster processing with priority queue access |
| Multi-scene | Each scene adds to total processing time |
| Peak hours | Processing may take longer during high-traffic periods |
- Use callback_url instead of polling to reduce unnecessary API calls
- Set reasonable poll intervals (10–30 seconds) if polling
- Display a progress indicator to end users based on the 5x–10x benchmark
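The polling and progress tips can be sketched as two small helpers. This is an illustration under stated assumptions, not part of the API: the fetch callable is supplied by the caller (for example, a function that calls GET /v3/video-agents/{session_id} and returns the session once video_id is set), and the 30-minute timeout and 0.99 cap are arbitrary choices.

```python
import time
from typing import Callable, Optional


def estimate_progress(elapsed_s: float, video_length_s: float,
                      factor: float = 10.0) -> float:
    """Rough progress fraction, assuming total time = factor * video length.

    The default factor of 10 is the upper end of the 5x-10x benchmark.
    The result is capped at 0.99 so the UI never claims completion early.
    """
    if video_length_s <= 0 or factor <= 0:
        raise ValueError("video_length_s and factor must be positive")
    expected_s = factor * video_length_s
    return min(max(elapsed_s, 0.0) / expected_s, 0.99)


def poll_until(fetch: Callable[[], Optional[dict]],
               interval_s: float = 15.0,
               timeout_s: float = 1800.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> dict:
    """Call fetch() every interval_s seconds until it returns a result.

    fetch() should return None while the video is still processing.
    """
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() + interval_s > deadline:
            raise TimeoutError("gave up waiting for the video")
        sleep(interval_s)
```

The default interval of 15 seconds sits inside the recommended 10–30 second range; sleep and clock are injectable so the loop can be tested without waiting.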
Choosing the Right Video API
| Feature | Video Agent | Direct Video (v3) |
|---|---|---|
| Endpoint | POST /v3/video-agents | POST /v3/videos |
| Input | Natural language prompt | Structured JSON |
| Avatar selection | Agent chooses (or you override) | You specify |
| Script writing | Agent writes it | You write it |
| Best for | Quick prototypes, simple videos | Programmatic pipelines |
| Control level | Low (prompt-driven) | High (explicit) |
Key Concepts
Session — Every Video Agent request creates a session (session_id). Sessions track the agent’s work: prompt, storyboard, generated assets, and final video. Retrieve session state via GET /v3/video-agents/{session_id}.
Video ID — The video_id is populated once rendering begins. Poll GET /v3/videos/{video_id} for status and the final download URL.
Styles — Curated visual templates that control scene composition, pacing, and aesthetics. Browse them via GET /v3/video-agents/styles and pass a style_id to your request.
File attachments — Images, videos, audio, and PDFs you provide as context. The agent uses these as visual references or content sources. Pass them via the files array as url, asset_id, or base64 inputs.
Incognito mode — Set incognito_mode: true to disable memory injection and extraction for a session.
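To show how these options might combine in one request, here is a hedged sketch of a body builder. The style_id, files, and incognito_mode names come from this page; the prompt field and the shape of each files entry (an object with a url key) are assumptions for illustration.

```python
from typing import List, Optional


def build_agent_body(prompt: str,
                     style_id: Optional[str] = None,
                     file_urls: Optional[List[str]] = None,
                     incognito: bool = False) -> dict:
    """Assemble a request body for POST /v3/video-agents."""
    body: dict = {"prompt": prompt}  # "prompt" is an assumed field name
    if style_id is not None:
        body["style_id"] = style_id
    if file_urls:
        # Assumed entry shape; asset_id or base64 inputs are also accepted
        # per the docs, but their exact schema is not shown on this page.
        body["files"] = [{"url": url} for url in file_urls]
    if incognito:
        body["incognito_mode"] = True
    return body
```

Optional keys are omitted rather than sent as null, so the API's own defaults apply when a caller does not set them.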
Error Handling
All Video Agent endpoints return errors in a consistent format.
| Status | Meaning |
|---|---|
| 400 | Invalid request parameters. Check the param field for which field caused the error. |
| 401 | Authentication failed. Verify your API key or Bearer token. |
| 429 | Rate limit exceeded. Retry after the seconds specified in the Retry-After header. |
For videos that fail during generation, check failure_code and failure_message on the video status response.
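A client might map these status codes to actions along the following lines. This is a sketch: it assumes param appears at the top level of the error body and that Retry-After carries a number of seconds, and the 30-second fallback is an arbitrary choice.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                default_s: float = 30.0) -> Optional[float]:
    """Seconds to wait before retrying, or None if retrying will not help."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(float(lowered["retry-after"]), 0.0)
    except (KeyError, ValueError):
        return default_s  # header missing or not a plain number of seconds


def describe_error(status: int, body: Mapping[str, object]) -> str:
    """Turn an error response into a short message for logs."""
    if status == 400:
        # Assumes "param" sits at the top level of the error body.
        return f"Invalid request parameter: {body.get('param', 'unknown')}"
    if status == 401:
        return "Authentication failed: check the API key or Bearer token"
    if status == 429:
        return "Rate limit exceeded: retry after the Retry-After delay"
    return f"Unexpected status {status}"
```

Only 429 is treated as retryable here, since 400 and 401 will keep failing until the request or credentials change.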
