| | Video Agent | Direct Video |
|---|---|---|
| Endpoint | POST /v3/video-agents | POST /v3/videos |
| Input | Natural language prompt | Structured JSON |
| Script writing | Agent writes it | You write it |
| Avatar selection | Agent picks (or you override) | You specify |
| Voice selection | Agent picks (or you override) | You specify |
| Interactive iteration | ✅ Via chat mode | ❌ |
| Webhook support | ✅ callback_url | ✅ callback_url |
| Control level | Low (prompt-driven) | High (explicit) |
Video Agent — best for speed
Send a text prompt, get a video. The agent handles scripting, avatar selection, and scene composition automatically.

- You want a video fast without managing avatars or scripts
- You’re building a product where end users describe videos in natural language
- You want to iterate interactively: use mode: "chat" to review the storyboard before rendering
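
As a rough illustration of the prompt-driven flow, the sketch below builds (but does not send) a request to POST /v3/video-agents. Only the path, mode: "chat", and callback_url come from the comparison above. The host in API_BASE, the "prompt" field name, and the helper names are assumptions made for this example, and authentication is left out because it is not specified here.

```python
import json
import urllib.request

# Hypothetical host: only the endpoint paths are given in the text above.
API_BASE = "https://api.example.com"


def build_agent_payload(prompt, chat=False, callback_url=None):
    """Build a JSON body for POST /v3/video-agents."""
    payload = {"prompt": prompt}  # "prompt" is an assumed field name
    if chat:
        payload["mode"] = "chat"  # review the storyboard before rendering
    if callback_url:
        payload["callback_url"] = callback_url  # webhook for completion
    return payload


def build_request(path, payload):
    """Wrap a payload in a urllib Request without sending it."""
    return urllib.request.Request(
        API_BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        # Auth headers are omitted; add whatever the API requires.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


agent_payload = build_agent_payload(
    "A 30-second product update for new users",
    chat=True,
    callback_url="https://example.com/hooks/video",
)
agent_request = build_request("/v3/video-agents", agent_payload)
# To send it for real: urllib.request.urlopen(agent_request)
```

Leaving out chat=True drops the mode field entirely, which in this sketch stands for the one-shot "send a prompt, get a video" path.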
Direct Video — best for control
Explicitly specify the avatar, voice, and script. Predictable, repeatable output for automated pipelines.

- Building automated pipelines (personalized sales videos, daily reports)
- You need exact control over avatar, voice, and script
- Generating videos programmatically from data (CRM records, form submissions)
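
For the data-driven case, a minimal sketch of turning CRM-style records into structured bodies for POST /v3/videos might look like the following. The field names avatar_id, voice_id, and script, the record keys, and the ID values are all hypothetical placeholders. Only the endpoint path and callback_url appear in the comparison above.

```python
import json


def build_video_payload(record, avatar_id, voice_id, callback_url=None):
    """Build one JSON body for POST /v3/videos from a data record."""
    # You write the script yourself; here it is templated from the record.
    script = (
        f"Hi {record['first_name']}, "
        f"thanks for your interest in {record['product']}."
    )
    payload = {
        "avatar_id": avatar_id,  # assumed field name
        "voice_id": voice_id,    # assumed field name
        "script": script,        # assumed field name
    }
    if callback_url:
        payload["callback_url"] = callback_url
    return payload


# Example records standing in for CRM rows or form submissions.
records = [
    {"first_name": "Ana", "product": "Analytics Pro"},
    {"first_name": "Ben", "product": "Starter Plan"},
]

payloads = [
    build_video_payload(r, avatar_id="avatar_123", voice_id="voice_456")
    for r in records
]

for body in payloads:
    # Each body would be sent as the JSON payload of POST /v3/videos.
    print(json.dumps(body))
```

Because every input is explicit, the same record always yields the same request body, which is what makes this path suited to repeatable pipelines.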
Not sure which to pick?
Start with Video Agent. If you need precise control over the script, avatar, or timing, switch to POST /v3/videos.
You can also combine both — use Video Agent to explore ideas and find the right style, then recreate with explicit parameters for the final production version.
