Vision AI Agents Developer Platform
APIs and architecture for ingesting, analyzing, indexing, and searching video intelligence.
Vision AI Agents enables developers and enterprises to ingest video content, run intelligence analysis, generate searchable indexes, and build video discovery applications powered by structured metadata and vector embeddings.
Platform Pipeline
The Vision AI Agents platform follows a structured intelligence pipeline.
Video Upload
↓
Video Ingest APIs
↓
Video Intelligence Analysis
↓
Optional Audience Testing
↓
Metadata + Vector Embeddings
↓
Search APIs
↓
Application Results
All downstream operations require the video_id generated during video ingest.
Quick Navigation
- Platform Overview
- Platform Workflows
- Video Ingest APIs
- Video Intelligence APIs
- Audience Testing APIs
- Search APIs
- Getting Started
- Platform Architecture
- Authentication
- Rate Limits & Usage Tiers
- Error Handling
- Search Integration Guide
- Audience Testing Guide
- Job Status & Polling Guide
Platform Overview
Vision AI Agents provides a complete pipeline for video intelligence:
- Video ingest
- Video analysis and signal extraction
- Metadata and vector indexing
- Search and retrieval
- Audience testing and engagement analytics
Developers can integrate the platform in two ways:
- Use the Vision AI Agents hosted search and UI
- Integrate Vision AI Agents APIs into their own applications
Platform Workflows
The platform supports four primary workflows.
1. Upload and Analyze
Developers upload one or many videos to the platform.
The system automatically:
- generates a unique video_id
- runs intelligence analysis
- extracts structured signals
- creates searchable indexes
Developers may request:
- full analysis
- selective analytics modules
2. Analyze Existing Videos
If videos already exist in the platform, developers can reference them using the generated video_id.
An analysis request for an existing video includes:
- the video_id of the previously ingested video
- the analytics modules to run, or a request for full analysis
The platform returns structured analysis results.
3. Batch Video Ingest
Developers can ingest large video libraries using batch upload APIs.
The platform processes videos asynchronously. For each video, it:
- generates a video_id
- queues a processing job
- runs the analysis pipeline
Developers can check processing status using the job status polling endpoint.
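A polling loop for asynchronous jobs might look like the sketch below. The status values (`queued`, `processing`, `completed`, `failed`) and the `fetch_status` callable are assumptions for illustration; this page does not specify the status endpoint path or its response format, so the network call is injected rather than hard-coded.

```python
import time
from typing import Callable

# Assumed terminal status names; check the job status reference for actual values.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it reports a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


# Simulated status sequence standing in for real calls to the polling endpoint.
_states = iter(["queued", "processing", "completed"])
final_status = wait_for_job(lambda: next(_states), interval_s=0.0)
print(final_status)
```

In a real integration, `fetch_status` would wrap an authenticated request to the job status polling endpoint and return the status field from its response.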
4. Search Delivery
After videos are analyzed and indexed, developers can retrieve results through one of two delivery models.
Hosted Search Experience
Developers use the Vision AI Agents hosted UI and search backend.
External Application Integration
Developers can integrate Vision AI Agents search APIs directly into their own applications.
Applications send search queries to the API, and Vision AI Agents returns the matching indexed results.
Video Ingest APIs
Video ingest APIs are the entry point to the platform.
All analysis and search operations require a system-generated video_id returned by the ingest endpoint.
Example endpoint:
POST /api/video/ingest
Capabilities include:
- single video upload
- batch video ingest
- automatic video ID generation
- automatic analysis pipeline initiation
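As an illustration, an ingest request could be prepared as shown below. Only the endpoint path and the Bearer header come from this page. The host `https://api.example.com` and the body shape `{"videos": [{"url": ...}]}` are hypothetical placeholders. The sketch builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not taken from the docs


def build_ingest_request(api_key: str, video_urls: list[str]) -> urllib.request.Request:
    """Prepare (but do not send) a POST /api/video/ingest request.

    The JSON body shape is an assumption; the docs only state that
    single and batch ingest are supported.
    """
    if not video_urls:
        raise ValueError("at least one video URL is required")
    body = {"videos": [{"url": url} for url in video_urls]}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/video/ingest",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


ingest_req = build_ingest_request("API_KEY", ["https://cdn.example.com/clip.mp4"])
print(ingest_req.get_method(), ingest_req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it. The response is expected to carry the `video_id` that every later call needs.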
Video Intelligence APIs
Video intelligence APIs allow developers to run analytics modules on previously ingested videos.
Example request:
POST /api/video/analyze
Developers can request:
- full analysis
- selective analytics modules
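The choice between full and selective analysis could be expressed in a request body like the one sketched here. The field names `mode` and `modules`, and the module identifiers themselves, are hypothetical; they are modeled on the platform's analysis domains and are not documented values.

```python
from typing import Optional

# Hypothetical module identifiers, modeled on the platform's analysis domains.
KNOWN_MODULES = {"scene_actor", "scene_elements", "scene_psychology", "crescendo"}


def build_analyze_payload(video_id: str, modules: Optional[list[str]] = None) -> dict:
    """Build a body for POST /api/video/analyze.

    modules=None requests full analysis; a list requests only those modules.
    """
    if not video_id:
        raise ValueError("video_id is required (returned by the ingest endpoint)")
    if modules is None:
        return {"video_id": video_id, "mode": "full"}
    if not modules:
        raise ValueError("select at least one module, or pass None for full analysis")
    unknown = sorted(set(modules) - KNOWN_MODULES)
    if unknown:
        raise ValueError(f"unknown modules: {unknown}")
    return {"video_id": video_id, "mode": "selective", "modules": list(modules)}


full_payload = build_analyze_payload("vid_123")
selective_payload = build_analyze_payload("vid_123", ["crescendo", "scene_actor"])
```

Validating module names on the client avoids spending a rate-limited request on a call that would be rejected anyway.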
Video Intelligence Domains
Vision AI Agents extracts intelligence signals across multiple analysis domains.
Scene Actor Analytics
- actor emotion engagement
- actor eye contact engagement
- actor attention intensity
Scene Elements Analysis
Scene element extraction includes:
- audio genre recognition
- audio stems and rhythm analysis
- script linguistics analysis
- color traversal patterns
Scene Psychology Analysis
Psychological signals extracted include:
- color emotion classification
- audio emotion classification
- script sentiment classification
- audience-to-actor emotional mirroring
Crescendo Detection
Vision AI Agents identifies narrative and emotional crescendos across video content.
Signals include:
- color crescendo patterns
- audio crescendo patterns
- script crescendo patterns
- scene crescendo synchronization
- actor emotion synchronization
Audience Testing APIs
Audience testing must be explicitly requested through the API.
Audience testing requires a valid video_id generated by the ingest process.
Example request:
POST /api/audience/test
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| video_id | string | Video identifier returned from the ingest endpoint |
| participants | integer | Number of participants in the audience test (maximum 10) |
| analytics | array | List of requested audience analytics modules |
Audience Analytics Modules
| Module | Description |
|---|---|
| emotion | Measures aggregated emotional responses to video scenes |
| attention | Measures viewer attention intensity and focus during scenes |
| engagement | Identifies high engagement moments within the video |
| dropoff | Detects scenes where viewer engagement declines |
The platform returns aggregated engagement signals based on the selected analytics modules.
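The parameter and module tables above translate into a small client-side validator, sketched below. The parameter names, the module names, and the maximum of 10 participants come from the tables. The lower bound of 1 and the use of a JSON body are assumptions.

```python
AUDIENCE_MODULES = {"emotion", "attention", "engagement", "dropoff"}
MAX_PARTICIPANTS = 10  # documented maximum


def build_audience_test_payload(
    video_id: str, participants: int, analytics: list[str]
) -> dict:
    """Validate inputs and build a body for POST /api/audience/test."""
    if not video_id:
        raise ValueError("video_id is required (returned by the ingest endpoint)")
    # The minimum of 1 is an assumption; only the maximum is documented.
    if not 1 <= participants <= MAX_PARTICIPANTS:
        raise ValueError(f"participants must be between 1 and {MAX_PARTICIPANTS}")
    if not analytics:
        raise ValueError("request at least one audience analytics module")
    unknown = sorted(set(analytics) - AUDIENCE_MODULES)
    if unknown:
        raise ValueError(f"unknown audience modules: {unknown}")
    return {
        "video_id": video_id,
        "participants": participants,
        "analytics": list(analytics),
    }


audience_payload = build_audience_test_payload("vid_123", 8, ["emotion", "dropoff"])
```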
Search APIs
Search APIs allow developers to retrieve indexed video intelligence signals.
Example endpoint:
POST /api/search/query
Search capabilities include:
- metadata search
- vector similarity search
- structured filtering
Search results can power:
- Vision AI Agents hosted search UI
- external developer applications
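A search request combining these capabilities might be assembled as in the sketch below. The field names `query`, `mode`, `filters`, and `limit` are all hypothetical; this page names the capabilities but does not define the request schema.

```python
from typing import Any, Optional

SEARCH_MODES = {"metadata", "vector"}  # assumed names for the two search styles


def build_search_payload(
    query: str,
    mode: str = "vector",
    filters: Optional[dict[str, Any]] = None,
    limit: int = 10,
) -> dict[str, Any]:
    """Build a body for POST /api/search/query (hypothetical schema)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if mode not in SEARCH_MODES:
        raise ValueError(f"mode must be one of {sorted(SEARCH_MODES)}")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    payload: dict[str, Any] = {"query": query.strip(), "mode": mode, "limit": limit}
    if filters:
        # Structured filtering: narrow results by indexed metadata fields.
        payload["filters"] = dict(filters)
    return payload


search_payload = build_search_payload(
    "tense dialogue at night", filters={"audio_emotion": "suspense"}, limit=5
)
```

The `audio_emotion` filter key is likewise illustrative; usable filter fields depend on the metadata the platform actually indexes.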
Authentication
All API requests require authentication using API keys.
Example header:
Authorization: Bearer API_KEY
Developers generate and manage API keys through the platform dashboard.
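A small helper can keep header construction in one place. The `Authorization: Bearer` format follows the example header above. The `Content-Type` header and the `VISION_API_KEY` environment variable name are assumptions.

```python
import os


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the API key as a Bearer token."""
    key = api_key.strip()
    if not key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {key}",
        # Assumed: the docs only show the Authorization header.
        "Content-Type": "application/json",
    }


# Read the key from the environment rather than hard-coding it in source.
# VISION_API_KEY is a hypothetical variable name; "API_KEY" is a placeholder.
auth_headers = build_auth_headers(os.environ.get("VISION_API_KEY", "API_KEY"))
```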
Rate Limits
API usage is governed by developer usage tiers.
Limits include:
- concurrent requests
- daily request limits
- monthly token limits
Developers can upgrade tiers to increase throughput and processing capacity.
Getting Started
To begin integrating Vision AI Agents:
1. Obtain API credentials
2. Upload a video using the ingest API
3. Run intelligence analysis
4. Optionally request audience testing
5. Retrieve indexed results through the search APIs
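The steps above can be sketched end to end as follows. The endpoint paths come from this page. The request and response field names (`videos`, `video_id`, `mode`, `query`, `results`) are assumptions, and the HTTP layer is replaced by an injected `post` callable so the flow can be read and run without network access.

```python
from typing import Callable, Optional

Post = Callable[[str, dict], dict]  # (path, JSON body) -> parsed JSON response


def run_pipeline(
    post: Post,
    video_url: str,
    query: str,
    audience_participants: Optional[int] = None,
) -> dict:
    """Ingest a video, analyze it, optionally test it, then search the index."""
    ingest = post("/api/video/ingest", {"videos": [{"url": video_url}]})
    video_id = ingest["video_id"]  # required by every downstream call

    post("/api/video/analyze", {"video_id": video_id, "mode": "full"})

    if audience_participants is not None:
        post(
            "/api/audience/test",
            {
                "video_id": video_id,
                "participants": audience_participants,
                "analytics": ["emotion", "engagement"],
            },
        )

    return post("/api/search/query", {"query": query})


# Fake transport that records calls and returns canned responses.
calls: list[str] = []


def fake_post(path: str, payload: dict) -> dict:
    calls.append(path)
    if path == "/api/video/ingest":
        return {"video_id": "vid_123"}
    if path == "/api/search/query":
        return {"results": []}
    return {"status": "accepted"}


search_results = run_pipeline(fake_post, "https://cdn.example.com/clip.mp4", "chase scene")
```

Because processing is asynchronous, a real client would poll job status between the analyze and search steps instead of searching immediately.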