Video Q&A
The Vision API provides powerful question-answering capabilities for your videos. Once a video has been indexed, you can ask natural language questions about its content and receive AI-powered answers.
The Video Q&A API is designed for videos longer than 30 seconds. It indexes your video for efficient retrieval and analysis across longer content.
For videos 30 seconds or shorter, use the Chat API with multimodal inputs instead — it accepts video directly in the conversation without requiring a separate indexing step.
Prerequisites
Before using Video Q&A, ensure your video has been successfully indexed:
- Upload a video with
index=true - Check indexing status - it should be
"indexed" - Wait for processing if status is
"indexing"
Ask Questions
Bash
Use the /v1/qa/chat endpoint to ask questions about your videos:
Python
Request Parameters
video_id(optional): ID of the video to analyzemessages(optional): List of chat messages
Using Chat Messages
For multi-turn conversations, use the messages parameter:
Bash
Python
Streaming Responses
For real-time responses, set stream=true in your request:
Bash
Python
This returns a Server-Sent Events (SSE) stream with real-time updates.
Response Format
Chat Response
Stream Response
When the stream is complete, a final event is sent:
Question Examples
Here are some example questions you can ask:
- General: “What is happening in this video?”
- Specific: “What color is the car in the video?”
- Temporal: “What happens at the beginning of the video?”
- Analytical: “How many people are in the scene?”
- Descriptive: “Describe the setting and atmosphere”
Best Practices
- Be specific in your questions for better answers
- Check indexing status before asking questions
- Use streaming for long videos or complex questions
Error Handling
- Video not found: Ensure the video_id is correct
- Video not indexed: Wait for indexing to complete
- Indexing failed: Re-upload the video with
index=true