Video Q&A

The Vision API answers natural-language questions about your videos. Once a video has been indexed, you can ask questions about its content and receive generated answers.

Prerequisites

Before using Video Q&A, ensure your video has been successfully indexed:

  1. Upload a video with index=true
  2. Check indexing status - it should be "indexed"
  3. Wait for processing if status is "indexing"
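
The waiting step above can be sketched as a small polling loop. This is a minimal sketch, not part of the API: get_status is a hypothetical caller-supplied function that returns the video's current status string (this page does not describe the status endpoint), and the poll interval and timeout are arbitrary defaults. Only the "indexed" and "indexing" values come from the list above; any other value is treated as a failure.

```python
import time
from typing import Callable


def wait_until_indexed(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until get_status() reports "indexed".

    get_status is a hypothetical, caller-supplied function; how it fetches
    the status is outside the scope of this sketch.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status == "indexed":
            return  # safe to start asking questions
        if status != "indexing":
            # Anything other than "indexing" is assumed to mean indexing failed.
            raise RuntimeError(f"unexpected indexing status: {status!r}")
        if waited >= timeout_seconds:
            raise TimeoutError("video was still indexing when the timeout elapsed")
        sleep(poll_seconds)
        waited += poll_seconds
```

The sleep argument is injectable so the loop can be exercised without real delays.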

Ask Questions

Use the /qa/chat endpoint to ask questions about your videos:

Bash

curl -X POST https://vision-agent.api.reka.ai/qa/chat \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_id": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
      {
        "role": "user",
        "content": "What is happening in this video?"
      }
    ]
  }'

Python

import os

import requests

# Base URL taken from the curl example. The API key is read from an
# environment variable; the variable name is this example's choice.
BASE_URL = "https://vision-agent.api.reka.ai"
REKA_API_KEY = os.environ["REKA_API_KEY"]

url = f"{BASE_URL}/qa/chat"

# Chat request body
payload = {
    "video_id": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
        {
            "role": "user",
            "content": "What is happening in this video?"
        }
    ]
}

headers = {
    "X-Api-Key": REKA_API_KEY,
    "Content-Type": "application/json",
}

response = requests.post(url, json=payload, headers=headers)
print(response.status_code, response.json())

Request Parameters

  • video_id (optional): ID of the video to analyze
  • messages (optional): List of chat messages, each an object with a role ("user" or "assistant") and a content string
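
As an illustration of the request shape, here is a minimal sketch of a helper that assembles the body from the two parameters above. build_chat_payload and ALLOWED_ROLES are hypothetical names, and the checks are local sanity checks based only on the examples in this guide; the API itself may accept more than this helper allows.

```python
from typing import Dict, List

# The only roles that appear in this guide's examples (an assumption,
# not an exhaustive list from the API).
ALLOWED_ROLES = {"user", "assistant"}


def build_chat_payload(video_id: str, messages: List[Dict[str, str]]) -> dict:
    """Return a request body for /qa/chat after basic local checks."""
    for index, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {index} has an unexpected role: {message.get('role')!r}")
        if not message.get("content"):
            raise ValueError(f"message {index} has no content")
    # Copy the list so later changes to the caller's list do not alter the payload.
    return {"video_id": video_id, "messages": list(messages)}
```

The returned dictionary can be passed as the json argument of requests.post.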

Using Chat Messages

For multi-turn conversations, use the messages parameter:

Bash

curl -X POST https://vision-agent.api.reka.ai/qa/chat \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_id": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
      {
        "role": "user",
        "content": "What is happening in this video?"
      },
      {
        "role": "assistant",
        "content": "The video shows a person walking on the beach during sunset."
      },
      {
        "role": "user",
        "content": "What is the color of the car in the video?"
      }
    ]
  }'

Python

import os

import requests

# Base URL taken from the curl example. The API key is read from an
# environment variable; the variable name is this example's choice.
BASE_URL = "https://vision-agent.api.reka.ai"
REKA_API_KEY = os.environ["REKA_API_KEY"]

url = f"{BASE_URL}/qa/chat"

# Chat request body: earlier turns come first, the new question comes last
payload = {
    "video_id": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
        {
            "role": "user",
            "content": "What is happening in this video?"
        },
        {
            "role": "assistant",
            "content": "The video shows a person walking on the beach during sunset."
        },
        {
            "role": "user",
            "content": "What is the color of the car in the video?"
        }
    ]
}

headers = {
    "X-Api-Key": REKA_API_KEY,
    "Content-Type": "application/json",
}

response = requests.post(url, json=payload, headers=headers)
print(response.status_code, response.json())
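
Because each request carries the whole conversation, a client has to keep the history itself. The following is a minimal sketch of one way to do that; add_turn is a hypothetical helper, not part of the API, and it assumes the previous answer is stored as an "assistant" message exactly as in the examples above.

```python
from typing import Dict, List


def add_turn(
    history: List[Dict[str, str]],
    previous_answer: str,
    next_question: str,
) -> List[Dict[str, str]]:
    """Return a new message list with the last answer and the next question appended."""
    return history + [
        {"role": "assistant", "content": previous_answer},
        {"role": "user", "content": next_question},
    ]


# Example: extend a one-question conversation with a follow-up.
history = [{"role": "user", "content": "What is happening in this video?"}]
history = add_turn(
    history,
    "The video shows a person walking on the beach during sunset.",
    "What is the color of the car in the video?",
)
```

Returning a new list rather than mutating the argument keeps earlier snapshots of the conversation intact, which is convenient when retrying a failed request.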

Streaming Responses

For real-time responses, use the /qa/stream endpoint:

Bash

curl -X POST https://vision-agent.api.reka.ai/qa/stream \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_id": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
      {
        "role": "user",
        "content": "What is happening in this video?"
      }
    ]
  }'

Python

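import os

import requests

# Base URL taken from the curl example. The API key is read from an
# environment variable; the variable name is this example's choice.
BASE_URL = "https://vision-agent.api.reka.ai"
REKA_API_KEY = os.environ["REKA_API_KEY"]

url = f"{BASE_URL}/qa/stream"

# Chat request body
payload = {
    "video_id": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
        {
            "role": "user",
            "content": "What is happening in this video?"
        }
    ]
}

headers = {
    "X-Api-Key": REKA_API_KEY,
    "Content-Type": "application/json",
}

# stream=True lets the events be read as they arrive; calling
# response.json() would fail because the body is an event stream, not JSON.
with requests.post(url, json=payload, headers=headers, stream=True) as response:
    print(response.status_code)
    for line in response.iter_lines(decode_unicode=True):
        if line:  # skip the blank lines that separate events
            print(line)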

This returns a Server-Sent Events (SSE) stream with real-time updates.
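
To make the stream easier to consume, the raw lines can be grouped into events. The sketch below parses the standard SSE "event:" and "data:" fields from any iterable of text lines. iter_sse_events is a hypothetical helper, and it assumes each event's data field holds a JSON document; that matches the response format shown on this page but is not confirmed for every event the endpoint may send.

```python
import json
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], object]]:
    """Yield (event_name, parsed_data) for each event in an SSE line stream."""
    event_name: Optional[str] = None
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = None, []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Flush a final event that was not followed by a blank line.
        yield event_name, json.loads("\n".join(data_parts))
```

With the requests library, the lines can come from response.iter_lines(decode_unicode=True) on a request made with stream=True.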

Response Format

Chat Response

{
  "chat_response": "The video shows a person walking on the beach during sunset.",
  "status": "success"
}

Stream Response

{
  "event": "qa_stream",
  "data": {
    "chat_response": "The video shows a person walking on the beach during sunset.",
    "status": "success"
  }
}
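
Both formats carry the answer in chat_response alongside a status field, so one helper can read either. This is a minimal sketch: extract_answer is a hypothetical name, and treating any status other than "success" as an error is an assumption, since this page documents no other status values.

```python
def extract_answer(body: dict) -> str:
    """Return the answer text from a chat response or a stream event."""
    # Stream events wrap the same fields in a "data" object.
    inner = body["data"] if isinstance(body.get("data"), dict) else body
    if inner.get("status") != "success":
        raise RuntimeError(f"request did not succeed: status={inner.get('status')!r}")
    return inner["chat_response"]
```

For a non-streaming call, the argument would be the parsed JSON body of the /qa/chat response.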

Question Examples

Here are some example questions you can ask:

  • General: “What is happening in this video?”
  • Specific: “What color is the car in the video?”
  • Temporal: “What happens at the beginning of the video?”
  • Analytical: “How many people are in the scene?”
  • Descriptive: “Describe the setting and atmosphere”

Best Practices

  1. Be specific in your questions for better answers
  2. Check indexing status before asking questions
  3. Use streaming for long videos or complex questions

Error Handling

  • Video not found: Ensure the video_id is correct
  • Video not indexed: Wait for indexing to complete
  • Indexing failed: Re-upload the video with index=true