Quickstart

This page shows you how to get started with the Reka API and how to run the Reka Edge model locally.

  • Run models on the Reka Platform
  • Run Reka Edge locally (macOS)
  • Run Reka Edge locally using OpenAI-compatible server (Linux)

Run models on the Reka Platform

Create a free account on the Reka Platform to access your API key.

Keep your API key secure. Never expose it in client-side code or share it publicly.
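One common way to keep the key out of source code is to store it in an environment variable and read it at runtime; `REKA_API_KEY` below is a name chosen for illustration.

```shell
# Export the key in your shell (or shell profile) rather than
# hard-coding it in source files or notebooks.
# REKA_API_KEY is an illustrative name; any variable name works.
export REKA_API_KEY="your-key-here"
```

In Python, you can then pass it when constructing the client, e.g. `api_key=os.environ["REKA_API_KEY"]`.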

Install the SDK

Install the OpenAI Python SDK — our API is fully OpenAI-compatible.

$ pip install openai

Make your first request

First request

from openai import OpenAI

client = OpenAI(
    base_url="https://api.reka.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="reka-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://v0.docs.reka.ai/_images/000000245576.jpg"}},
                {"type": "text", "text": "What do you like about this image?"},
            ],
        }
    ],
)
print(response.choices[0].message.content)
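The `content` field of a message is a list of typed parts, so a single request can mix images and text. As a small sketch, the same payload can be built and inspected as plain data before sending:

```python
# Build the multimodal message as a plain dict, useful for logging
# or validating the payload before making the API call.
message = {
    "role": "user",
    "content": [
        {"type": "image_url",
         "image_url": {"url": "https://v0.docs.reka.ai/_images/000000245576.jpg"}},
        {"type": "text", "text": "What do you like about this image?"},
    ],
}

# Each content part declares its type first.
part_types = [part["type"] for part in message["content"]]
print(part_types)  # → ['image_url', 'text']
```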

Run Reka Edge locally (macOS)

See our HuggingFace repository for instructions on running Reka Edge locally.

Requirements

  • OS: macOS 13+
  • Hardware: Apple Silicon Mac with 32 GB+ unified memory (M1 Pro/Max or later recommended)
  • Python: 3.12+
  • uv (recommended) — handles dependencies automatically
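Before following the repository instructions, you can confirm the Python and uv requirements from a terminal; a minimal sketch, assuming `python3` is on your PATH:

```shell
# Print the Python version to confirm it meets the 3.12+ requirement.
python3 --version

# uv manages the virtual environment and dependencies automatically;
# check whether it is already installed.
if command -v uv >/dev/null 2>&1; then
  echo "uv is installed"
else
  echo "uv not found - see https://docs.astral.sh/uv/ for install steps"
fi
```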

Run Reka Edge locally using OpenAI-compatible server (Linux)

For high-throughput serving, you can use the vllm-reka plugin that extends standard vLLM to support Reka’s custom architectures and optimized tokenizer. Please follow our vllm-reka installation instructions to install the plugin along with vLLM.

Requirements

  • OS: Linux with CUDA. macOS is not supported for serving.
  • Hardware: NVIDIA GPU, ideally with ≥24 GB VRAM. Tested on an RTX 3090, which achieves 40–50 tokens/s.
  • Python: 3.10 ≤ x < 3.14
  • vLLM: 0.15.x (0.15.0 ≤ x < 0.16.0)
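Once the plugin is installed, serving follows the standard vLLM workflow. A sketch of launching the OpenAI-compatible server, where the model identifier is a placeholder — substitute the checkpoint name given in the vllm-reka installation instructions:

```shell
# Launch an OpenAI-compatible server on port 8000 using the standard
# vLLM CLI. "RekaAI/reka-edge" is an assumed model id for illustration.
vllm serve RekaAI/reka-edge --port 8000
```

You can then point the OpenAI SDK from the first section at `base_url="http://localhost:8000/v1"` instead of the hosted API.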

Next up

Explore the Reka API’s capabilities through the following guides: