Our models can be accessed through the chat API. This page gives an introduction to using the chat API via the Python SDK.

Single Turn Prompting

A simple single turn request can be made as follows:

    from reka.client import Reka

    # You can also set the API key using the REKA_API_KEY environment variable.
    client = Reka(api_key="YOUR_API_KEY")

    response = client.chat.create(
        messages=[
            {
                "content": "Write a python one-liner to flatten a list of lists.",
                "role": "user",
            }
        ],
        model="reka-core-20240501",
    )
    print(response.responses[0].message.content)

This will return a response like:

Here's a Python one-liner using list comprehension to flatten a list of lists:
flattened_list = [item for sublist in nested_list for item in sublist]
Replace `nested_list` with your actual list of lists. This one-liner works by iterating over each sublist in the outer list, and then iterating over each item in those sublists, effectively flattening the structure into a single list.

See Available Models for details on valid model names.
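
The single-turn example notes that the API key can also come from the REKA_API_KEY environment variable. Below is a minimal sketch of resolving the key yourself before constructing the client. `resolve_api_key` is a hypothetical helper, not part of the SDK, and it uses only the standard library.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Return the explicit key if given, else fall back to REKA_API_KEY."""
    key = explicit or os.environ.get("REKA_API_KEY")
    if not key:
        raise RuntimeError("Set REKA_API_KEY or pass an API key explicitly.")
    return key


# The result would then be passed on as Reka(api_key=resolve_api_key()).
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the first request.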

Multiple Turn Conversations

You can request a response to a multi-turn conversation by including the earlier turns in the message history. For example:

    from reka.client import Reka

    client = Reka(api_key="YOUR_API_KEY")

    response = client.chat.create(
        messages=[
            {
                "content": "My name is Matt.",
                "role": "user",
            },
            {
                "content": "Hello Matt! How can I help you today?",
                "role": "assistant",
            },
            {
                "content": "Can you think of a couple of famous people with the same name as me?",
                "role": "user",
            },
        ],
        model="reka-core-20240501",
    )
    print(response.responses[0].message.content)

This will return a response like:

Certainly, Matt is a popular name, and there are several famous individuals with that name across various fields. Here are a few:
1. **Matt Damon** - An acclaimed actor known for his roles in movies like "Good Will Hunting," the "Bourne" series, and "The Martian."
2. **Matt LeBlanc** - An actor best known for playing Joey Tribbiani on the television series "Friends" and for hosting "Top Gear."
3. **Matt Groening** - The creator of the iconic animated television series "The Simpsons" and "Futurama."
4. **Matt Smith** - An actor who played the Eleventh Doctor in the British television series "Doctor Who" and has also starred in "The Crown."
5. **Matt Bomer** - An actor, producer, and director known for his roles in "White Collar," "Magic Mike," and "The Normal Heart."
6. **Matt Ryan** - An actor known for his role as John Constantine in the television series "Constantine" and for voicing the character in various video games.
These are just a few examples of the many famous Matts out there. Each has made significant contributions to their respective fields.
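
In a longer conversation the history is usually built up turn by turn rather than written out by hand. The sketch below shows one way to do that with plain dictionaries of the same shape as the messages above. `add_turn` is a hypothetical helper, and its check that roles alternate mirrors the pattern in the example; it is an assumption, not a documented API rule.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role!r}")
    # Assumption: consecutive messages alternate between user and assistant.
    if history and history[-1]["role"] == role:
        raise ValueError("consecutive messages should alternate roles")
    return history + [{"content": content, "role": role}]


history: List[Message] = []
history = add_turn(history, "user", "My name is Matt.")
history = add_turn(history, "assistant", "Hello Matt! How can I help you today?")
history = add_turn(
    history,
    "user",
    "Can you think of a couple of famous people with the same name as me?",
)
# `history` now has the same shape as the `messages` argument in the request.
```

After each response, the assistant's reply would be appended with role "assistant" before the next user message is added.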

Assistant Completions

You can guide the assistant's output, for example to produce a structured JSON response, by specifying how the assistant's response should start. To do this, add a partial assistant response as the last message:

    from reka.client import Reka

    client = Reka(api_key="YOUR_API_KEY")

    prompt = """
    Below is a paragraph from wikipedia:

    The Solar System is the gravitationally bound system of the Sun and the objects that orbit it.
    The largest of such objects are the eight planets, in order from the Sun: four terrestrial planets named Mercury,
    Venus, Earth and Mars, two gas giants named Jupiter and Saturn, and two ice giants named Uranus and Neptune.
    The terrestrial planets have a definite surface and are mostly made of rock and metal. The gas giants are
    mostly made of hydrogen and helium, while the ice giants are mostly made of 'volatile' substances such as water,
    ammonia, and methane. In some texts, these terrestrial and giant planets are called the inner Solar System and outer
    Solar System planets respectively.

    Extract information about the planets from this paragraph as a JSON list of objects with keys 'planetName' and
    'composition'. The 'composition' key should contain one or two words, and there should be no other keys.
    """

    # The start of the JSON list that the assistant is asked to continue.
    json_prefix = """
    [
      {
        "planetName":
    """.strip()

    response = client.chat.create(
        messages=[
            {"role": "user", "content": prompt},
            {
                "role": "assistant",
                "content": (
                    "Sure, here is a JSON object conforming to that format:\n\n"
                    f"```json\n{json_prefix}"
                ),
            },
        ],
        max_tokens=512,
        temperature=0.4,
        stop=["```\n"],
        model="reka-core-20240501",
    )

    # The response only contains the continuation, so prepend the prefix.
    print(json_prefix + response.responses[0].message.content)

This will output:

    [
      {
        "planetName": "Mercury",
        "composition": "rock, metal"
      },
      {
        "planetName": "Venus",
        "composition": "rock, metal"
      },
      {
        "planetName": "Earth",
        "composition": "rock, metal"
      },
      {
        "planetName": "Mars",
        "composition": "rock, metal"
      },
      {
        "planetName": "Jupiter",
        "composition": "hydrogen, helium"
      },
      {
        "planetName": "Saturn",
        "composition": "hydrogen, helium"
      },
      {
        "planetName": "Uranus",
        "composition": "water, ammonia, methane"
      },
      {
        "planetName": "Neptune",
        "composition": "water, ammonia, methane"
      }
    ]
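
Because the response contains only the continuation of the partial assistant message, the prefix has to be joined back on before the text can be parsed. The following is a minimal sketch of that step using the standard json module. `parse_prefixed_json` is a hypothetical helper, and the prefix and completion strings are stand-ins for real values.

```python
import json
from typing import Any, List

FENCE = "`" * 3  # the Markdown code fence used in the stop sequence


def parse_prefixed_json(prefix: str, completion: str) -> List[Any]:
    """Join the prefix with the completion and parse the result as JSON."""
    text = prefix + completion
    # Assumption: the stop sequence may or may not appear in the returned
    # text, so drop everything from a closing fence onward if one is present.
    text = text.split(FENCE, 1)[0]
    return json.loads(text)


# Stand-in values; a real completion would come from
# response.responses[0].message.content.
json_prefix = '[\n  {\n    "planetName":'
completion = ' "Mercury",\n    "composition": "rock, metal"\n  }\n]\n'
planets = parse_prefixed_json(json_prefix, completion)
print(planets[0]["planetName"])  # Mercury
```

If the model's output is cut off before the list is closed, json.loads raises json.JSONDecodeError, which is a useful signal to raise max_tokens.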

Useful Parameters

The parameters of the chat API are fully documented in the API reference, but some particularly useful parameters are listed below:

  • temperature: Typically between 0 and 1. Values close to 0 will result in less varied generations, and higher values will result in more variation and creativity.
  • max_tokens: The maximum number of tokens that should be returned. Increase this if generations are being truncated, i.e. the finish_reason in the response is "length".
  • stop: A list of strings that should stop the generation. This can be used, for example, to stop after a generated code block or on reaching a certain number in a list.
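
The max_tokens advice above can be turned into a simple retry loop: if the finish reason is "length", ask again with a larger budget. The sketch below is a minimal illustration, not SDK code. `generate` is a hypothetical stand-in for a function that wraps the API call and returns the text with its finish reason, and the value "stop" used by the fake model is only a placeholder for any reason other than "length".

```python
from typing import Callable, Tuple

# Hypothetical stand-in for an API call: takes max_tokens and returns
# (text, finish_reason).
Generate = Callable[[int], Tuple[str, str]]


def generate_untruncated(
    generate: Generate, max_tokens: int = 512, ceiling: int = 4096
) -> str:
    """Double max_tokens until the output is not truncated or the ceiling is hit."""
    while True:
        text, finish_reason = generate(max_tokens)
        if finish_reason != "length" or max_tokens >= ceiling:
            return text
        max_tokens = min(max_tokens * 2, ceiling)


def fake_generate(max_tokens: int) -> Tuple[str, str]:
    """Pretend model whose full answer needs 1500 tokens (one char per token)."""
    full = "x" * 1500
    if max_tokens < len(full):
        return full[:max_tokens], "length"
    return full, "stop"  # placeholder finish reason


print(len(generate_untruncated(fake_generate)))  # 1500
```

The ceiling keeps the loop from growing the budget without bound when a generation never finishes on its own.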


Streaming

The chat API supports streaming through chat.create_stream in the Python SDK, or by setting stream to true in the HTTP API. Below is an example of streaming in Python:

    from reka import ChatMessage
    from reka.client import Reka

    client = Reka(api_key="YOUR_API_KEY")

    response = client.chat.create_stream(
        messages=[
            ChatMessage(
                content="Write a detailed template NDA contract between two parties.",
                role="user",
            )
        ],
        max_tokens=2048,
        model="reka-core-20240501",
    )

    # Each chunk exposes the streamed text under responses[0].chunk.content.
    for chunk in response:
        print(chunk.responses[0].chunk.content)
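
Code that consumes a stream can be exercised without calling the API by feeding it objects with the same attribute path as the chunks above. The sketch below does this with SimpleNamespace stand-ins. `collect_chunk_contents` and `fake_chunk` are hypothetical helpers, and since this page does not state whether each chunk carries only new text or the text so far, the sketch simply collects whatever each chunk contains.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collect_chunk_contents(stream: Iterable) -> List[str]:
    """Read chunk.responses[0].chunk.content from every streamed chunk."""
    return [chunk.responses[0].chunk.content for chunk in stream]


def fake_chunk(text: str) -> SimpleNamespace:
    """Build a stand-in object with the same attribute path as a real chunk."""
    inner = SimpleNamespace(content=text)
    return SimpleNamespace(responses=[SimpleNamespace(chunk=inner)])


contents = collect_chunk_contents(fake_chunk(t) for t in ["first", "second"])
print(contents)  # ['first', 'second']
```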


Async Client

The Python SDK also exports an async client, AsyncReka, so that you can make non-blocking calls to our API. This is useful for batching requests. The following code batches calls to the API by creating a list of async tasks and gathering them with asyncio.gather. The Semaphore limits the number of concurrent requests to the API.

    import asyncio

    from reka.client import AsyncReka

    client = AsyncReka(api_key="YOUR_API_KEY")
    max_concurrent_requests = 2
    semaphore = asyncio.Semaphore(max_concurrent_requests)


    async def respond(prompt: str) -> str:
        # Only max_concurrent_requests coroutines can hold the semaphore at once.
        async with semaphore:
            response = await client.chat.create(
                messages=[
                    {
                        "content": prompt,
                        "role": "user",
                    }
                ],
                model="reka-flash",
            )
        return response.responses[0].message.content


    async def main():
        prompts = [
            "What is your name?",
            "What is the fifth prime number?",
            "Write a python one-liner to flatten a list of lists.",
        ]
        # gather returns the results in the same order as the prompts.
        responses = await asyncio.gather(*[respond(prompt) for prompt in prompts])
        return responses
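
To see how the semaphore bounds concurrency without calling the API, the same pattern can be run against a stand-in coroutine. The sketch below is a self-contained illustration: `fake_respond` replaces the real request with a short sleep, and `run_limited` is a hypothetical name. It also records the peak number of requests in flight so the limit can be checked.

```python
import asyncio
from typing import List


async def run_limited(prompts: List[str], limit: int = 2) -> List[str]:
    """Run one fake request per prompt with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_respond(prompt: str) -> str:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the network round trip
            in_flight -= 1
        return prompt.upper()

    results = await asyncio.gather(*[fake_respond(p) for p in prompts])
    assert peak <= limit  # the semaphore kept concurrency within the limit
    return list(results)


print(asyncio.run(run_limited(["a", "b", "c"])))  # ['A', 'B', 'C']
```

Creating the semaphore inside the coroutine, rather than at module level, ties it to the event loop that asyncio.run starts, which avoids loop-mismatch errors on older Python versions.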