Speech API

The Reka Speech API transcribes audio, translates speech from one language to another, and can return the translation as generated audio.

Key Features

Audio Transcription: Convert speech to text with word-level timestamps
Speech Translation: Translate audio from one language to another with high accuracy
Speech-to-Speech Translation: Generate translated audio output that preserves the original speaking style
Multi-language Support: English, French, Spanish, Japanese, Chinese, Korean, Italian, Portuguese, and German

Getting Started

Prepare the audio: encode the WAV file as a base64 data URI, or host the file and provide its URL
Transcribe audio using the /v1/transcription_or_translation endpoint
Translate speech by adding the target_language parameter
Generate translated audio by setting return_translation_audio: true
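
The steps above can be sketched as three variants of one request body. This is a minimal illustration, not the official client: it assumes the endpoint accepts a JSON object keyed by the parameter names used in this guide (audio_url, target_language, return_translation_audio). The helper name build_speech_payload, the "french" value format, and the rule that translated audio requires a target language are assumptions made for this example.

```python
import json
from typing import Optional


def build_speech_payload(
    audio_url: str,
    target_language: Optional[str] = None,
    return_translation_audio: bool = False,
) -> dict:
    """Assemble a request body for /v1/transcription_or_translation.

    Assumption: the endpoint takes a JSON object keyed by the parameter
    names mentioned in this guide. The full schema may contain more fields.
    """
    # Assumption: translated audio only makes sense with a target language.
    if return_translation_audio and target_language is None:
        raise ValueError("return_translation_audio requires target_language")

    payload = {"audio_url": audio_url}
    if target_language is not None:
        payload["target_language"] = target_language
    if return_translation_audio:
        payload["return_translation_audio"] = True
    return payload


audio = "https://example.com/audio.wav"  # placeholder URL

# Step 2: transcription only.
transcribe_body = build_speech_payload(audio)
# Step 3: translation ("french" is an assumed value format).
translate_body = build_speech_payload(audio, target_language="french")
# Step 4: translation plus generated audio.
speech_to_speech_body = build_speech_payload(
    audio, target_language="french", return_translation_audio=True
)

print(json.dumps(speech_to_speech_body, indent=2))
```

Keeping the optional keys out of the body unless they are set means the transcription-only request carries nothing but audio_url.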

Authentication

All Speech API requests require authentication using an API key in the X-Api-Key header:

X-Api-Key: YOUR_API_KEY

Base URL

https://api.reka.ai
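
As a rough sketch of how the header and base URL fit together, the snippet below builds an authenticated request with Python's standard library without sending it. It assumes the endpoint is called with POST and a JSON body, which this page does not state explicitly. The environment variable name REKA_API_KEY and the helper build_request are hypothetical.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.reka.ai"
ENDPOINT = "/v1/transcription_or_translation"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Create (but do not send) an authenticated Speech API request.

    Assumption: the endpoint expects POST with a JSON-encoded body.
    """
    return urllib.request.Request(
        url=BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Api-Key": api_key,  # authentication header from this guide
            "Content-Type": "application/json",
        },
        method="POST",
    )


# REKA_API_KEY is a hypothetical environment variable name.
api_key = os.environ.get("REKA_API_KEY", "YOUR_API_KEY")
request = build_request(api_key, {"audio_url": "https://example.com/audio.wav"})
print(request.get_method(), request.full_url)

# To send the request once a real key is configured:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Reading the key from the environment keeps it out of source control; the literal YOUR_API_KEY fallback is only a placeholder.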

Supported Languages

The Speech API supports translation between English and the following languages:

French
Spanish
Japanese
Chinese
Korean
Italian
Portuguese
German

Audio Format Requirements

Format: WAV
Sampling Rate: 16,000 Hz (recommended)
Encoding: Base64 data URI, or URL to a hosted audio file
Input Methods (both passed in audio_url):
- URL to a hosted audio file (http/https)
- Base64-encoded audio as a data URI: data:audio/wav;base64,<base64_string>
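
If the audio is already a 16 kHz WAV file, the data URI can be built with the standard library alone. The sketch below reads the sampling rate from the WAV header and then base64-encodes the bytes. The helper name wav_bytes_to_data_uri is hypothetical, and rejecting other sampling rates is a choice made for this example: the requirements above only recommend 16,000 Hz.

```python
import base64
import io
import wave


def wav_bytes_to_data_uri(wav_bytes: bytes, expected_rate: int = 16_000) -> str:
    """Check the WAV header, then wrap the bytes in a base64 data URI."""
    # wave.open raises wave.Error if the bytes are not a WAV file it can parse.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        rate = wav_file.getframerate()
    # Stricter than the API: the recommended rate is treated as required here.
    if rate != expected_rate:
        raise ValueError(f"expected {expected_rate} Hz audio, got {rate} Hz")
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return f"data:audio/wav;base64,{encoded}"


# Build one second of 16-bit mono silence at 16 kHz as a stand-in for real audio.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    wav_file.setframerate(16_000)
    wav_file.writeframes(b"\x00\x00" * 16_000)

audio_url = wav_bytes_to_data_uri(buffer.getvalue())
print(audio_url[:40])
```

For a file on disk, pass the result of reading it in binary mode, for example open(path, "rb").read(). Audio at a different sampling rate needs resampling first.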

Models

The Speech API offers two models, each suited to a different task:

reka-tiny-asr: Fast and efficient model for transcription
reka-spark: Advanced model for translation with high accuracy

Example: Prepare Audio

Before calling the API, resample your audio file to 16 kHz and encode it as a base64 data URI:

Python

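import base64
import io

import librosa
import soundfile

# Sampling rate recommended by the Speech API.
SAMPLING_RATE = 16_000

with soundfile.SoundFile("/path/to/audio.wav") as sound_file:
    # Load the audio and resample it to 16 kHz (librosa downmixes to mono by default).
    waveform, _ = librosa.load(
        sound_file,
        sr=SAMPLING_RATE,
    )
    # Write the resampled audio to an in-memory WAV file.
    cache = io.BytesIO()
    soundfile.write(cache, waveform, SAMPLING_RATE, format="WAV")
    # Rewind the buffer and base64-encode its contents.
    cache.seek(0)
    audio_in_base64 = base64.b64encode(cache.read()).decode("ascii")

# Wrap the encoded audio in a data URI to pass as audio_url.
audio_url = f"data:audio/wav;base64,{audio_in_base64}"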

Rate Limits

Please contact us for information about rate limits and pricing for the Speech API.