Speech API Overview
The Reka Speech API transcribes audio, translates speech across languages, and generates translated audio output.
Key Features
- Audio Transcription: Convert speech to text with word-level timestamps
- Speech Translation: Translate audio from one language to another with high accuracy
- Speech-to-Speech Translation: Generate translated audio output that preserves the original speaking style
- Multi-language Support: English, French, Spanish, Japanese, Chinese, Korean, Italian, Portuguese, and German
Getting Started
- Prepare the audio file: encode it to base64, or host it and provide a URL
- Transcribe audio using the /v1/transcription_or_translation endpoint
- Translate speech by adding the target_language parameter
- Generate translated audio by setting return_translation_audio: true
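The steps above can be sketched as a small helper that assembles the request body. This is a minimal sketch, not the official client: only audio_url, target_language, and return_translation_audio come from this page. The JSON body shape, the language value format ("french"), the function name, and the check that translated audio needs a target language are assumptions.

```python
import json

# Endpoint path as given on this page; the base URL is not assumed here.
ENDPOINT_PATH = "/v1/transcription_or_translation"


def build_speech_request(audio_url, target_language=None,
                         return_translation_audio=False):
    """Build a request body for transcription or translation (hypothetical helper)."""
    # Assumption: translated audio only makes sense when a target language is set.
    if return_translation_audio and target_language is None:
        raise ValueError("return_translation_audio requires target_language")

    body = {"audio_url": audio_url}
    if target_language is not None:
        # Assumption: the exact value format for languages is not documented here.
        body["target_language"] = target_language
    if return_translation_audio:
        body["return_translation_audio"] = True
    return body


# Transcription only, then translation with translated audio output.
transcribe_body = build_speech_request("https://example.com/sample.wav")
translate_body = build_speech_request(
    "https://example.com/sample.wav",
    target_language="french",
    return_translation_audio=True,
)
print(json.dumps(translate_body, indent=2))
```

Omitting target_language yields a transcription-only body, which mirrors the progression of the steps listed above.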
Authentication
All Speech API requests require authentication using an API key in the X-Api-Key header:
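A minimal sketch of building the request headers in Python. Only the X-Api-Key header name comes from this page; the REKA_API_KEY environment variable, the helper name, and the JSON Content-Type are assumptions for illustration.

```python
import os


def auth_headers(api_key=None):
    """Return HTTP headers that carry the API key in X-Api-Key (hypothetical helper)."""
    # Assumption: the key is read from an environment variable named REKA_API_KEY
    # when it is not passed explicitly.
    key = api_key or os.environ.get("REKA_API_KEY")
    if not key:
        raise ValueError("no API key provided")
    return {
        "X-Api-Key": key,
        # Assumption: request bodies are sent as JSON.
        "Content-Type": "application/json",
    }


headers = auth_headers("your-api-key")
```

Reading the key from the environment keeps it out of source control; pass it explicitly only in short-lived scripts.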
Base URL
Supported Languages
The Speech API supports translation between English and the following languages:
- French
- Spanish
- Japanese
- Chinese
- Korean
- Italian
- Portuguese
- German
Audio Format Requirements
- Format: WAV
- Sampling Rate: 16,000 Hz (recommended)
- Encoding: Base64 or URL to hosted audio file
- Input Methods:
  - audio_url: URL to audio file (http/https or data URI)
  - Base64-encoded audio as data URI: data:audio/wav;base64,<base64_string>
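To check a file against the requirements above before uploading, a sketch like the following reads the WAV header with Python's standard wave module. The function names are illustrative, and the wave module only handles uncompressed PCM WAV files, so other WAV variants would raise an error here.

```python
import wave

# Recommended sampling rate stated in the requirements above.
RECOMMENDED_RATE_HZ = 16_000


def inspect_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def matches_recommended_rate(path):
    """True if the file uses the recommended 16,000 Hz sampling rate."""
    rate, _, _ = inspect_wav(path)
    return rate == RECOMMENDED_RATE_HZ
```

Since 16,000 Hz is listed as recommended rather than required, a mismatch is better treated as a warning than as a hard failure.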
 
Models
The Speech API offers two models:
- reka-tiny-asr: Fast and efficient model for transcription
- reka-spark: Advanced model for translation with high accuracy
Example: Prepare Audio
Before calling the API, you need to prepare your audio file:
Python
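A minimal sketch of this preparation step, assuming a local WAV file: it base64-encodes the file contents and wraps them in the data URI form described under Audio Format Requirements. The function names and the sample.wav path are illustrative.

```python
import base64


def wav_bytes_to_data_uri(wav_bytes):
    """Encode raw WAV bytes as a data:audio/wav;base64,... URI."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return f"data:audio/wav;base64,{encoded}"


def wav_file_to_data_uri(path):
    """Read a WAV file from disk and return it as a data URI."""
    with open(path, "rb") as f:
        return wav_bytes_to_data_uri(f.read())


# Usage (hypothetical path): pass the result as the audio_url value.
# audio_url = wav_file_to_data_uri("sample.wav")
```

Base64 increases the payload size by roughly one third, so hosting large files and passing an http/https URL may be preferable.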
Rate Limits
Please contact us for information about rate limits and pricing for the Speech API.