Speech API Overview
The Reka Speech API transcribes audio, translates speech across languages, and generates translated audio output.
Key Features
- Audio Transcription: Convert speech to text with word-level timestamps
- Speech Translation: Translate audio from one language to another with high accuracy
- Speech-to-Speech Translation: Generate translated audio output that preserves the original speaking style
- Multi-language Support: English, French, Spanish, Japanese, Chinese, Korean, Italian, Portuguese, and German
Getting Started
- Encode to base64 or host the audio file and provide a URL
- Transcribe audio using the /v1/transcription_or_translation endpoint
- Translate speech by adding target_language parameter
- Generate translated audio by setting return_translation_audio: true
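The steps above can be sketched as a small helper that assembles the request body. This is a minimal sketch, not the official client: the function name build_request_body is hypothetical, and it assumes that audio_url, target_language, and return_translation_audio are top-level JSON fields. The accepted values for target_language are not specified here, so the value used below is only illustrative.

```python
import json

# Endpoint path taken from the steps above; the base URL is not shown here.
ENDPOINT_PATH = "/v1/transcription_or_translation"


def build_request_body(audio_url, target_language=None, return_translation_audio=False):
    """Assemble a request body (hypothetical helper, field layout assumed)."""
    body = {"audio_url": audio_url}
    # Adding target_language switches the request from transcription to translation.
    if target_language is not None:
        body["target_language"] = target_language
    # Only include the flag when translated audio is wanted.
    if return_translation_audio:
        body["return_translation_audio"] = True
    return body


if __name__ == "__main__":
    # "french" is an illustrative value; check the API reference for the exact format.
    body = build_request_body(
        "https://example.com/sample.wav",
        target_language="french",
        return_translation_audio=True,
    )
    print(json.dumps(body, indent=2))
```

Omitting target_language produces a transcription-only body that contains just the audio_url field.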
Authentication
All Speech API requests require authentication using an API key in the X-Api-Key header:
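As a minimal sketch, the headers for a request might be built as follows. The X-Api-Key header name comes from the text above; the environment variable name REKA_API_KEY, the helper name build_auth_headers, and the JSON Content-Type header are assumptions made for illustration.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key (hypothetical helper)."""
    # REKA_API_KEY is an assumed environment variable name, not an official one.
    key = api_key or os.environ.get("REKA_API_KEY")
    if not key:
        raise ValueError("No API key provided and REKA_API_KEY is not set")
    return {
        "X-Api-Key": key,
        # Assumes the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly is convenient in tests.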
Base URL
Supported Languages
The Speech API supports translation between English and the following languages:
- French
- Spanish
- Japanese
- Chinese
- Korean
- Italian
- Portuguese
- German
Audio Format Requirements
- Format: WAV
- Sampling Rate: 16,000 Hz (recommended)
- Encoding: Base64 or URL to hosted audio file
- Input Methods:
  - audio_url: URL to audio file (http/https or data URI)
  - Base64-encoded audio as data URI: data:audio/wav;base64,<base64_string>
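To check a file against the requirements above before uploading, the Python standard library's wave module can read the WAV header. This is a sketch under the assumption that the file is an uncompressed PCM WAV; the helper name inspect_wav is hypothetical, and the 16,000 Hz figure is the recommended rate stated above, not a hard limit.

```python
import wave

RECOMMENDED_RATE_HZ = 16_000  # recommended sampling rate from the list above


def inspect_wav(path):
    """Read basic properties from a WAV header (hypothetical helper)."""
    # wave.open raises wave.Error if the file is not a readable PCM WAV.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / rate,
            "matches_recommended_rate": rate == RECOMMENDED_RATE_HZ,
        }
```

If matches_recommended_rate is False, resampling the file to 16,000 Hz with an audio tool of your choice before sending it is a reasonable precaution.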
Models
The Speech API supports different models for various use cases:
- reka-tiny-asr: Fast and efficient model for transcription
- reka-spark: Advanced model for translation with high accuracy
Example: Prepare Audio
Before calling the API, you need to prepare your audio file:
Python
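One way to prepare a local file is to base64-encode it into the data URI form described under Audio Format Requirements. The following is a minimal sketch, not the original example from this page; the function name wav_to_data_uri and the file name sample.wav are hypothetical.

```python
import base64


def wav_to_data_uri(path):
    """Encode a local WAV file as a data URI (hypothetical helper)."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    # Format described above: data:audio/wav;base64,<base64_string>
    return f"data:audio/wav;base64,{encoded}"


if __name__ == "__main__":
    # "sample.wav" is a placeholder path; replace it with your own file.
    audio_url = wav_to_data_uri("sample.wav")
    print(audio_url[:60] + "...")
```

The resulting string can be passed as the audio_url value. For large files, hosting the audio and passing an http/https URL avoids sending the full base64 payload in the request.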
Rate Limits
Please contact us for information about rate limits and pricing for the Speech API.