Speech Translation

The Speech API provides speech translation capabilities that convert audio from one language to another while preserving meaning and context. This is useful for multilingual content, international communication, and accessibility.

Translate Speech

Send an audio clip and a target language. The response contains both the transcript in the source language and its translation.

Endpoint

POST /v1/transcription_or_translation

Request

Bash

curl -X POST https://api.reka.ai/v1/transcription_or_translation \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "data:audio/wav;base64,<your_base64_encoded_audio>",
    "sampling_rate": 16000,
    "target_language": "chinese",
    "parallel_mode": true,
    "temperature": 0.0,
    "max_tokens": 1024,
    "is_translate": true
  }'

Python

import base64
import io

import httpx
import librosa
import soundfile

REKA_API_KEY = "YOUR_API_KEY"
SAMPLING_RATE = 16_000

# Prepare audio: load the file, resample it to SAMPLING_RATE,
# and re-encode it as an in-memory WAV file.
with soundfile.SoundFile("/path/to/audio.wav") as sound_file:
    waveform, _ = librosa.load(
        sound_file,
        sr=SAMPLING_RATE,
    )
    cache = io.BytesIO()
    soundfile.write(cache, waveform, SAMPLING_RATE, format="WAV")
    cache.seek(0)
    audio_in_base64 = base64.b64encode(cache.read()).decode("ascii")

# Wrap the base64 payload in a data URI, as expected by audio_url.
audio_url = f"data:audio/wav;base64,{audio_in_base64}"

# Make request
with httpx.Client(timeout=180, follow_redirects=True) as client:
    response = client.post(
        "https://api.reka.ai/v1/transcription_or_translation",
        json={
            "audio_url": audio_url,
            "sampling_rate": SAMPLING_RATE,
            "target_language": "chinese",
            "temperature": 0.0,
            "max_tokens": 1024,
            "is_translate": True,
        },
        headers={
            "X-Api-Key": REKA_API_KEY,
        },
    )
    # Fail early on HTTP errors instead of parsing an error body.
    response.raise_for_status()
    result = response.json()
    print("Original transcript:", result["transcript"])
    print("Translation:", result["translation"])

Parameters

audio_url (required): URL of the audio file, or the base64-encoded audio as a data URI
sampling_rate (required): Audio sampling rate in Hz (recommended: 16000)
target_language (required): Target language for translation. Supported: "french", "spanish", "japanese", "chinese", "korean", "italian", "portuguese", "german"
temperature (optional): Controls randomness in generation. Use 0.0 for deterministic output. Default: 0.0
max_tokens (optional): Maximum number of tokens to generate. Default: 1024
is_translate (required): Set to true to request a translation
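The parameters above can be assembled into a request body with a small helper. This is a minimal sketch, not part of the API: the function name build_translation_payload is hypothetical, and it assumes the audio is already WAV-encoded bytes. The defaults mirror the values documented above.

```python
import base64


def build_translation_payload(
    audio_bytes: bytes,
    target_language: str,
    sampling_rate: int = 16000,
    temperature: float = 0.0,
    max_tokens: int = 1024,
) -> dict:
    """Build the JSON body for a translation request (hypothetical helper)."""
    # Encode the raw WAV bytes and wrap them in a data URI for audio_url.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "audio_url": f"data:audio/wav;base64,{encoded}",
        "sampling_rate": sampling_rate,
        "target_language": target_language,
        "temperature": temperature,
        "max_tokens": max_tokens,
        # Documented as required for translation requests.
        "is_translate": True,
    }
```

The returned dictionary can be passed as the json argument of an HTTP client call, as in the Python request example.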

Response

Returns a translation result with:

{
  "transcript": "Original transcribed text in source language",
  "translation": "Translated text in target language"
}

transcript: Transcribed text in the original language
translation: Translated text in the target language
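When consuming the response, it can help to check that both fields are present before using them. The sketch below assumes the parsed JSON body is a dictionary shaped like the example above. The function name extract_translation is hypothetical, and the error handling is an illustration rather than documented API behavior.

```python
from typing import Tuple


def extract_translation(result: dict) -> Tuple[str, str]:
    """Return (transcript, translation) from a parsed response body."""
    # Collect any documented field that is absent from the response.
    missing = [key for key in ("transcript", "translation") if key not in result]
    if missing:
        raise ValueError(f"Response is missing fields: {', '.join(missing)}")
    return result["transcript"], result["translation"]
```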

Supported Languages

The Speech API supports translation between English and the following languages:

French ("french")
Spanish ("spanish")
Japanese ("japanese")
Chinese ("chinese")
Korean ("korean")
Italian ("italian")
Portuguese ("portuguese")
German ("german")
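Because target_language must be one of the identifiers listed above, a client may want to validate it locally before sending a request. The following is a minimal sketch: the names SUPPORTED_TARGET_LANGUAGES and normalize_target_language are hypothetical, and lowercasing the input is a client-side convenience, not something the API is documented to do.

```python
# Identifiers copied from the supported languages list above.
SUPPORTED_TARGET_LANGUAGES = frozenset(
    {
        "french",
        "spanish",
        "japanese",
        "chinese",
        "korean",
        "italian",
        "portuguese",
        "german",
    }
)


def normalize_target_language(name: str) -> str:
    """Return the lowercase identifier, or raise if it is not supported."""
    key = name.strip().lower()
    if key not in SUPPORTED_TARGET_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_TARGET_LANGUAGES))
        raise ValueError(f"Unsupported target language {name!r}; expected one of: {supported}")
    return key
```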

Use Cases

Multilingual Content: Translate videos, podcasts, or audio content for international audiences
International Communication: Enable real-time communication across language barriers
Accessibility: Make content accessible to speakers of different languages
Localization: Adapt audio content for different markets

Performance Tips

Set temperature to 0.0 for consistent, deterministic translations
Use the recommended sampling rate of 16000 Hz for best results
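To confirm that a WAV file already uses the recommended rate before sending it, the header can be inspected with Python's standard wave module. This is a sketch under the assumption that the audio is PCM-encoded WAV, which is the only encoding the wave module reads. The function name wav_sampling_rate is hypothetical.

```python
import io
import wave


def wav_sampling_rate(wav_bytes: bytes) -> int:
    """Read the sampling rate in Hz from the header of a PCM WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getframerate()
```

The returned value can be passed as sampling_rate, or compared against 16000 to decide whether the audio should be resampled first.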