Speech Translation
The Speech API provides speech translation capabilities that convert audio from one language to another while preserving meaning and context. This is useful for multilingual content, international communication, and accessibility.
Translate Speech
Translate speech in an audio file into a target language. The response includes both the source-language transcript and the translation.
Endpoint
- POST /v1/transcription_or_translation
Request
Parameters
- audio_url (required): URL of the audio file, or base64-encoded audio as a data URI
- sampling_rate (required): Audio sampling rate in Hz (recommended: 16000)
- target_language (required): Target language for the translation. Supported values: "french", "spanish", "japanese", "chinese", "korean", "italian", "portuguese", "german"
- temperature (optional): Controls randomness in generation. Use 0.0 for deterministic output. Default: 0.0
- max_tokens (optional): Maximum number of tokens to generate. Default: 1024
- is_translate (required): Set to true to indicate a translation request
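The parameters above can be combined into a request roughly as follows. This is a minimal sketch, not an official client: the base URL, the JSON request body, and the Bearer-token header are assumptions that this page does not confirm, and the helper names (audio_to_data_uri, build_translation_payload, translate_speech) are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: placeholder host; this page only documents the path.
BASE_URL = "https://api.example.com"
ENDPOINT = "/v1/transcription_or_translation"


def audio_to_data_uri(audio_bytes: bytes, mime_type: str = "audio/wav") -> str:
    """Encode raw audio bytes as a base64 data URI for the audio_url field."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_translation_payload(
    audio_url: str,
    target_language: str,
    sampling_rate: int = 16000,
    temperature: float = 0.0,
    max_tokens: int = 1024,
) -> dict:
    """Collect the documented parameters into one request body."""
    return {
        "audio_url": audio_url,
        "sampling_rate": sampling_rate,
        "target_language": target_language,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "is_translate": True,  # marks the request as a translation request
    }


def translate_speech(payload: dict, api_key: str) -> dict:
    """POST the payload and return the decoded JSON response.

    Assumption: the API accepts a JSON body and a Bearer token.
    """
    request = urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For a local file, read the bytes and pass audio_to_data_uri(...) as audio_url instead of a remote URL.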
Response
Returns a translation result with:
- transcript: Transcribed text in the original language
- translation: Translated text in the target language
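The two fields above could be read from a response as in the sketch below. It assumes the response body is a JSON object with transcript and translation as top-level keys, which this page does not spell out. TranslationResult and parse_translation_response are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class TranslationResult:
    transcript: str   # text in the original language
    translation: str  # text in the target language


def parse_translation_response(body: str) -> TranslationResult:
    """Parse a JSON response body and check that both fields are present."""
    data = json.loads(body)
    missing = [key for key in ("transcript", "translation") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return TranslationResult(
        transcript=data["transcript"],
        translation=data["translation"],
    )
```

Failing loudly on a missing field makes an unexpected response shape visible immediately instead of surfacing later as an empty translation.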
Supported Languages
The Speech API supports translation between English and the following languages:
- French ("french")
- Spanish ("spanish")
- Japanese ("japanese")
- Chinese ("chinese")
- Korean ("korean")
- Italian ("italian")
- Portuguese ("portuguese")
- German ("german")
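Since target_language accepts only the identifiers listed above, it can help to validate the value on the client before sending a request. A minimal sketch follows. The helper name is hypothetical, and lowercasing the input is a client-side convenience rather than documented API behavior.

```python
# Identifiers taken from the supported-languages list above.
SUPPORTED_TARGET_LANGUAGES = frozenset({
    "french", "spanish", "japanese", "chinese",
    "korean", "italian", "portuguese", "german",
})


def normalize_target_language(name: str) -> str:
    """Return the lowercase identifier, or raise if it is not supported."""
    value = name.strip().lower()
    if value not in SUPPORTED_TARGET_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_TARGET_LANGUAGES))
        raise ValueError(
            f"unsupported target language {name!r}; expected one of: {supported}"
        )
    return value
```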
Use Cases
- Multilingual Content: Translate videos, podcasts, or audio content for international audiences
- International Communication: Enable real-time communication across language barriers
- Accessibility: Make content accessible to speakers of different languages
- Localization: Adapt audio content for different markets
Performance Tips
- Set temperature: 0.0 for consistent, deterministic translations
- Use the recommended sampling rate of 16000 Hz for best results
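To check whether a WAV file already uses the recommended rate, the sampling rate can be read from the file header with Python's standard wave module. This sketch assumes uncompressed PCM WAV input and assumes that sampling_rate should match the file's actual rate. The function name is hypothetical.

```python
import wave

RECOMMENDED_SAMPLING_RATE = 16000  # Hz, as recommended above


def wav_sampling_rate(source) -> int:
    """Return the sampling rate in Hz of a WAV file.

    source may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav_file:
        return wav_file.getframerate()


def uses_recommended_rate(source) -> bool:
    """True if the WAV file is sampled at the recommended 16000 Hz."""
    return wav_sampling_rate(source) == RECOMMENDED_SAMPLING_RATE
```

If the rate differs, either resample the audio with an audio tool of your choice or pass the actual rate as sampling_rate.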