Speech-to-Speech Translation
The Speech API provides speech-to-speech translation capabilities that not only translate the content but also generate audio output in the target language. This allows you to create fully translated audio content while preserving the natural flow of speech.
Translate audio from one language to another and receive both text and audio output.
POST /v1/transcription_or_translation

audio_url (required): URL to the audio file, or base64-encoded audio as a data URI
sampling_rate (required): Audio sampling rate in Hz (recommended: 16000)
target_language (required): Target language for translation. Supported: "french", "spanish", "japanese", "chinese", "korean", "italian", "portuguese", "german"
return_translation_audio (required): Set to true to receive translated audio output
temperature (optional): Controls randomness in generation. Use 0.0 for deterministic output. Default: 0.0
max_tokens (optional): Maximum number of tokens to generate. Default: 1024
is_translate (required): Set to true to indicate a translation request

Returns a translation result with audio:
transcript: Transcribed text in the original language
translation: Translated text in the target language
audio_base64: Base64-encoded WAV audio of the translated speech

temperature: 0.0 for consistent, deterministic output
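
The request and response described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client. It assumes the endpoint accepts a JSON body and a Bearer token in the Authorization header, neither of which is stated in this section. The base URL, the API key, and the helper names (build_translation_payload, decode_translation_response, translate_speech) are hypothetical.

```python
import base64
import json
import urllib.request

# Target languages listed in the parameter reference above.
SUPPORTED_LANGUAGES = {
    "french", "spanish", "japanese", "chinese",
    "korean", "italian", "portuguese", "german",
}


def build_translation_payload(audio_url, target_language, sampling_rate=16000,
                              temperature=0.0, max_tokens=1024):
    """Assemble the documented request fields for speech-to-speech translation."""
    if target_language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported target_language: {target_language!r}")
    return {
        "audio_url": audio_url,            # URL or base64 data URI
        "sampling_rate": sampling_rate,    # Hz; 16000 is the recommended value
        "target_language": target_language,
        "return_translation_audio": True,  # request translated audio output
        "is_translate": True,              # mark this as a translation request
        "temperature": temperature,        # 0.0 for deterministic output
        "max_tokens": max_tokens,
    }


def decode_translation_response(result):
    """Extract the documented response fields and decode the WAV audio to bytes."""
    return {
        "transcript": result["transcript"],
        "translation": result["translation"],
        "audio_wav": base64.b64decode(result["audio_base64"]),
    }


def translate_speech(base_url, api_key, payload):
    """POST the payload. JSON body and Bearer auth are assumptions, not documented here."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/transcription_or_translation",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return decode_translation_response(json.load(response))
```

Under these assumptions, a call would look like translate_speech(base_url, api_key, build_translation_payload("https://example.com/speech.wav", "french")), where the audio URL is a placeholder. The returned audio_wav bytes can be written directly to a .wav file opened in binary mode.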