Transcribe or Translate
Transcribe audio to text, translate speech to another language, or generate translated audio output.
This endpoint supports multiple modes:
- Transcription only: Convert speech to text with optional word-level timestamps
- Translation: Translate audio from one language to another
- Speech-to-speech translation: Generate translated audio output
Authentication
X-Api-Key (string)
API Key authentication via header
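As a minimal sketch of the header-based authentication described above, the helper below builds the request headers. The function name `build_headers` is hypothetical, and the `Content-Type` value is an assumption based on the endpoint expecting an object; only the `X-Api-Key` header name comes from this page.

```python
def build_headers(api_key: str) -> dict:
    """Return HTTP headers that carry the API key in the X-Api-Key header."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "X-Api-Key": api_key,
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```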
Request
This endpoint expects an object.
audio_url
URL of the audio file, or base64-encoded audio as a data URI (data:audio/wav;base64,…)
sampling_rate
Audio sampling rate in Hz
target_language
Target language for translation
is_translate
Set to true to indicate a translation request
return_translation_audio
If true, returns base64-encoded audio of the translated speech
temperature
Controls randomness in generation. Use 0.0 for deterministic output
max_tokens
Maximum number of tokens to generate
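The request fields above can be assembled as in the following sketch. The helper names `audio_bytes_to_data_uri` and `build_request_body` are hypothetical, and this page does not state which fields are required or what form `target_language` takes, so the example treats everything except `audio_url` as optional and uses `"es"` purely as a placeholder value. Sending the body (endpoint URL, HTTP method) is left out because the page does not specify it.

```python
import base64
import json


def audio_bytes_to_data_uri(wav_bytes: bytes) -> str:
    """Encode raw WAV bytes as a data URI suitable for the audio_url field."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return f"data:audio/wav;base64,{encoded}"


def build_request_body(
    audio_url: str,
    *,
    sampling_rate=None,
    target_language=None,
    is_translate=None,
    return_translation_audio=None,
    temperature=None,
    max_tokens=None,
) -> dict:
    """Build the request object, omitting any field the caller did not set."""
    optional = {
        "sampling_rate": sampling_rate,
        "target_language": target_language,
        "is_translate": is_translate,
        "return_translation_audio": return_translation_audio,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    body = {"audio_url": audio_url}
    # Only include fields that were explicitly provided.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


# Example: a translation request with deterministic generation.
body = build_request_body(
    audio_bytes_to_data_uri(b"abc"),  # placeholder bytes, not real audio
    sampling_rate=16000,
    target_language="es",  # assumption: the accepted value format is not documented here
    is_translate=True,
    temperature=0.0,
)
print(json.dumps(body, indent=2))
```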
Response
Successful transcription or translation
transcript
Transcribed text in the original language
translation
Translated text in the target language (only if target_language is specified)
transcript_translation_with_timestamp
Word-level timestamps
audio_base64
Base64-encoded WAV audio of the translated speech (only if return_translation_audio is true)
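A response with the fields above might be handled as in this sketch. The helper name `summarize_response` is hypothetical, and the internal structure of `transcript_translation_with_timestamp` is not documented on this page, so the code passes it through unchanged. Fields that are conditional (`translation`, `audio_base64`) are treated as possibly absent.

```python
import base64


def summarize_response(payload: dict) -> dict:
    """Extract the documented response fields and decode the audio if present."""
    audio_b64 = payload.get("audio_base64")
    return {
        "transcript": payload.get("transcript"),
        # Present only when target_language was specified in the request.
        "translation": payload.get("translation"),
        # Structure not documented here; returned as received.
        "timestamps": payload.get("transcript_translation_with_timestamp"),
        # Present only when return_translation_audio was true; decoded to WAV bytes.
        "audio_wav": base64.b64decode(audio_b64) if audio_b64 else None,
    }
```

The decoded `audio_wav` bytes can then be written to a file opened in binary mode if the translated speech needs to be saved.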