Transcribe or Translate

Transcribe audio to text, translate speech to another language, or generate translated audio output. This endpoint supports multiple modes:
- Transcription only: Convert speech to text with optional word-level timestamps
- Translation: Translate audio from one language to another
- Speech-to-speech translation: Generate translated audio output

Authentication

X-Api-Key string
API Key authentication via header

Request

This endpoint expects an object.
audio_url string Required

URL of the audio file, or base64-encoded audio as a data URI (data:audio/wav;base64,…)

sampling_rate integer Required Defaults to 16000
Audio sampling rate in Hz
target_language enum Optional
Target language for translation
is_translate boolean Optional Defaults to false
Set to true to request translation
return_translation_audio boolean Optional Defaults to false

If true, returns base64-encoded audio of the translated speech

temperature double Optional
Controls randomness in generation. Use 0.0 for deterministic output
max_tokens integer Optional >=1 Defaults to 1024
Maximum number of tokens to generate
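
The request fields above can be assembled as in the following minimal sketch. It assumes the endpoint accepts a JSON body via POST, which the source does not state explicitly. The URL `API_URL`, the helper names, and the language code "es" are hypothetical placeholders, and setting `is_translate` together with `target_language` is an assumption about how the two fields interact.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint URL: the source does not give the real one.
API_URL = "https://api.example.com/transcribe"


def wav_bytes_to_data_uri(wav_bytes: bytes) -> str:
    """Encode raw WAV bytes as a data URI suitable for audio_url."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return f"data:audio/wav;base64,{encoded}"


def build_headers(api_key: str) -> dict:
    """API key goes in the X-Api-Key header; JSON content type is assumed."""
    return {"X-Api-Key": api_key, "Content-Type": "application/json"}


def build_payload(audio_url, sampling_rate=16000, target_language=None,
                  return_translation_audio=False, temperature=None,
                  max_tokens=None) -> dict:
    """Build the request object, omitting optional fields left unset."""
    payload = {"audio_url": audio_url, "sampling_rate": sampling_rate}
    if target_language is not None:
        # Assumption: a translation request sets both fields.
        payload["target_language"] = target_language
        payload["is_translate"] = True
    if return_translation_audio:
        payload["return_translation_audio"] = True
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must be >= 1")
        payload["max_tokens"] = max_tokens
    return payload


def send(api_key: str, payload: dict) -> dict:
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=build_headers(api_key),
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A transcription-only request would pass just `audio_url`, while a speech-to-speech request would add a target language and `return_translation_audio=True`. Check the accepted `target_language` values against the enum in the actual API before use.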

Response

Successful transcription or translation
transcript string or null
Transcribed text in the original language
translation string or null

Translated text in the target language (only if target_language is specified)

transcript_translation_with_timestamp list of objects or null

Word-level timestamps

audio_base64 string or null

Base64-encoded WAV audio of the translated speech (only if return_translation_audio is true)
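
Because every response field may be null, a client should handle missing values before use. The sketch below shows one way to do that, assuming the response body is JSON with the field names listed above. The function name `parse_response` and the keys of the returned dictionary are hypothetical, and the structure of each timestamp object is not documented in the source, so the list is passed through unchanged.

```python
import base64
import json


def parse_response(body: str) -> dict:
    """Normalize a response body, decoding translated audio when present."""
    data = json.loads(body)
    audio_b64 = data.get("audio_base64")
    return {
        "transcript": data.get("transcript"),
        "translation": data.get("translation"),
        # Treat a null or absent timestamp list as empty.
        "timestamps": data.get("transcript_translation_with_timestamp") or [],
        # Raw WAV bytes, or None when no audio was returned.
        "audio_wav": base64.b64decode(audio_b64) if audio_b64 else None,
    }
```

The decoded `audio_wav` bytes can be written directly to a `.wav` file opened in binary mode.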

Errors