Audio To Text
Transcribe audio files to text.
The default Gateway used in this guide is the public
Livepeer.cloud Gateway. It is free to use but
not intended for production-ready applications. For production-ready
applications, consider using the Livepeer Studio
Gateway, which requires an API token. Alternatively, you can set up your own
Gateway node or partner with one via the ai-video
channel on
Discord.
Please note that the exact parameters, default values, and responses may vary between models. For more information on model-specific parameters, please refer to the respective model documentation available in the audio-to-text pipeline. Not all parameters might be available for a given model.
Authorizations
Bearer authentication header of the form Bearer <token>
, where <token>
is your auth token.
Body
Uploaded audio file to be transcribed.
Hugging Face model ID used for transcription.
Return timestamps for the transcribed text. Supported values: 'sentence', 'word', or a string boolean ('true' or 'false'). Default is 'true' ('sentence'). 'false' means no timestamps. 'word' means word-based timestamps.
Was this page helpful?