Text To Speech - Livepeer Docs

TypeScript

import { Livepeer } from "@livepeer/ai";

const livepeer = new Livepeer({
  httpBearer: "<YOUR_BEARER_TOKEN_HERE>",
});

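async function run() {
  const result = await livepeer.generate.textToSpeech({
    modelId: "", // Required: Hugging Face model ID used for text-to-speech generation
    text: "", // Text input for speech generation
    // Description of the speaker, used to steer generation
    description: "A male speaker delivers a slightly expressive and animated speech with a moderate speed and pitch.",
  });

  // Handle the result
  console.log(result);
}

// Surface request errors instead of leaving the promise rejection unhandled
run().catch(console.error);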

{
  "audio": {
    "url": "<string>"
  }
}

POST /text-to-speech

The default Gateway used in this guide is the public Livepeer.cloud Gateway. It is free to use but not intended for production applications. For production use, consider the Livepeer Studio Gateway, which requires an API token. Alternatively, you can set up your own Gateway node or partner with one via the ai-video channel on Discord.

The exact parameters, default values, and responses may vary between models, and not every parameter is available for every model. For model-specific parameters, refer to the respective model documentation available in the text-to-speech pipeline.

Authorizations

Authorization · string · header · required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
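As a minimal sketch of the header format described above, the helper below builds the Authorization header for a raw HTTP request. The name `bearerAuthHeader` is hypothetical and not part of `@livepeer/ai`; in the SDK example, the token is passed through the `httpBearer` option instead.

```typescript
// Hypothetical helper (not part of @livepeer/ai): builds the
// "Authorization: Bearer <token>" header for a raw HTTP request.
function bearerAuthHeader(token: string): Record<string, string> {
  const trimmed = token.trim();
  if (trimmed.length === 0) {
    // The header is marked required, so fail early on an empty token.
    throw new Error("An auth token is required");
  }
  return { Authorization: `Bearer ${trimmed}` };
}

// "my-token" is a placeholder, not a real credential.
console.log(bearerAuthHeader("my-token"));
```

The returned object can be spread into the `headers` of whichever HTTP client you use.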

Body

application/json

model_id · string · default: "" · required

Hugging Face model ID used for text to speech generation.

text · string · default: ""

Text input for speech generation.

description · string · default: "A male speaker delivers a slightly expressive and animated speech with a moderate speed and pitch."

Description of speaker to steer text to speech generation.
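The body fields above can be assembled as in the following sketch, which targets the raw JSON body rather than the SDK (note the snake_case `model_id`, whereas the SDK example uses `modelId`). The `TextToSpeechBody` type and `buildTextToSpeechBody` function are illustrative names, not SDK exports, and rejecting an empty `model_id` is a choice made here because the field is marked required.

```typescript
// Illustrative type and helper; the names are assumptions, not SDK exports.
interface TextToSpeechBody {
  model_id: string;
  text: string;
  description: string;
}

// Documented default for the description parameter.
const DEFAULT_DESCRIPTION =
  "A male speaker delivers a slightly expressive and animated speech with a moderate speed and pitch.";

function buildTextToSpeechBody(
  modelId: string,
  text = "",
  description = DEFAULT_DESCRIPTION,
): TextToSpeechBody {
  if (modelId.length === 0) {
    // model_id is marked required, so reject an empty value early.
    throw new Error("model_id is required");
  }
  return { model_id: modelId, text, description };
}

// "<MODEL_ID>" is a placeholder, not a real model identifier.
const body = buildTextToSpeechBody("<MODEL_ID>", "Hello from Livepeer.");
console.log(JSON.stringify(body, null, 2));
```

Serializing the result with `JSON.stringify` yields a payload suitable for an `application/json` request body.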

Response

Successful Response

Response model for audio generation.

audio · MediaURL · object · required

The generated audio.
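A small sketch of validating the response shape from the example response (`{ "audio": { "url": "<string>" } }`) before using it. The `extractAudioUrl` name is hypothetical, and the sample URL is made up for illustration.

```typescript
// Hypothetical helper: checks that a parsed JSON value matches
// { audio: { url: string } } and returns the URL.
function extractAudioUrl(response: unknown): string {
  if (typeof response !== "object" || response === null) {
    throw new Error("Response is not an object");
  }
  const audio = (response as { audio?: unknown }).audio;
  if (typeof audio !== "object" || audio === null) {
    throw new Error("Response is missing the required audio field");
  }
  const url = (audio as { url?: unknown }).url;
  if (typeof url !== "string") {
    throw new Error("audio.url is not a string");
  }
  return url;
}

// Example with a made-up payload; the URL is illustrative only.
const sample = JSON.parse('{"audio": {"url": "https://example.com/audio.wav"}}');
console.log(extractAudioUrl(sample));
```

Checking the shape at the boundary keeps a malformed or model-specific response from surfacing later as an undefined URL.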
