POST /llm

We are currently deploying the Large Language Model (LLM) pipeline to our gateway infrastructure. This warning will be removed once all listed gateways have successfully transitioned to serving the LLM pipeline, ensuring a seamless and enhanced user experience.

The LLM pipeline supports streaming responses: set stream=true in the request, and the response is streamed back with Server-Sent Events (SSE) in chunks as the tokens are generated.

Each streaming response chunk will have the following format:

data: {"chunk": "word "}

The final chunk of the response will be indicated by the following format:

data: {"chunk": "[DONE]", "tokens_used": 256, "done": true}
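As a sketch of how such a stream can be consumed, the following Python example uses the requests library; the Gateway URL, auth token, and model_id are placeholders rather than values defined in this guide:

import json
import requests

# Placeholders: substitute your Gateway URL, auth token, and model_id.
GATEWAY_URL = "https://<your-gateway>/llm"
AUTH_TOKEN = "<your-auth-token>"

resp = requests.post(
    GATEWAY_URL,
    headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
    # Each value is sent as a multipart/form-data field.
    files={
        "prompt": (None, "Tell me a short story."),
        "model_id": (None, "<model_id>"),
        "stream": (None, "true"),
    },
    stream=True,
)
resp.raise_for_status()

# SSE events arrive one per line, prefixed with "data: ".
for line in resp.iter_lines(decode_unicode=True):
    if not line or not line.startswith("data:"):
        continue
    chunk = json.loads(line[len("data:"):].strip())
    if chunk.get("done"):
        # Final chunk: carries tokens_used instead of generated text.
        print(f"\n[done, tokens_used={chunk['tokens_used']}]")
        break
    print(chunk["chunk"], end="", flush=True)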

The Response type below is for non-streaming requests, which return the entire response in a single JSON payload.

The default Gateway used in this guide is the public Livepeer.cloud Gateway. It is free to use but not intended for production applications. For production use, consider the Livepeer Studio Gateway, which requires an API token. Alternatively, you can set up your own Gateway node or partner with one via the ai-video channel on Discord.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
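For example, with a placeholder token value, each request carries a header of the form:

Authorization: Bearer <your-auth-token>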

Body

multipart/form-data
prompt
string
required
model_id
string
default: ""
required
system_msg
string
default: ""
temperature
number
default: 0.7
max_tokens
integer
default: 256
history
string
default: []
stream
boolean
default: false
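A minimal non-streaming request sketch in Python, again using the requests library with placeholder Gateway URL, token, and model_id. Every body parameter is sent as a multipart/form-data field; temperature, max_tokens, history, and stream use the documented defaults, while system_msg is given an illustrative value:

import requests

# Placeholders: substitute your Gateway URL, auth token, and model_id.
url = "https://<your-gateway>/llm"
headers = {"Authorization": "Bearer <your-auth-token>"}

# (None, value) tuples make requests send each entry as a plain form field.
form = {
    "prompt": (None, "What is Livepeer?"),
    "model_id": (None, "<model_id>"),
    "system_msg": (None, "You are a helpful assistant."),
    "temperature": (None, "0.7"),
    "max_tokens": (None, "256"),
    "history": (None, "[]"),
    "stream": (None, "false"),
}

resp = requests.post(url, headers=headers, files=form)
resp.raise_for_status()
data = resp.json()
print(data["response"])
print("tokens used:", data["tokens_used"])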

Response

200 - application/json
response
string
required
tokens_used
integer
required
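An illustrative (hypothetical) 200 response body for a non-streaming request:

{
  "response": "Livepeer is a decentralized video infrastructure network.",
  "tokens_used": 12
}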