Page is under construction.

Check the GitHub issues for ways to contribute, or provide your feedback in this quick form.
Livepeer is a decentralised network of GPU nodes that run AI inference on video and streaming workloads. It is not a generic cloud GPU service and not a model marketplace — it is a real-time, streaming-first AI compute layer optimised for low-latency inference at the frame and segment level. This section is for developers who want to build applications that consume AI inference via Livepeer: style transfer, depth estimation, object detection, speech-to-text, image-to-image pipelines, and more.

How it works

Your application sends inference requests to a Gateway. The gateway discovers available Orchestrators, routes your job based on capability, price, and latency, handles retries and auth, and returns results. You never communicate with an orchestrator directly — the gateway handles all of that.
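A minimal sketch of that request flow, assuming a hypothetical gateway URL, capability path, and JSON request shape; check the actual API of the gateway you use:

```python
import json
import urllib.request

# Hypothetical gateway URL -- substitute the gateway you actually target.
GATEWAY_URL = "https://my-gateway.example.com"

def build_request(capability: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build an inference request; the gateway routes it by capability."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{GATEWAY_URL}/{capability}",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth is between you and the gateway
        },
        method="POST",
    )

req = build_request("image-to-image", {"prompt": "pencil sketch"}, "my-api-key")
# The gateway picks an orchestrator, retries on failure, and returns the result:
# result = json.load(urllib.request.urlopen(req))
```

Your application only ever holds a gateway URL and credentials; orchestrator selection stays behind that single endpoint.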
The three layers:
  • Application: sends inference requests and consumes results. Run by you.
  • Gateway: discovers orchestrators, routes jobs by capability, price, and latency, and handles retries and auth. Run by you or a gateway operator.
  • Orchestrator: runs GPU inference and returns results. Run by independent node operators.

What you can build

Livepeer AI is designed for streaming and real-time workloads. Strong fits include:

Real-time video effects

Style transfer, background replacement, depth overlays, and image-to-image pipelines running frame-by-frame on live video.
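Frame-by-frame processing reduces to a loop over decoded frames with one inference call per frame. A sketch, where the per-frame inference callable is a hypothetical stand-in for the gateway call:

```python
# `run_inference` is a placeholder for your gateway call; the capability
# name "style-transfer" mirrors the capability-based routing described above.
def process_stream(frames, run_inference):
    """Apply a per-frame effect (e.g. style transfer) to a live stream."""
    for frame in frames:
        yield run_inference("style-transfer", frame)

# Usage with a dummy effect standing in for a real gateway call:
styled = list(process_stream([b"frame0", b"frame1"],
                             lambda cap, f: f + b"-styled"))
```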

Live speech & captions

Live ASR, real-time translation, and caption generation from audio chunks ingested via WebRTC.
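For live ASR the key client-side job is chunking audio before submission. A sketch under the assumption of raw PCM bytes and a fixed chunk size; real WebRTC ingest delivers audio in its own framing:

```python
# Chunk size is illustrative -- tune it to the latency/accuracy trade-off
# of the ASR pipeline you target.
def chunk_audio(pcm: bytes, chunk_bytes: int):
    """Yield fixed-size audio chunks; the last chunk may be shorter."""
    for i in range(0, len(pcm), chunk_bytes):
        yield pcm[i:i + chunk_bytes]

chunks = list(chunk_audio(b"\x00" * 10, 4))  # 4-byte chunks from 10 bytes of PCM
```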

Vision pipelines

Object detection, pose estimation, face parsing, segmentation — per-frame GPU inference for live streams.
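Consuming per-frame detection output usually means filtering by confidence. The response shape below is an assumption for illustration; check the schema of the pipeline you call:

```python
# Hypothetical detection record shape: {"box": (x1, y1, x2, y2), "confidence": float}
def boxes_above(detections, min_confidence):
    """Keep bounding boxes whose confidence clears a threshold."""
    return [d["box"] for d in detections if d["confidence"] >= min_confidence]

kept = boxes_above(
    [{"box": (10, 10, 50, 50), "confidence": 0.92},
     {"box": (0, 0, 5, 5), "confidence": 0.30}],
    min_confidence=0.5,
)
```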

Custom AI pipelines

Composable multi-step inference workflows via ComfyStream or BYOC: vision → conditioning → generation in sequence.
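The vision → conditioning → generation chain can be sketched as a fold over pipeline steps. Step names and the `run_step` callable here are hypothetical; in practice the DAG lives in ComfyStream or your BYOC container:

```python
# Each step consumes the previous step's output, mirroring a
# ComfyStream-style inference DAG executed in sequence.
def run_pipeline(frame, steps, run_step):
    out = frame
    for step in steps:
        out = run_step(step, out)
    return out

result = run_pipeline(
    "frame",
    ["depth", "conditioning", "generation"],
    lambda step, x: f"{step}({x})",  # dummy step for illustration
)
```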

AI on Livepeer vs other infrastructure

  • Livepeer AI: streaming-first, low-latency inference at the frame and segment level, served by a decentralised GPU network.
  • Generic GPU cloud: rents you raw GPUs; you build and operate the model-serving stack yourself.
  • Hosted AI APIs: a fixed catalogue of models behind request/response endpoints, not designed for real-time video.

How models get on the network

Models run inside Orchestrator nodes. Orchestrators can use:
  • ComfyUI — the most common approach; load .safetensors weights, build inference DAGs, serve via ComfyStream
  • Custom inference servers — any Torch / TensorRT / ONNX model wrapped in a Docker container (BYOC)
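A BYOC container just needs to expose an HTTP inference endpoint. A minimal stdlib-only sketch; `run_model` is a hypothetical placeholder for your Torch/TensorRT/ONNX model, and the port and request shape are illustrative:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_model(payload: dict) -> dict:
    # Placeholder for real model inference inside the container.
    return {"echo": payload}

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, run inference, return JSON.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_model(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To serve inside the container:
# HTTPServer(("0.0.0.0", 8000), InferenceHandler).serve_forever()
```

Packaged in a Docker image, this is the shape of server an orchestrator can register as a custom capability.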
Orchestrators advertise capabilities — image-to-image, depth, style-transfer — not model names. Gateways route by capability and performance, not by which specific model weights are loaded. This means models can be swapped or improved without breaking your application.
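Capability-based routing can be pictured as filter-then-rank. A sketch of the idea, with illustrative fields and weights; the actual gateway selection logic is not specified here:

```python
# Filter orchestrators by advertised capability, then rank by a weighted
# combination of observed latency and price. All fields are hypothetical.
def select_orchestrator(orchestrators, capability,
                        latency_weight=1.0, price_weight=1.0):
    candidates = [o for o in orchestrators if capability in o["capabilities"]]
    if not candidates:
        return None
    return min(candidates,
               key=lambda o: latency_weight * o["latency_ms"]
                             + price_weight * o["price"])

best = select_orchestrator(
    [{"id": "orch-a", "capabilities": {"depth"}, "latency_ms": 40, "price": 5},
     {"id": "orch-b", "capabilities": {"depth", "style-transfer"},
      "latency_ms": 25, "price": 7}],
    "depth",
)
```

Because selection keys on capability rather than model identity, an orchestrator swapping in better weights for the same capability is transparent to your application.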

How orchestrators host models

Step-by-step guide to running AI models on an orchestrator node

Start here


Last modified on March 2, 2026