Livepeer is optimised for streaming, GPU-bound, low-latency inference. It is not a general-purpose batch compute or file-processing network. Use this page to determine whether your workload is a good fit before you start building.

Decision tree

Start

 ├── Is the workload STREAMING (frames / chunks / segments)?
 │    └── No  →  ✗ Not a good Livepeer fit

 └── Yes

      ├── Does the workload require GPU-accelerated INFERENCE?
      │    └── No  →  ✗ Use a gateway or standard compute

      └── Yes

           ├── Is LOW LATENCY (< ~500ms) important to the UX?
           │    └── No  →  ⚠ Possible, but not differentiated

           └── Yes

                ├── Does it produce INCREMENTAL output?
                │    └── No  →  ⚠ Marginal fit

                └── Yes  →  ✓ Excellent Livepeer workload
Summary: Livepeer works best for streaming, GPU-bound inference with low latency and incremental output. If your workload fails the first two gates, don’t build it on Livepeer.
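The gates above can be sketched as a small helper. This is a hypothetical `livepeer_fit` function for illustration only, not part of any Livepeer SDK:

```python
def livepeer_fit(streaming: bool, gpu_inference: bool,
                 low_latency: bool, incremental_output: bool) -> str:
    """Walk the decision tree. The first two gates are hard requirements;
    the last two distinguish 'possible' from 'excellent'."""
    if not streaming:
        return "not a fit"
    if not gpu_inference:
        return "use a gateway or standard compute"
    if not low_latency:
        return "possible, but not differentiated"
    if not incremental_output:
        return "marginal fit"
    return "excellent Livepeer workload"
```

A workload only reaches "excellent" by passing all four gates in order, which mirrors how the tree short-circuits at the first failed check.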

Capability matrix

Category        Example workloads        Fit        Why

Gateway vs orchestrator responsibilities by workload

Understanding the split between gateway and orchestrator helps you know where to direct integration effort for each workload type.

Audio workloads (ASR, translation, intent)

Gateway handles:
  • audio ingestion via WebRTC
  • chunking and buffering
  • authentication and retries
  • output aggregation and fan-out

Orchestrator handles:
  • GPU-resident ASR / translation models
  • streaming inference execution
  • incremental token emission
  • language or model specialisation
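One of the gateway responsibilities above, chunking and buffering, might look like this in outline. The chunk duration and PCM parameters are illustrative assumptions, not Livepeer defaults:

```python
def chunk_audio(pcm: bytes, chunk_ms: int = 200,
                sample_rate: int = 16_000, bytes_per_sample: int = 2):
    """Yield fixed-duration slices of a raw PCM buffer, suitable for
    streaming to an orchestrator-side ASR model chunk by chunk."""
    chunk_bytes = sample_rate * bytes_per_sample * chunk_ms // 1000
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]
```

Because this is a generator, the gateway can forward each chunk as soon as it exists instead of waiting for the full recording, which is the streaming execution shape the rest of this page describes.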

Vision workloads (depth, pose, segmentation)

Gateway handles:
  • frame routing
  • capability selection
  • latency monitoring
  • cost-aware routing

Orchestrator handles:
  • vision model execution
  • GPU memory optimisation
  • per-frame inference
  • optional batching
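The orchestrator-side pattern for per-frame inference is to keep the model resident ("warm") across frames rather than reloading it per request. A minimal sketch, where `load_model` and the model callable are hypothetical stand-ins:

```python
class FrameWorker:
    """Keep a vision model warm across a stream of frames.
    The load cost is paid once, at session start, not per frame."""
    def __init__(self, load_model):
        self.model = load_model()   # e.g. move weights onto the GPU once
        self.frames_seen = 0

    def infer(self, frame):
        self.frames_seen += 1
        return self.model(frame)    # per-frame inference on the warm model
```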

Video workloads (generation, effects, diffusion)

Gateway handles:
  • stream orchestration
  • QoS and failover
  • output delivery
  • session lifecycle management

Orchestrator handles:
  • persistent GPU pipelines
  • multi-model composition
  • frame-by-frame generation
  • real-time conditioning

Text workloads (real-time only)

Gateway handles:
  • request multiplexing
  • rate limiting
  • stable API surface

Orchestrator handles:
  • lightweight LLMs or classifiers
  • prompt routing and control logic
  • real-time response generation
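Rate limiting on the gateway side is commonly a token bucket. A minimal sketch of the idea (the `TokenBucket` class and its parameters are illustrative, not a Livepeer API):

```python
class TokenBucket:
    """Token-bucket limiter: tokens refill at `rate` per second up to
    `capacity`; each admitted request spends one token."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the clock in as `now` keeps the sketch deterministic; a real gateway would use a monotonic clock and track one bucket per client.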

ASR pipeline examples

These are some of the best-fit workloads on Livepeer today.

Live captions for video streams

Mic / Video Audio
  → Gateway (WebRTC audio chunks)
  → Orchestrator (GPU ASR model)
  → Incremental text tokens
  → Gateway → captions / overlays / APIs
Why it fits: continuous audio stream, warm GPU state, incremental output, latency-critical UX.

Multilingual live translation

Live Audio
  → ASR
  → Translation model
  → Translated captions (real-time)
Why it fits: chained streaming inference, strong latency requirements, high differentiation vs batch pipelines.
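Chained streaming inference composes naturally as generators: each stage consumes and emits incrementally, so no stage waits for the full input. The `fake_asr` and `fake_translate` stages below are stand-ins for the real models, used only to show the execution shape:

```python
def fake_asr(audio_chunks):
    """Stand-in for streaming ASR: emit one token per audio chunk."""
    for i, _chunk in enumerate(audio_chunks):
        yield f"word{i}"

def fake_translate(tokens):
    """Stand-in for a streaming translation model."""
    for tok in tokens:
        yield tok.upper()

def live_translation(audio_chunks):
    # Composing generators keeps end-to-end latency low: the first
    # translated token can be emitted before the last chunk arrives.
    return fake_translate(fake_asr(audio_chunks))
```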

Voice-driven avatars or agents

Live Audio
  → ASR
  → Intent / command extraction
  → Video or avatar pipeline conditioning
Why it fits: multimodal real-time control loop, audio conditions downstream video.

Live moderation and safety

Live Audio
  → ASR
  → Keyword / sentiment / policy model
  → Flags, triggers, overlays
Why it fits: streaming classification, immediate downstream actions.

What about batch and file-based workloads?

Livepeer will not block file-based or batch workloads. The protocol is general at the container level: anything that can run in a container can run on a Livepeer orchestrator. But Livepeer's economics, routing, and reliability are tuned for streaming inference, not batch conversion.

The precise rule: file-to-file is usually a bad fit, unless the conversion is actually streaming inference in disguise. Livepeer cares about execution shape, not inputs.

Some concrete examples:

YouTube video → MP3
  • Doable, but a bad idea: CPU-bound, no inference, and a long-running batch job. It wastes GPU slots and will be deprioritised by gateways. Technically it works; economically it is irrational.

English → other language (translation)
  • File-to-file (text in → text out): batch job, latency-tolerant — weak Livepeer fit.
  • Live translation (speech or captions): audio arrives incrementally, translation emitted incrementally, latency matters — excellent Livepeer fit. Same model, different execution shape.
MP3 → text transcription
  • Upload MP3 → wait → download transcript: marginal. Works, but batch infra is cheaper and gateways gain little from routing it.
  • Streamed transcription (even from an MP3): chunk audio, emit tokens continuously, treat it like live audio — strong fit.
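The same task in both execution shapes, side by side. The `model` callable here is a placeholder for a real transcription model; only the calling pattern matters:

```python
def batch_transcribe(audio: bytes, model):
    """Batch shape: hold the whole file, run one large inference,
    return one result at the end. Latency-tolerant, weak Livepeer fit."""
    return model(audio)

def stream_transcribe(audio: bytes, model, chunk_size: int = 4096):
    """Streaming shape: the same input bytes, chunked and emitted
    incrementally — the shape Livepeer is tuned for."""
    for off in range(0, len(audio), chunk_size):
        yield model(audio[off:off + chunk_size])
```

Same input, same model, different execution shape: the streaming version starts producing output after the first chunk rather than after the whole file.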
Reframe your mental model: stop thinking "file → file = bad" and start thinking "batch execution vs stream execution".
Task            Batch      Streaming
MP3 → text      ⚠ weak     ✓ strong
Translation     ⚠ weak     ✓ strong
Video → audio   ✗ weak     ✗ still weak
ASR             ⚠ okay     ✓ excellent
The real constraint: Livepeer's bottleneck is GPU opportunity cost, not capability. If a job occupies a GPU for a long time without benefiting from low latency, it will lose out to workloads that do. This is by design: gateways will naturally route away from poor-fit workloads.

Safe summary: many batch and file-based AI workloads are technically runnable on Livepeer. However, the network is economically and operationally optimised for streaming, low-latency inference, and such workloads will be routed and priced accordingly.

Next steps

Last modified on March 2, 2026