- Video transcoding,
- AI inference (batch and real-time), or
- Dual (both pipelines on a single node).
Gateway Role
This page explains how pipelines work from the gateway operator’s perspective. For orchestrator-side configuration (running AI workers, hosting models), see the Orchestrators section.
Gateway responsibilities
Accepts requests, matches orchestrator capabilities, enforces price and latency policy, handles retries and failover, returns outputs to the client.
Orchestrator responsibilities
Runs GPU inference or transcoding, hosts model weights, executes compute, returns results to the gateway.
Node Types
Livepeer gateways route four categories of work: video transcoding, batch AI inference, real-time AI, and BYOC pipelines. Each has different ingest patterns, payment models, and orchestrator requirements. A Dual node runs both video and AI pipelines on a single node.

Video Transcoding
The gateway ingests a live or recorded video stream via RTMP or HTTP, segments it, and distributes transcoding work to orchestrators. Orchestrators return multiple encoded renditions, which the gateway assembles for HLS delivery. On-chain video gateways use the Livepeer probabilistic micropayment (PM) system: each segment carries a payment ticket redeemed on Arbitrum One. An ETH deposit and reserve balance on the TicketBroker contract are required. Ports: RTMP ingest on :1935, HTTP ingest and API on :8935, CLI on :5935.
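As a sketch, an on-chain video gateway launch might look like the following. The flag names come from go-livepeer, but the network name, RPC URL, and bind addresses are placeholders to adapt — confirm against `livepeer -help` for your release:

```shell
# Sketch only: on-chain video gateway (go-livepeer); all values are placeholders.
# -gateway   run in gateway (broadcaster) mode
# -network   on-chain mode; payments settle on Arbitrum One
# -ethUrl    your Arbitrum RPC endpoint
# -rtmpAddr  RTMP ingest, -httpAddr HTTP ingest/API, -cliAddr local CLI
livepeer -gateway \
  -network arbitrum-one-mainnet \
  -ethUrl https://arb1.example-rpc.invalid \
  -rtmpAddr 0.0.0.0:1935 \
  -httpAddr 0.0.0.0:8935 \
  -cliAddr 127.0.0.1:5935
```

On first on-chain start, the node will prompt you to fund the deposit and reserve on the TicketBroker contract before it can pay orchestrators.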
Batch AI Inference
The gateway accepts HTTP requests for AI pipelines (text-to-image, audio-to-text, LLM, and others), routes each request to an orchestrator advertising the requested pipeline and model, and returns the inference result. This is a request/response pattern - the gateway sends a request and waits for the result. Off-chain AI gateways require no ETH deposit. The gateway targets orchestrators directly via -orchAddr. For on-chain AI (the Dual node type), the PM system applies.
Port: HTTP API on :8935 (set via -httpAddr, enabled by -httpIngest).
See Model Support for all supported batch AI pipeline types and model architectures.
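A minimal off-chain batch AI sketch might look like this. The flags are from go-livepeer; the orchestrator hostname is a placeholder, and the `text-to-image` route and request fields should be verified against the AI gateway API docs for your version:

```shell
# Sketch only: off-chain AI gateway pinned to a known orchestrator.
# No ETH deposit is needed off-chain; -orchAddr bypasses on-chain discovery.
livepeer -gateway \
  -network offchain \
  -orchAddr https://orch-a.example.com:8935 \
  -httpAddr 0.0.0.0:8935 \
  -httpIngest

# In another shell, submit a batch request to the gateway
# (route and body fields are assumptions to verify; model_id is a placeholder):
curl -s http://127.0.0.1:8935/text-to-image \
  -H "Content-Type: application/json" \
  -d '{"model_id": "<model-id>", "prompt": "a watercolor fox"}'
```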
Real-time AI
Real-time AI processes live video streams frame-by-frame through AI models with strict latency targets. Unlike batch inference (request/response), real-time AI maintains a persistent stream connection - frames flow in continuously and transformed frames flow out. The primary framework is ComfyStream, an open-source ComfyUI plugin that enables developers to build real-time AI video workflows (style transfer, avatars, live effects, real-time agents). Daydream is the hosted reference implementation - developers can use it without running their own gateway. Real-time AI runs on AI or Dual nodes. It uses the live-video-to-video pipeline type and the trickle streaming protocol instead of the REST AI Jobs API. Billing is per-second, not per-pixel.
Port: HTTP API on :8935 (set via -httpAddr, enabled by -httpIngest).
ComfyStream
Build real-time AI video workflows with ComfyUI nodes.
Daydream
Hosted real-time AI video - no gateway required.
BYOC Pipelines
BYOC (Bring Your Own Container) allows any workload that can be containerised to run on Livepeer orchestrators. The gateway routes by capability descriptor (image-to-image, depth, segmentation) instead of by model name.
BYOC supports both GPU and CPU containers. GPU workloads (diffusion models, vision models, video-to-video) are the primary use case, but CPU-only deployments are also supported - see the BYOC CPU tutorial for a passthrough pipeline example.
What fits well: frame-based or stream-based workloads, custom ML models, enterprise-specific processing, novel applications (e.g. Embody AI avatars).
What does not fit: long-running training or fine-tuning jobs, workloads requiring large persistent state, high-latency multi-minute jobs. The network assumes short, stateless, repeatable units of work. Containers that maintain long-lived state between requests break retry and failover semantics.
BYOC requires a gateway - applications cannot interact with orchestrators directly. The gateway handles discovery, capability matching, and payment; the orchestrator runs your container.
Pipeline Matrix
Not all pipeline types are available on every node type. Use this table to confirm which pipelines apply to your setup.

Orchestrator Discovery
How your gateway finds orchestrators depends on your operational mode and business model. There is no single discovery pattern - most production gateways use direct relationships instead of automatic pooled discovery.
On-chain discovery (automatic)
On-chain gateways query the Livepeer subgraph on Arbitrum for registered, active orchestrators. The gateway refreshes this list periodically and selects orchestrators based on capability, price, latency, and stake weight.

This is the default for on-chain gateways and requires no manual orchestrator configuration. Best suited for public gateways routing to the open network pool.
Direct configuration (-orchAddr)
Off-chain gateways (and some on-chain gateways) specify orchestrator addresses directly using the -orchAddr flag. This bypasses on-chain discovery entirely.

Most production gateways use this pattern. Operators build relationships with specific orchestrators who run the models and capabilities their applications need. This gives predictable performance, pricing, and availability.
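For example (hostnames are placeholders; go-livepeer accepts a comma-separated list of orchestrator URIs, but confirm the flag syntax for your version):

```shell
# Pin the gateway to two known orchestrators instead of using on-chain discovery.
livepeer -gateway \
  -network offchain \
  -orchAddr https://orch-a.example.com:8935,https://orch-b.example.com:8935
```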
Webhook discovery
Gateways can call an external service via -orchWebhookUrl to receive a dynamic orchestrator list. This enables custom filtering, whitelisting, or load balancing without modifying the gateway itself.

Used by platform builders (NaaP) and operators with orchestrator tiering or geographic routing requirements.
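The webhook is expected to return a JSON array of orchestrator entries. A static sketch of that payload (the `address` field matches go-livepeer's webhook schema as commonly documented - verify against your release; hostnames are placeholders):

```shell
# Write a static orchestrator list in the shape -orchWebhookUrl expects.
cat > orch-list.json <<'EOF'
[
  {"address": "https://orch-a.example.com:8935"},
  {"address": "https://orch-b.example.com:8935"}
]
EOF

# Serve it from any HTTP endpoint and point the gateway at it, e.g.:
#   livepeer -gateway -orchWebhookUrl http://127.0.0.1:8000/orch-list.json
```

A real webhook would generate this list dynamically - filtering by region, tier, or health checks - but the response shape stays the same.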
Own orchestrators
Some operators run their own orchestrators alongside their gateway. The gateway points to these dedicated orchestrators via -orchAddr. This provides full control over compute, models, and pricing.

Common for enterprise integrators, content providers with SLA requirements, and operators running custom BYOC or real-time AI workloads.

Automatic on-chain discovery selects from the public orchestrator pool based on capability, price, latency, and performance history. For AI workloads - especially real-time AI and BYOC - direct orchestrator relationships are more common because applications need specific models, GPU configurations, or custom containers that not all orchestrators provide.
Related Resources
Video Transcoding Pipeline
How video jobs flow through your gateway - ingest, segmentation, orchestrator selection, and payment.
AI Inference Pipeline
How AI inference requests are routed - orchestrator discovery, model matching, pipeline types, and platform limits.
BYOC Pipelines
Routing custom container workloads by capability - operator responsibilities, model fit, and health tracking.
Pipeline Configuration
Transcoding profiles, AI routing flags, and per-pipeline tuning.
Workload Fit
Decision framework for evaluating whether your AI workload belongs on Livepeer.
Model Support
Full compatibility matrix - supported pipeline types, model architectures, and VRAM requirements.