- Video transcoding,
- AI inference (batch and real-time), or
- Dual (both pipelines on a single node).
Gateway Role
This page explains how pipelines work from the gateway operator’s perspective. For orchestrator-side configuration (running AI workers, hosting models), see the Orchestrators section.
Gateway responsibilities
Accepts requests, matches orchestrator capabilities, enforces price and latency policy, handles retries and failover, returns outputs to the client.
Orchestrator responsibilities
Runs GPU inference or transcoding, hosts model weights, executes compute, returns results to the gateway.
Node Types
Livepeer gateways route four categories of work: video transcoding, batch AI inference, real-time AI, and BYOC pipelines. Each has different ingest patterns, payment models, and orchestrator requirements. A Dual node runs both video and AI pipelines on a single node.

Video Transcoding
The gateway ingests a live or recorded video stream via RTMP or HTTP, segments it, and distributes transcoding work to orchestrators. Orchestrators return multiple encoded renditions, which the gateway assembles for HLS delivery. On-chain video gateways use the Livepeer probabilistic micropayment (PM) system: each segment carries a payment ticket redeemed on Arbitrum One. An ETH deposit and reserve balance on the TicketBroker contract are required. Ports: RTMP ingest on :1935, HTTP ingest and API on :8935, CLI on :5935.
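As a sketch, an on-chain video gateway launch might look like the following. The flag names come from go-livepeer, but the network name, RPC URL, and bind addresses are placeholders to adapt — confirm against `livepeer -help` for your release:

```shell
# Sketch only: on-chain video gateway (go-livepeer); all values are placeholders.
# -gateway   run in gateway (broadcaster) mode
# -network   on-chain mode; payments settle on Arbitrum One
# -ethUrl    your Arbitrum RPC endpoint
# -rtmpAddr  RTMP ingest, -httpAddr HTTP ingest/API, -cliAddr local CLI
livepeer -gateway \
  -network arbitrum-one-mainnet \
  -ethUrl https://arb1.example-rpc.invalid \
  -rtmpAddr 0.0.0.0:1935 \
  -httpAddr 0.0.0.0:8935 \
  -cliAddr 127.0.0.1:5935
```

On first on-chain start, the node will prompt you to fund the deposit and reserve on the TicketBroker contract before it can pay orchestrators.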
Batch AI Inference
The gateway accepts HTTP requests for AI pipelines (text-to-image, audio-to-text, LLM, and others), routes each request to an orchestrator advertising the requested pipeline and model, and returns the inference result. This is a request/response pattern - the gateway sends a request and waits for the result. Off-chain AI gateways require no ETH deposit. The gateway targets orchestrators directly via -orchAddr. For on-chain AI (the Dual node type), the PM system applies.
Port: HTTP API on :8935 (set via -httpAddr, enabled by -httpIngest).
See Model Support for all supported batch AI pipeline types and model architectures.
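A minimal off-chain batch AI sketch might look like this. The flags are from go-livepeer; the orchestrator hostname is a placeholder, and the `text-to-image` route and request fields should be verified against the AI gateway API docs for your version:

```shell
# Sketch only: off-chain AI gateway pinned to a known orchestrator.
# No ETH deposit is needed off-chain; -orchAddr bypasses on-chain discovery.
livepeer -gateway \
  -network offchain \
  -orchAddr https://orch-a.example.com:8935 \
  -httpAddr 0.0.0.0:8935 \
  -httpIngest

# In another shell, submit a batch request to the gateway
# (route and body fields are assumptions to verify; model_id is a placeholder):
curl -s http://127.0.0.1:8935/text-to-image \
  -H "Content-Type: application/json" \
  -d '{"model_id": "<model-id>", "prompt": "a watercolor fox"}'
```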
Real-time AI
Real-time AI processes live video streams frame-by-frame through AI models with strict latency targets. Unlike batch inference (request/response), real-time AI maintains a persistent stream connection - frames flow in continuously and transformed frames flow out. The primary framework is ComfyStream, an open-source ComfyUI plugin that enables developers to build real-time AI video workflows (style transfer, avatars, live effects, real-time agents). Daydream is the hosted reference implementation - developers can use it without running their own gateway. Real-time AI runs on AI or Dual nodes. It uses the live-video-to-video pipeline type and the trickle streaming protocol instead of the REST AI Jobs API. Billing is per-second, not per-pixel.
Port: HTTP API on :8935 (set via -httpAddr, enabled by -httpIngest).
ComfyStream
Build real-time AI video workflows with ComfyUI nodes.
Daydream
Hosted real-time AI video - no gateway required.
BYOC Pipelines
BYOC (Bring Your Own Container) allows any workload that can be containerised to run on Livepeer orchestrators. The gateway routes by capability descriptor (image-to-image, depth, segmentation) instead of by model name.
BYOC supports both GPU and CPU containers. GPU workloads (diffusion models, vision models, video-to-video) are the primary use case, but CPU-only deployments are also supported - see the BYOC CPU tutorial for a passthrough pipeline example.
What fits well: frame-based or stream-based workloads, custom ML models, enterprise-specific processing, novel applications (e.g. Embody AI avatars).
What does not fit: long-running training or fine-tuning jobs, workloads requiring large persistent state, high-latency multi-minute jobs. The network assumes short, stateless, repeatable units of work. Containers that maintain long-lived state between requests break retry and failover semantics.
BYOC requires a gateway - applications cannot interact with orchestrators directly. The gateway handles discovery, capability matching, and payment; the orchestrator runs your container.
Pipeline Matrix
Not all pipeline types are available on every node type. Use this table to confirm which pipelines apply to your setup.

Orchestrator Discovery
How your gateway finds orchestrators depends on your operational mode and business model. There is no single discovery pattern - most production gateways use direct relationships instead of automatic pooled discovery.
On-chain discovery (automatic)
On-chain gateways query the Livepeer subgraph on Arbitrum for registered, active orchestrators. The gateway refreshes this list periodically and selects orchestrators based on capability, price, latency, and stake weight.

This is the default for on-chain gateways and requires no manual orchestrator configuration. Best suited for public gateways routing to the open network pool.
Direct configuration (-orchAddr)
Off-chain gateways (and some on-chain gateways) specify orchestrator addresses directly using the -orchAddr flag. This bypasses on-chain discovery entirely.

Most production gateways use this pattern. Operators build relationships with specific orchestrators who run the models and capabilities their applications need. This gives predictable performance, pricing, and availability.
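For example (hostnames are placeholders; go-livepeer accepts a comma-separated list of orchestrator URIs, but confirm the flag syntax for your version):

```shell
# Pin the gateway to two known orchestrators instead of using on-chain discovery.
livepeer -gateway \
  -network offchain \
  -orchAddr https://orch-a.example.com:8935,https://orch-b.example.com:8935
```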
Webhook discovery
Gateways can call an external service via -orchWebhookUrl to receive a dynamic orchestrator list. This enables custom filtering, whitelisting, or load balancing without modifying the gateway itself.

Used by platform builders (NaaP) and operators with orchestrator tiering or geographic routing requirements.
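The webhook is expected to return a JSON array of orchestrator entries. A static sketch of that payload (the `address` field matches go-livepeer's webhook schema as commonly documented - verify against your release; hostnames are placeholders):

```shell
# Write a static orchestrator list in the shape -orchWebhookUrl expects.
cat > orch-list.json <<'EOF'
[
  {"address": "https://orch-a.example.com:8935"},
  {"address": "https://orch-b.example.com:8935"}
]
EOF

# Serve it from any HTTP endpoint and point the gateway at it, e.g.:
#   livepeer -gateway -orchWebhookUrl http://127.0.0.1:8000/orch-list.json
```

A real webhook would generate this list dynamically - filtering by region, tier, or health checks - but the response shape stays the same.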
Own orchestrators
Some operators run their own orchestrators alongside their gateway. The gateway points to these dedicated orchestrators via -orchAddr. This provides full control over compute, models, and pricing.

Common for enterprise integrators, content providers with SLA requirements, and operators running custom BYOC or real-time AI workloads.

Automatic on-chain discovery selects from the public orchestrator pool based on capability, price, latency, and performance history. For AI workloads - especially real-time AI and BYOC - direct orchestrator relationships are more common because applications need specific models, GPU configurations, or custom containers that not all orchestrators provide.
Related Resources
Video Transcoding Pipeline
How video jobs flow through your gateway - ingest, segmentation, orchestrator selection, and payment.
AI Inference Pipeline
How AI inference requests are routed - orchestrator discovery, model matching, pipeline types, and platform limits.
BYOC Pipelines
Routing custom container workloads by capability - operator responsibilities, model fit, and health tracking.
Pipeline Configuration
Transcoding profiles, AI routing flags, and per-pipeline tuning.
Workload Fit
Decision framework for evaluating whether your AI workload belongs on Livepeer.
Model Support
Full compatibility matrix - supported pipeline types, model architectures, and VRAM requirements.