GPU Vendor Support
NVIDIA is the only supported GPU vendor for go-livepeer. AMD and Intel GPUs are not supported for hardware-accelerated transcoding or AI inference; NVIDIA CUDA is required for both NVENC video transcoding and the AI pipelines.
Driver requirements
Start here, because every other requirement is irrelevant if the GPU is not visible to the host. If nvidia-smi fails or is missing, install NVIDIA drivers for the target OS before proceeding.
go-livepeer cannot use the GPU without working NVIDIA drivers.
Minimum CUDA version: 12.0. Earlier versions are not supported by the AI runner containers.
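As a quick sanity check, the driver and CUDA floor can be verified in shell. This is a sketch: the `nvidia-smi` parsing shown in the comments is an assumption about its usual human-readable banner, while the helper itself is plain version-string ordering.

```shell
#!/usr/bin/env bash
# Sketch: confirm the reported CUDA version meets the 12.0 floor. The grep
# pattern in the comment below assumes nvidia-smi's usual banner text
# ("... CUDA Version: 12.2 ..."); adjust if your driver prints differently.

cuda_meets_minimum() {
  local ver="$1" min="12.0"
  # sort -V orders version strings; if min sorts first (or ties), ver >= min.
  [ "$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n1)" = "$min" ]
}

# On a live host:
#   command -v nvidia-smi >/dev/null || echo "install NVIDIA drivers first"
#   ver=$(nvidia-smi | grep -oP 'CUDA Version: \K[0-9.]+')
cuda_meets_minimum "12.2" && echo "CUDA OK"
cuda_meets_minimum "11.8" || echo "CUDA too old for the AI runner containers"
```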
NVENC/NVDEC Session Limits
NVIDIA consumer GPUs enforce concurrent session limits on NVENC (the hardware video encoder) and NVDEC (the hardware video decoder). For many video-first operators, this is the first hard capacity ceiling that matters. Consumer cards are typically limited to 2-3 concurrent NVENC encode sessions before throttling or failing with NvEncOpenEncodeSessionEx errors. This is a hardware-enforced licensing restriction, not a software limitation in go-livepeer.
Check the official NVIDIA matrix for exact limits per generation:
developer.nvidia.com/video-encode-decode-gpu-support-matrix
Exact session counts can vary by GPU generation and NVIDIA policy. Treat the NVIDIA matrix as the
source of truth and the table below as a planning shortcut. For most consumer-card operators, this
cap is the binding limit until benchmarking proves otherwise.
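For live monitoring, `nvidia-smi` exposes an `encoder.stats.sessionCount` query field. The helper below is a small planning sketch around it; the `NVENC_CAP=3` value is a placeholder assumption for a typical consumer card, not your GPU's confirmed limit.

```shell
#!/usr/bin/env bash
# Planning sketch: compare live NVENC usage against an assumed session cap.
# NVENC_CAP=3 is a placeholder - confirm the real number for your GPU
# generation in the NVIDIA support matrix.

NVENC_CAP=3

nvenc_headroom() {
  local in_use="$1"
  echo $(( NVENC_CAP - in_use ))
}

# On a live host the current count would come from:
#   nvidia-smi --query-gpu=encoder.stats.sessionCount --format=csv,noheader
nvenc_headroom 2   # one session of headroom left under the placeholder cap
```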
Hardware Tiers by Workload
Video transcoding
Video transcoding is primarily a throughput problem. You care about encoder sessions, sustained throughput, and stability far more than you care about large VRAM pools.
CPU transcoding is possible without a GPU, but throughput is significantly lower: practical capacity is low single-digit concurrent sessions, varying heavily by CPU generation and codec mix. It is not competitive at scale.
Batch AI inference
AI inference flips the constraint. The first question is whether the model fits in memory at all. Full pipeline-by-pipeline VRAM figures are in .
Cascade AI inference
Cascade video AI is a latency problem before it is a raw-throughput problem, so the hardware requirements are stricter than they are for batch AI:
- Minimum VRAM: 12-16 GB (StreamDiffusion with SD 1.5)
- Competitive VRAM: 24 GB (StreamDiffusion with SDXL)
- Frame buffer overhead: add 1-2 GB on top of the model’s base VRAM footprint
- Additional requirements: fast PCIe bandwidth; low-latency memory
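The VRAM arithmetic above (model footprint plus 1-2 GB of frame buffer) can be sketched as a simple fit check. The gigabyte figures in the example are illustrative assumptions, not measured model footprints; take real per-model numbers from the VRAM reference table.

```shell
#!/usr/bin/env bash
# Sketch: does a pipeline fit on a card? Defaults to the worst-case 2 GB
# frame buffer overhead from the list above. All inputs are whole GB.

fits_in_vram() {
  local card_gb="$1" model_gb="$2" overhead_gb="${3:-2}"
  [ $(( model_gb + overhead_gb )) -le "$card_gb" ]
}

# Illustrative numbers only:
fits_in_vram 24 20 2 && echo "fits on a 24 GB card"
fits_in_vram 16 15 2 || echo "does not fit on a 16 GB card"
```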
System Requirements
Storage note: A single SDXL model is approximately 7-8 GB. Running multiple pipelines requires fast NVMe storage to load models without latency spikes. Budget at least 100 GB per 4-5 models served simultaneously.
Network note: Latency to Gateways matters more than raw throughput for AI jobs. A 100 Mbps connection with sub-20ms latency to major regions outperforms a 1 Gbps connection with high latency for Cascade AI workloads.
Testing Your Capacity
Do not set -maxSessions from guesswork or marketing specs. Run livepeer_bench to find the GPU’s
actual concurrent transcoding ceiling after hardware is confirmed and before activation.
Step 1 - Verify livepeer_bench is on PATH. livepeer_bench ships with go-livepeer.
Step 2 - Run bench-scale.sh to sweep concurrent session counts, logging each run to bench.log. Increase the range to {1..40} if the ratio is still below 1.0 at 20 sessions.
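A sweep of this shape can be sketched as below. The `livepeer_bench` flags shown (`-in`, `-transcodingOptions`, `-nvidia`, `-concurrentSessions`) are assumptions based on common go-livepeer usage, and the file names are placeholders; confirm against `livepeer_bench -help` for your build before running.

```shell
#!/usr/bin/env bash
# Sketch of a bench-scale.sh sweep. print_sweep only prints the commands so
# the shape is visible without a GPU; to execute for real, drop the echo and
# append ">> bench.log 2>&1" to each run.

print_sweep() {
  local max="$1"
  for n in $(seq 1 "$max"); do
    echo livepeer_bench -in source.m3u8 \
      -transcodingOptions transcodingOptions.json \
      -nvidia 0 -concurrentSessions "$n"
  done
}

print_sweep 20
```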
Step 3 - Read the result:
The Duration Ratio is total transcoding time divided by total source duration.
The last concurrent session count at which ratio ≤ 0.8 is the practical hardware limit many
operators use. That threshold keeps some headroom for upload/download overhead and short spikes.
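Reading off the ceiling can be mechanized with a small parser. The "sessions ratio" input format here is an assumption (e.g. pairs you tabulate by hand from bench.log); livepeer_bench's actual log layout may differ.

```shell
#!/usr/bin/env bash
# Sketch: given "sessions duration_ratio" pairs, print the highest session
# count whose ratio stayed at or below the 0.8 planning threshold.

practical_limit() {
  awk '$2 <= 0.8 && $1 > best { best = $1 } END { print best + 0 }'
}

printf '%s\n' "2 0.31" "4 0.55" "6 0.74" "8 0.83" "10 0.97" | practical_limit
# prints 6: the last session count still under the 0.8 threshold
```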
Take this number to and set it as the input for the -maxSessions calculation.
Checklist Before Going Live
Activate only after the checklist below passes. It covers the basics that most often fail first.
Related Pages
Configure
Set -maxSessions, -pricePerUnit, -nvidia, and all other go-livepeer flags before activation.
Models and VRAM Reference
Full VRAM table by AI pipeline and model, warm strategy, and multi-GPU configuration.
Alternate Deployments
Pool worker, O-T split, and Siphon - the alternatives to the combined single-machine setup.
Operating Rationale
Cost and revenue breakdown before committing hardware.