The Gateway’s default Orchestrator selection works out of the box: it discovers available Orchestrators, filters by price, and routes jobs.
For production workloads, operators often need more control. This guide covers manual selection, quality tiering, and failover configuration.

Selection Algorithm

Before tuning selection, it helps to understand what the Gateway does by default. The scoring algorithm is a weighted combination of four factors, each adjustable via flags.

For AI Gateways using an explicit Orchestrator list (-orchAddr), selection is simpler: the Gateway round-robins across the listed Orchestrators while respecting capability and price filters.
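The round-robin path can be sketched as follows. This is a minimal illustration of the described behaviour, not go-livepeer's actual types or code: the Orchestrator struct, the Supported field, and nextOrchestrator are hypothetical stand-ins for the capability and price filters.

```go
package main

import "fmt"

// Orchestrator is a simplified stand-in for a discovered node.
type Orchestrator struct {
	Addr         string
	PricePerUnit int64
	Supported    bool // result of the capability filter
}

// nextOrchestrator round-robins over the explicit -orchAddr list,
// skipping nodes that fail the capability or price filters.
func nextOrchestrator(list []Orchestrator, cursor *int, maxPrice int64) (Orchestrator, bool) {
	for i := 0; i < len(list); i++ {
		o := list[(*cursor)%len(list)]
		*cursor++
		if o.Supported && o.PricePerUnit <= maxPrice {
			return o, true
		}
	}
	return Orchestrator{}, false // nothing in the pool qualifies
}

func main() {
	pool := []Orchestrator{
		{"https://orch-a:8935", 800, true},
		{"https://orch-b:8935", 900, true},
		{"https://orch-c:8935", 400, false}, // lacks the capability
	}
	cursor := 0
	for i := 0; i < 3; i++ {
		if o, ok := nextOrchestrator(pool, &cursor, 1000); ok {
			fmt.Println(o.Addr) // prints orch-a, orch-b, orch-a in turn
		}
	}
}
```

Note that the unsupported node is skipped transparently: the cursor keeps advancing, so the remaining candidates still share the load evenly.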

Workload Criteria

Different workloads have different priorities:

Orchestrator Settings

Tiering Strategy

Operators running a gateway-as-a-service with SLA commitments often route different customer tiers to different Orchestrator quality levels. Since go-livepeer does not natively support named tiers, the pattern is to run separate Gateway instances per tier, each with a different configuration.
Tier 1 (premium) -> Gateway instance A
    -orchAddr <premium-orch-list>
    -maxPricePerUnit 5000         # Higher cap = access to better Orchestrators
    -minPerfScore 0.95

Tier 2 (standard) -> Gateway instance B
    -orchAddr <standard-orch-list>
    -maxPricePerUnit 2000
    -minPerfScore 0.8

Tier 3 (budget) -> Gateway instance C
    -maxPricePerUnit 500
    # No minPerfScore - accepts wider range
The middleware layer routes incoming requests to the correct Gateway instance based on the customer’s tier.
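A tier-routing middleware can be as simple as a lookup from the authenticated customer's tier to a Gateway base URL. The sketch below is hypothetical: the tier names, addresses, and gatewayFor helper are illustrative, and the reverse proxy would be wired into whatever auth stack the operator already runs.

```go
package main

import (
	"fmt"
	"net/http/httputil"
	"net/url"
)

// tierGateways maps a customer tier to the Gateway instance serving it.
// Tier names and addresses are placeholders for this sketch.
var tierGateways = map[string]string{
	"premium":  "http://gateway-a:8935",
	"standard": "http://gateway-b:8935",
	"budget":   "http://gateway-c:8935",
}

// gatewayFor resolves a tier (e.g. from an auth lookup) to a Gateway
// base URL, defaulting to the budget instance for unknown tiers.
func gatewayFor(tier string) string {
	if gw, ok := tierGateways[tier]; ok {
		return gw
	}
	return tierGateways["budget"]
}

func main() {
	target, err := url.Parse(gatewayFor("premium"))
	if err != nil {
		panic(err)
	}
	// In a real deployment, register this proxy as the handler of an
	// http.Server and pick the target per request from the auth context.
	proxy := httputil.NewSingleHostReverseProxy(target)
	_ = proxy
	fmt.Println("premium requests ->", target) // premium requests -> http://gateway-a:8935
}
```

Defaulting unknown tiers to the budget instance keeps requests flowing if the tier lookup fails; a stricter deployment might reject them instead.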

Failover Behaviour

When an Orchestrator fails mid-job, the Gateway automatically swaps to the next candidate in its selection pool. For video transcoding, the stream continues on a new Orchestrator and failed segments are re-attempted. For AI inference, the request is retried on a different Orchestrator. The number of retry attempts before the job fails is controlled by -maxAttempts (video transcoding); for AI jobs, the retry behaviour depends on the pipeline.
High Orchestrator swap rates indicate instability in the Orchestrator pool, such as under-resourced machines or network issues. To reduce swaps:
  • Increase -minPerfScore to exclude poorly-performing Orchestrators proactively
  • Add -orchMinLivepeerVersion to exclude outdated nodes
  • Review the -maxPricePerUnit or -maxPricePerCapability ceiling: if it is too low, the Gateway may be cycling through marginal Orchestrators
Monitor the livepeer_orchestrator_swaps Prometheus counter to track swap frequency over time.
If the Gateway cannot find a suitable Orchestrator within the discovery window, the job fails. The timeout is configurable:
-discoveryTimeout 10s
Increase this if jobs fail because no Orchestrator was found, particularly at startup or after a blocklist change. Decrease it for faster failure detection in latency-sensitive applications.

AI Capability Matching

For AI Gateways, Orchestrator selection is driven by capability matching before any price or performance scoring applies. The Gateway only considers Orchestrators that declare support for the requested pipeline and model. Capability information is returned by /getNetworkCapabilities. To inspect which Orchestrators support a specific model:
curl http://localhost:8935/getNetworkCapabilities | \
  jq '.orchestrators[] | select(.capabilities_prices[].model_id == "black-forest-labs/FLUX.1-dev")'
Set per-capability price caps to control costs per pipeline:
-maxPricePerCapability '[{"pipeline":"text-to-image","model_id":"black-forest-labs/FLUX.1-dev","price_per_unit":100,"pixels_per_unit":1000000}]'
If no Orchestrator in the pool meets both the capability requirement and the price cap, the job fails. To relax the price cap instead of failing the job, set:
-ignoreMaxPriceIfNeeded

Monitoring Selection

Track selection health in production:
Last modified on March 16, 2026