} ; };

BYOC (Bring Your Own Container) extends the AI (and other) workloads a Gateway can route by letting Orchestrators run custom Docker inference containers and advertise them as capabilities.

BYOC is a routing and policy concern, not a model hosting concern. Gateways configure how requests reach BYOC-capable Orchestrators; the Orchestrators handle everything inside the container.

This page covers the Gateway operator perspective on BYOC. For the Orchestrator and developer side - building containers, registering capabilities, and deploying inference servers - see the page for teams building BYOC containers, listed under Related Pages.

## BYOC Routing

BYOC Orchestrators advertise custom capabilities using the same protocol as standard AI Orchestrators. The difference is that instead of running a managed `ai-runner` container for a known pipeline (such as `text-to-image`), they run a custom Docker container that exposes any inference API they choose.

The Gateway routes to them using the same `-orchAddr` flag and the same `AISessionManager` used for standard AI pipelines. The distinction is in how you think about the routing contract.

```mermaid theme={"theme":{"light":"github-light","dark":"dark-plus"}}
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#18794E', 'primaryTextColor': '#fff', 'primaryBorderColor': '#3CB540', 'lineColor': '#3CB540', 'mainBkg': '#18794E', 'nodeBorder': '#3CB540', 'clusterBkg': 'transparent', 'clusterBorder': '#3CB540', 'titleColor': '#3CB540', 'edgeLabelBackground': 'transparent', 'textColor': '#3CB540', 'nodeTextColor': '#fff'}}}%%
flowchart LR
    A["Client App"] --> B["Gateway"]
    B --> C["Capability match\n+ service policy"]
    C --> D["Orchestrator\n(BYOC container)"]
    D --> E["Custom inference\nresult"]
    E --> D
    D --> B
    B --> A
    classDef default fill:#1a1a1a,color:#fff,stroke:#2d9a67,stroke-width:2px
```

## Gateway Responsibilities

* Route requests by capability and service policy.
* Monitor per-capability health and error rates.
* Configure retry policy for BYOC-specific failure modes (cold starts, model loading).
* Set price ceilings per capability.
* Maintain failover to alternative Orchestrators when a BYOC node degrades.

A Gateway does not:

* Run model containers.
* Host model weights.
* Expose Orchestrator-internal model identifiers as public API contracts.
* Manage GPU allocation.
* Control what runs inside the BYOC container.

Prioritise real-time, GPU-bound, frame-based capabilities when selecting BYOC Orchestrators to connect to.
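
The routing side of these responsibilities can be pictured as a small policy check. The sketch below is illustrative only and is not go-livepeer code: the `Offer` shape, the `candidates` function, the `price_ceilings` mapping, the price units, and the example addresses are all assumptions made for this example. It filters Orchestrator offers by capability, health, and a per-capability price ceiling, then orders them so the list doubles as a failover order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One Orchestrator's advertisement for one capability (hypothetical shape)."""
    orchestrator: str      # Orchestrator address
    capability: str        # capability descriptor, e.g. "image-to-image"
    price: int             # advertised price in arbitrary units
    healthy: bool = True   # outcome of the Gateway's own health tracking


def candidates(offers, capability, price_ceilings):
    """Return healthy offers for `capability` at or under its ceiling, cheapest first.

    The returned order doubles as the failover order: if the first
    Orchestrator fails, the caller moves on to the next one.
    """
    ceiling = price_ceilings.get(capability)
    if ceiling is None:
        return []  # no policy configured for this capability: do not route
    matching = [
        o for o in offers
        if o.capability == capability and o.healthy and o.price <= ceiling
    ]
    return sorted(matching, key=lambda o: o.price)


# Hypothetical offers and ceilings, for illustration only.
offers = [
    Offer("orch-a.example:8935", "image-to-image", price=120),
    Offer("orch-b.example:8935", "image-to-image", price=90),
    Offer("orch-c.example:8935", "image-to-image", price=300),  # over the ceiling
    Offer("orch-d.example:8935", "depth", price=10, healthy=False),
]
price_ceilings = {"image-to-image": 150, "depth": 20}

ordered = [o.orchestrator for o in candidates(offers, "image-to-image", price_ceilings)]
print(ordered)  # ['orch-b.example:8935', 'orch-a.example:8935']
```

Note that nothing in the policy refers to a model name or to anything inside the container; the Gateway only sees the capability, the price, and its own health observations.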

Poor-fit batch workloads (large LLMs, multi-minute jobs, stateful pipelines) behind BYOC will degrade routing quality and increase latency for all jobs on the same Orchestrator.

## Capability Contracts

BYOC routing treats capabilities as stable API contracts, not model names. Orchestrators advertise capability descriptors (`image-to-image`, `depth`, `segmentation`, `style-transfer`). Your Gateway routes on the capability - the Orchestrator decides which model or container implementation serves it.

This means Orchestrators can update models without breaking your routing, multiple Orchestrators can compete to serve the same capability, and performance-based routing automatically favours faster or cheaper implementations.

**Anti-pattern:** Coupling your routing to model names. If you are making routing decisions based on `SG161222/RealVisXL_V4.0_Lightning` instead of `image-to-image`, you are working against the architecture.

## Routing Profiles

BYOC capabilities on the network fall into broad latency profiles. Understanding these helps you set appropriate retry timeouts and price ceilings.

Capabilities like `style-transfer`, `image-to-image`, and `video-to-video` run on Orchestrators that keep models warm with persistent GPU residency. Expect sub-second latency per frame.

**Routing guidance:** use a low retry timeout (5-10s) and expect stable latency; high variance indicates GPU contention. A low price ceiling is acceptable.

Capabilities like `depth`, `segmentation`, and `pose` are fast per frame (milliseconds) but may have a cold-start cost on the first request. Once warm, throughput is high.

**Routing guidance:** use a slightly higher retry timeout to account for cold starts. Per-request cost is very low relative to diffusion capabilities. Monitor for latency spikes, which indicate model eviction.

Some Orchestrators chain multiple capabilities in sequence (e.g. depth estimation feeding into a diffusion step). Expect higher latency and VRAM use than for either capability alone.
**Routing guidance:** expect higher latency per request; the price ceiling must account for the combined compute. Expect higher per-pixel rates.

BYOC works best for frame-based, stream-based, and short-lived GPU workloads. The full decision framework and the pipeline taxonomy that shows where BYOC fits are listed under Related Pages.

## BYOC Requirements

These constraints apply to BYOC containers on the network. Gateway operators enforce them through routing priority; developers must meet them for their containers to be routable.

The network assumes short, repeatable, stateless units of work. BYOC containers that maintain long-lived state between requests break retry and failover semantics. If a request fails and the Gateway retries on a different Orchestrator, a stateful container will produce inconsistent results.

Containers that take more than 10 seconds to serve their first inference will be deprioritised by Gateways tracking per-Orchestrator latency. Prefer Orchestrators that keep models warm. When evaluating a new BYOC Orchestrator, send a test request and measure cold-start latency before committing to high-traffic routing.

Containers that use excessive VRAM reduce the Orchestrator's ability to serve concurrent jobs. For real-time pipelines, prefer fp16 or quantised models - they use less VRAM with minimal quality loss.

The capability descriptor advertised must match what the container can actually serve. Mismatches cause routing failures and degrade the Orchestrator's reputation score.

The container must expose an HTTP endpoint implementing the Livepeer AI worker API. The full API contract and container requirements are covered in the page for teams building BYOC containers, listed under Related Pages.

## Health Tracking

BYOC routing requires per-capability health tracking instead of per-Orchestrator tracking. An Orchestrator may serve `image-to-image` perfectly while its `depth` capability is degraded. Track them independently.

BYOC containers may have longer cold-start times than standard `ai-runner` pipelines. Retry timeout settings (`-aiProcessingRetryTimeout`) are covered in the configuration page listed under Related Pages.
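
Per-capability tracking could be sketched as below. This is a minimal illustration under stated assumptions, not part of go-livepeer: the `CapabilityHealth` class, its sliding-window size, and the error-rate threshold are hypothetical. The point is only that the key is the (Orchestrator, capability) pair, not the Orchestrator alone.

```python
from collections import defaultdict, deque


class CapabilityHealth:
    """Track recent request outcomes per (orchestrator, capability) pair.

    Hypothetical helper: the window size and threshold are illustrative values.
    """

    def __init__(self, window=20, max_error_rate=0.5):
        self.max_error_rate = max_error_rate
        # Each pair keeps only its most recent `window` outcomes.
        self._results = defaultdict(lambda: deque(maxlen=window))

    def record(self, orchestrator, capability, ok):
        """Store one request outcome (True for success, False for failure)."""
        self._results[(orchestrator, capability)].append(ok)

    def error_rate(self, orchestrator, capability):
        """Fraction of failed requests in the window; 0.0 when nothing is recorded."""
        results = self._results.get((orchestrator, capability))
        if not results:
            return 0.0
        return results.count(False) / len(results)

    def is_healthy(self, orchestrator, capability):
        return self.error_rate(orchestrator, capability) <= self.max_error_rate


health = CapabilityHealth(window=10, max_error_rate=0.3)
for _ in range(5):
    health.record("orch-a.example:8935", "image-to-image", ok=True)
    health.record("orch-a.example:8935", "depth", ok=False)

# The same Orchestrator is healthy for one capability and degraded for another.
print(health.is_healthy("orch-a.example:8935", "image-to-image"))  # True
print(health.is_healthy("orch-a.example:8935", "depth"))           # False
```

A sliding window is used here so that an Orchestrator whose container recovers (for example, after a cold start) regains healthy status once recent requests succeed.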
**Failure modes specific to BYOC:**

* Cold-start delays: container loading model from disk for the first time
* GPU out-of-memory: container allocated too much VRAM, evicting other models
* Container crash: Docker container exited, Orchestrator not yet restarting it
* API mismatch: container endpoint returns unexpected schema

For each failure mode, the `AISessionManager` will attempt to route to an alternative Orchestrator. If no alternative is available, the request fails with an error returned to the client.

## Capability Discovery

BYOC capability discovery uses the same mechanisms as standard AI discovery. The full list of discovery methods and tools is covered in the pages listed under Related Pages.

When you identify a BYOC-capable Orchestrator, add it to your `-orchAddr` list. The `AISessionManager` will route BYOC requests to it when its advertised capability matches the request.

## Related Pages

* Pipeline taxonomy - video, batch AI, real-time AI, and BYOC.
* Standard AI pipeline routing, Orchestrator discovery, and AISessionManager details.
* Retry timeouts, AI routing flags, and per-capability price ceiling configuration.
* Decision framework for evaluating whether your AI workload belongs on Livepeer.
* Full architecture, container requirements, and setup for teams building BYOC containers.
* Per-capability health tracking, discovery error metrics, and alert configuration.

{/*
PURPOSE:
Journey step: "Custom containers on the network"
Gateway-operator guide for BYOC (Bring Your Own Container) routing and service policy.
Focuses on what the gateway operator needs to know and do - NOT how to build BYOC containers (that's developer/orchestrator territory).
Key framing: capabilities as API contracts, not model names. The gateway routes by capability; the orchestrator runs the container.

SECTION HOME: Guides → AI and Job Pipelines

JOURNEY POSITION:
1. Pipeline Overview - "What workloads can my gateway route?"
2. Video Transcoding Pipeline - "How do video jobs flow?"
3. AI Inference Pipeline - "How do AI jobs flow?"
4. BYOC Pipelines (this page) - "Custom containers on the network"
5. Pipeline Configuration - "Configure transcoding profiles and AI routing"

RELATED FILES (draw from):
- all-resources/v2-guidesres--byoc.mdx - PRIMARY (95%): 68 lines. Gateway-operator BYOC guide. Routing by capability/policy, health monitoring, retry config. Capability-as-contract framing.
- all-resources/v2-dev--ai-pipelines-byoc.mdx - PRIMARY (80%): 256 lines. Detailed BYOC implementation: what BYOC is/isn't, model fit matrix, capability routing philosophy, 3 implementation patterns (real-time diffusion, vision utility, etc.).
- all-resources/v2-dev--ai-pipelines-model-support.mdx - SECONDARY (60%): 250 lines. Model compatibility matrix: diffusion, control/conditioning, vision models. Three-tier ratings (green/yellow/red).
- all-resources/v2-dev--ai-pipelines-overview.mdx - SECONDARY (30%): 262 lines. BYOC as one of 3 integration patterns. Context for where BYOC fits.

CROSS-REFS:
- AI Inference Pipeline (this section) - BYOC is a pipeline type within AI inference
- Advanced Operations → Gateway Middleware - middleware can add custom routing for BYOC
- Setup → AI Configuration - base AI config that BYOC builds on
- Resources → AI API Reference - endpoints used for BYOC requests
*/}