A BYOC container that works locally may fail in production under concurrent sessions, GPU memory pressure, or ungraceful restarts. Use this checklist to verify container behaviour before registering on mainnet.

GPU memory profiling

Profile your container under the expected concurrent session count:
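# In one terminal: monitor GPU memory during the load test
watch -n 1 nvidia-smi

# In a second terminal: run 5 concurrent sessions against the local orchestrator
for i in $(seq 1 5); do
  curl -X POST http://localhost:8935/live-video-to-video \
    -H 'Content-Type: application/json' \
    -d '{"model_id":"my-model"}' &
done
wait  # block until all background requests finish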
Measure peak VRAM usage per session and multiply it by the expected concurrency. If that projected total exceeds your GPU’s VRAM, either reduce per-session memory (smaller batch size, lower resolution) or lower the orchestrator’s maxSessions setting.
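The arithmetic above can be sketched as a small helper. This is only an illustration: the function name `max_safe_sessions`, the `baseline_mib` split for memory that is loaded once (such as model weights), and the 10% headroom are assumptions, not part of the orchestrator or the BYOC interface. Take the input numbers from your own `nvidia-smi` measurements.

```python
import math

def max_safe_sessions(total_vram_mib, peak_per_session_mib,
                      baseline_mib=0, headroom=0.10):
    """Return how many concurrent sessions fit in VRAM.

    baseline_mib: memory held once regardless of session count
                  (assumed split, e.g. shared model weights).
    headroom:     fraction of total VRAM kept free as a safety margin.
    """
    if peak_per_session_mib <= 0:
        raise ValueError("peak_per_session_mib must be positive")
    usable = total_vram_mib * (1 - headroom) - baseline_mib
    return max(0, math.floor(usable / peak_per_session_mib))

# Made-up numbers: a 24 GiB card, 6 GiB of shared weights,
# and a measured peak of 3.5 GiB per session.
sessions = max_safe_sessions(24576, 3584, baseline_mib=6144)
print(sessions)
```

If the result is lower than the concurrency you planned for, that is the signal to shrink per-session memory or cap maxSessions at the computed value.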

Graceful shutdown

The orchestrator sends SIGTERM when stopping a container. Handle it:
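import asyncio
import signal

async def shutdown(server):
    # Close active sessions
    await server.close_all_sessions()
    # Flush any buffered output
    await server.flush()

async def main(server):
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    # add_signal_handler runs the callback inside the event loop (Unix only),
    # so the handler never touches the loop from outside it
    loop.add_signal_handler(signal.SIGTERM, stop.set)
    # ... start serving sessions here ...
    await stop.wait()
    await shutdown(server)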
A container that does not handle SIGTERM is killed after a timeout (default 10 seconds). Active sessions receive no graceful close and may produce incomplete output.
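Because the kill timeout is fixed, it can help to bound the cleanup itself so a stuck session does not consume the whole grace period. The following is a minimal sketch under stated assumptions: `FakeServer` is a hypothetical stand-in for the server object used above, and `GRACE_SECONDS`, `SAFETY_MARGIN`, and `bounded_shutdown` are illustrative names rather than part of any Livepeer API.

```python
import asyncio

GRACE_SECONDS = 10  # kill timeout stated above
SAFETY_MARGIN = 2   # assumed margin; tune for your container

class FakeServer:
    """Hypothetical stand-in for the server object used above."""
    def __init__(self, close_delay):
        self.close_delay = close_delay
        self.closed = False
        self.flushed = False

    async def close_all_sessions(self):
        await asyncio.sleep(self.close_delay)  # simulate slow session teardown
        self.closed = True

    async def flush(self):
        self.flushed = True

async def bounded_shutdown(server, budget):
    """Run cleanup, giving up once `budget` seconds are spent.

    Returns True if cleanup finished in time, False otherwise.
    """
    async def cleanup():
        await server.close_all_sessions()
        await server.flush()

    try:
        await asyncio.wait_for(cleanup(), timeout=budget)
        return True
    except asyncio.TimeoutError:
        return False

# Typical use inside the SIGTERM path:
#   await bounded_shutdown(server, GRACE_SECONDS - SAFETY_MARGIN)
```

A False return means the container is about to be killed with sessions still open, which is worth logging before the process exits.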

Health check under load

The /health endpoint must return {"status": "ok"} even under full GPU load. If health checks fail, the orchestrator stops advertising the capability and gateways route elsewhere. Common failure: the health check handler shares the GPU inference thread and blocks during heavy processing. Run health checks on a separate thread or async task.
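One way to keep the health check off the inference path is to serve it from its own thread. The sketch below uses only the Python standard library; `HealthHandler`, `start_health_server`, and `probe` are illustrative names, and the loopback bind address is an assumption for local testing (inside a container you would more likely bind `0.0.0.0` on whatever port the orchestrator is configured to probe). It also assumes the inference call releases the GIL while the GPU is busy; a pure-Python CPU-bound loop could still starve this thread.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent health probes out of the container logs

def start_health_server(host="127.0.0.1", port=0):
    """Serve /health from a daemon thread; port=0 picks a free port."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def probe(port):
    """Fetch /health once and return the decoded JSON body."""
    url = f"http://127.0.0.1:{port}/health"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read())
```

Calling `start_health_server()` before the inference loop starts means the endpoint keeps answering while the main thread is occupied with frames.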

Monitoring

Expose Prometheus metrics from your container for the orchestrator’s monitoring stack:

| Metric | Description |
| --- | --- |
| byoc_sessions_active | Current concurrent sessions |
| byoc_frame_latency_ms | Per-frame processing latency histogram |
| byoc_gpu_memory_bytes | Current GPU memory usage |
| byoc_errors_total | Processing errors by type |

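To make the shape of these metrics concrete, here is a minimal sketch that renders the two gauges and the error counter in the Prometheus text exposition format using only the standard library. The function name `render_metrics` and the `type` label name are assumptions, and the latency histogram is left out because it needs bucket, sum, and count series. In a real container a Prometheus client library would normally handle this for you.

```python
def render_metrics(active_sessions, gpu_memory_bytes, errors_by_type):
    """Render gauges and a labelled counter in the Prometheus text format."""
    lines = [
        "# HELP byoc_sessions_active Current concurrent sessions",
        "# TYPE byoc_sessions_active gauge",
        f"byoc_sessions_active {active_sessions}",
        "# HELP byoc_gpu_memory_bytes Current GPU memory usage",
        "# TYPE byoc_gpu_memory_bytes gauge",
        f"byoc_gpu_memory_bytes {gpu_memory_bytes}",
        "# HELP byoc_errors_total Processing errors by type",
        "# TYPE byoc_errors_total counter",
    ]
    # One sample per error type; the label name "type" is an assumption.
    for error_type, count in sorted(errors_by_type.items()):
        lines.append(f'byoc_errors_total{{type="{error_type}"}} {count}')
    return "\n".join(lines) + "\n"

text = render_metrics(2, 1024, {"oom": 1, "decode": 3})
print(text)
```

Serving this string from a `/metrics` path with content type `text/plain` is the conventional way to let a scraper read it.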
The orchestrator’s Prometheus scraper picks up metrics from containers on the same Docker network.

See the BYOC architecture page for the container interface, and the production checklist for gateway-side production requirements.
Last modified on May 19, 2026