The gateway handles routing and payment negotiation. The orchestrator handles compute. Run both on one machine, off-chain, and watch a full inference request travel through the pair and return a result, with no wallet and no on-chain registration.

This tutorial runs a complete local AI inference pipeline: a gateway receives a client request, routes it to a local orchestrator, the orchestrator processes it through an AI runner container, and the result returns to the caller. Estimated time: 2 to 3 hours (most of this is model download time). What you will verify:
  • The gateway routes an inference request to the orchestrator
  • The orchestrator processes it through the AI runner
  • The response returns through the gateway to the caller
  • Each step is visible in the respective logs

Pipeline architecture

Client (curl)
      ↓ POST /text-to-image
Gateway (port 8936)
      ↓ routes job + PM ticket
Orchestrator (port 8935)
      ↓ dispatches to AI runner
AI runner container
      ↓ SDXL-Lightning inference on GPU
Orchestrator
      ↓ result + ticket evaluation
Gateway
      ↓ PNG response
Client
The gateway and orchestrator run as separate processes. In production, they run on separate machines. This tutorial runs both locally to make the log trace visible end-to-end.

Prerequisites

No ETH wallet, Arbitrum RPC, or on-chain registration required. This tutorial runs off-chain.
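The implicit prerequisites are Docker, curl, an NVIDIA GPU visible to Docker, and enough disk for the weights. A preflight sketch under those assumptions (the disk report and GPU check are heuristics, not requirements enforced by go-livepeer):

```shell
# Preflight sketch: Docker, curl, an NVIDIA GPU, and free disk for ~6 GB
# of model weights. Adjust for your machine; none of this is mandatory
# tooling beyond what the tutorial itself invokes.
need() { command -v "$1" >/dev/null 2>&1 || echo "missing: $1"; }

need docker
need curl

# GPU visibility (inference requires an NVIDIA GPU exposed to Docker):
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "nvidia-smi not found - GPU inference will not work"
fi

# Free space (KB) on the filesystem holding ~/.lpData:
df -Pk "$HOME" | awk 'NR==2 {print $4 " KB free"}'
```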

Step 1: Download the model

Download the model weights before starting either process. Both the orchestrator and AI runner need the weights present at startup.
mkdir -p ~/.lpData/models ~/.lpData-gateway
docker run --rm \
  -v ~/.lpData/models:/models \
  --gpus all \
  livepeer/ai-runner:latest \
  bash -c "PIPELINE=text-to-image MODEL_ID=ByteDance/SDXL-Lightning bash /app/dl_checkpoints.sh"
This downloads approximately 6 GB. Watch the download output and wait for completion. Verify:
ls -lh ~/.lpData/models/
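An interrupted download can leave truncated files that still look plausible in a directory listing. A rough size floor catches that; the 5 GB threshold below is an assumption derived from the ~6 GB quoted above, not an exact figure:

```shell
# Warn if the models directory is far smaller than the ~6 GB the
# SDXL-Lightning download should occupy. The 5 GB floor is a heuristic.
check_model_size() {
  local dir="$1" floor_kb="$2" kb
  kb=$(du -sk "$dir" | awk '{print $1}')
  if [ "$kb" -ge "$floor_kb" ]; then
    echo "OK: $dir holds ${kb} KB"
  else
    echo "WARN: $dir holds only ${kb} KB; the download may be incomplete"
    return 1
  fi
}

# 5 GB expressed in KB, run against the directory from this step:
if [ -d "$HOME/.lpData/models" ]; then
  check_model_size "$HOME/.lpData/models" $((5 * 1024 * 1024))
fi
```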

Step 2: Write aiModels.json

cat > ~/.lpData/aiModels.json << 'EOF'
[
  {
    "pipeline": "text-to-image",
    "model_id": "ByteDance/SDXL-Lightning",
    "price_per_unit": 4768371,
    "warm": true
  }
]
EOF
price_per_unit sets the orchestrator’s sell-side price. The gateway’s buy-side cap must be at or above this value for the job to route. In Step 4 the gateway is started with no explicit price cap, so it accepts any price.
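A malformed aiModels.json only surfaces as an orchestrator startup error, so it is worth validating the file first. A sketch using python3 for the JSON check; the field requirements mirror the example above, not an official schema:

```shell
# Validate aiModels.json: a non-empty JSON array in which every entry has
# pipeline, model_id, and a positive integer price_per_unit.
check_ai_models() {
  python3 - "$1" <<'PY'
import json, sys

with open(sys.argv[1]) as f:
    models = json.load(f)
assert isinstance(models, list) and models, "expected a non-empty JSON array"
for m in models:
    assert m.get("pipeline"), "pipeline must be set"
    assert m.get("model_id"), "model_id must be set"
    price = m.get("price_per_unit")
    assert isinstance(price, int) and price > 0, "price_per_unit must be a positive integer"
print("OK: %d model(s) configured" % len(models))
PY
}

if [ -f "$HOME/.lpData/aiModels.json" ]; then
  check_ai_models "$HOME/.lpData/aiModels.json"
fi
```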

Step 3: Start the orchestrator

In a terminal, start the orchestrator in off-chain mode with the AI worker:
docker run -d \
  --name livepeer-orchestrator \
  -v ~/.lpData/:/root/.lpData/ \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --network host \
  --gpus all \
  livepeer/go-livepeer:latest \
  -orchestrator \
  -transcoder \
  -nvidia 0 \
  -pricePerUnit 1000 \
  -serviceAddr 127.0.0.1:8935 \
  -cliAddr 127.0.0.1:7935 \
  -network offchain \
  -aiWorker \
  -aiModels /root/.lpData/aiModels.json \
  -aiModelsDir /root/.lpData/models
Wait for the warm model to load - this takes 2 to 5 minutes:
docker logs -f livepeer-orchestrator 2>&1 | grep -i "warm\|pipeline\|ai-runner\|error"
Expected:
Expected warm-model startup log
Starting AI worker
Pipeline text-to-image started
Warm model loaded: ByteDance/SDXL-Lightning
Verify the orchestrator is accepting connections locally:
curl http://localhost:7935/registeredOrchestrators
Expected: a JSON array with your orchestrator at 127.0.0.1:8935.
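Since warm-up takes minutes, polling beats re-running the curl by hand. A generic retry helper, sketched here with the CLI endpoint from this step and an assumed 300-second budget:

```shell
# Poll a command until it succeeds or a timeout elapses. The retry
# interval defaults to 5 s and can be overridden via WAIT_INTERVAL.
wait_for() {          # usage: wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
  local timeout="$1"; shift
  local interval="${WAIT_INTERVAL:-5}" waited=0
  until "$@" >/dev/null 2>&1; do
    waited=$((waited + interval))
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Once the orchestrator container from this step is running:
# wait_for 300 curl -sf http://localhost:7935/registeredOrchestrators \
#   && echo "orchestrator ready" || echo "timed out"
```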

Step 4: Start the gateway

In a new terminal, start an off-chain AI gateway pointing at the local orchestrator. The community remote signer handles payment operations:
docker run -d \
  --name livepeer-gateway \
  -v ~/.lpData-gateway/:/root/.lpData/ \
  --network host \
  livepeer/go-livepeer:latest \
  -gateway \
  -cliAddr 127.0.0.1:7936 \
  -httpAddr 0.0.0.0:8936 \
  -orchAddr http://127.0.0.1:8935 \
  -httpIngest \
  -remoteSignerAddr https://signer.eliteencoder.net \
  -network offchain
Key flags:
  • -orchAddr http://127.0.0.1:8935 - points directly at the local orchestrator (off-chain mode bypasses on-chain discovery)
  • -httpIngest - enables the AI inference HTTP endpoints
  • -remoteSignerAddr - community remote signer for payment ticket signing (no wallet needed)
  • Separate -cliAddr and -httpAddr from the orchestrator’s ports (7936 and 8936 vs 7935 and 8935)
The remote signer at signer.eliteencoder.net is a community-hosted service for testing. Check availability in #local-gateways on Discord before you start.
Verify the gateway started:
docker logs livepeer-gateway 2>&1 | grep -i "started\|gateway\|signer\|orchestrator" | head -10
Expected:
Expected gateway startup log
Gateway started on :8936
Connected to remote signer at https://signer.eliteencoder.net
Registered orchestrator: 127.0.0.1:8935
Verify the gateway API is responding:
curl http://localhost:8936/health
Expected: {"status":"ok"}

Step 5: Send an inference request through the gateway

Send a text-to-image request to the gateway on port 8936. Port 8935 is the gateway-to-orchestrator hop; do not call it directly:
curl -X POST http://localhost:8936/text-to-image \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "ByteDance/SDXL-Lightning",
    "prompt": "a coastal town in evening light, photorealistic",
    "width": 512,
    "height": 512,
    "num_inference_steps": 4
  }' \
  -o pipeline-output.png \
  --max-time 60
This request travels the full pipeline. A typical first inference takes 5 to 15 seconds (GPU kernels warm up on the first job). Subsequent requests take 2 to 4 seconds. Verify the output:
file pipeline-output.png
ls -lh pipeline-output.png
Expected: pipeline-output.png: PNG image data with a non-zero file size.
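A failed request can leave a JSON error body saved under the .png name. Checking the fixed 8-byte PNG signature is a stricter test than file size; a sketch assuming only head, od, and tr:

```shell
# Every PNG starts with the fixed signature 89 50 4e 47 0d 0a 1a 0a.
is_png() {
  [ "$(head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')" = "89504e470d0a1a0a" ]
}

if [ -f pipeline-output.png ]; then
  if is_png pipeline-output.png; then
    echo "valid PNG signature"
  else
    echo "not a PNG - the body is likely an error message; inspect it with cat"
  fi
fi
```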

Step 6: Trace the request through logs

The request left footprints in each component. Read the logs to understand what happened at each hop.
Gateway log - shows the routing decision and payment signing:
docker logs livepeer-gateway 2>&1 | grep -i "route\|signer\|ticket\|orchestrat" | tail -10
Expected entries:
Expected gateway log entries
Routing job to orchestrator: 127.0.0.1:8935
Calling remote signer: getOrchInfoSig
Calling remote signer: signTicket
Orchestrator log - shows job receipt, dispatch to AI runner, and result:
docker logs livepeer-orchestrator 2>&1 | grep -i "job\|ai-runner\|inference\|ticket" | tail -10
Expected entries:
Expected orchestrator log entries
Received AI job: text-to-image
Dispatching to AI runner container
Inference complete
Ticket received
AI runner container log - shows inference execution:
docker ps --filter name=livepeer | grep -v "livepeer-orchestrator\|livepeer-gateway"
# Find the ai-runner container name, then:
docker logs <ai-runner-container> 2>&1 | tail -20
Expected entries include model inference steps and output dimensions.

What happened

The request completed the full Livepeer AI pipeline:
  1. The curl request hit the gateway at :8936 on the /text-to-image endpoint.
  2. The gateway selected the local orchestrator at :8935 (the only option via -orchAddr), signed a payment ticket using the community remote signer, and forwarded the job request.
  3. The orchestrator received the job, forwarded it to the AI runner container via Docker-out-of-Docker, and waited for the result.
  4. The AI runner already had SDXL-Lightning in VRAM (warm: true pre-loaded it at startup), ran 4 diffusion steps, and returned a PNG.
  5. The orchestrator returned the result to the gateway and evaluated the payment ticket (in off-chain mode, settlement is handled by the remote signer instead of the Arbitrum TicketBroker).
  6. The gateway returned the PNG to the curl client.
In production, the orchestrator is registered on-chain and the gateway discovers it via the Livepeer protocol. Payment tickets settle on Arbitrum through the TicketBroker contract. The inference mechanics are identical.
Last modified on March 16, 2026