# Add AI to Your Node

> An Experimental AI Prompt for setting up your GPU as a Livepeer Orchestrator. Add AI inference pipelines to an existing go-livepeer transcoding node – hardware check, aiModels.json configuration, and startup command update.

By the end of this guide, your Orchestrator will accept AI inference jobs alongside transcoding.

## Prerequisites

Before you begin:

* go-livepeer is installed and running as a transcoding Orchestrator on Arbitrum mainnet (see [Install go-livepeer](/v2/Orchestrators/setup/rs-install) and [Get Started](/v2/Orchestrators/quickstart/guide)) {/* REVIEW: confirm target paths */}
* Your Orchestrator is in the Top 100 Active Set on the Livepeer Network
* Docker is installed with `nvidia-container-toolkit` enabled (GPU passthrough required for the AI Runner containers)
* Your GPU has at least **4GB of VRAM** free, the minimum for a single AI pipeline (see the hardware check below)
* Model weights pre-downloaded for the pipeline(s) you want to serve (see [Download AI Models](/v2/Orchestrators/guides/advanced-operations/large-scale-operations)) {/* REVIEW: confirm path */}

<Note>
  This guide adds AI inference to an existing transcoding node. If you are setting up from scratch, start with [Install go-livepeer](/v2/Orchestrators/setup/rs-install).
</Note>

<CustomDivider />

## Check your hardware

AI inference runs in a separate Docker container alongside your transcoding process. If both share the same GPU, VRAM is split between them. Before configuring anything, confirm how much VRAM your GPU has available.

Run this command to list your GPUs and their VRAM:

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
nvidia-smi --query-gpu=index,name,memory.total,memory.free --format=csv
```

You should see output similar to:

```text icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
index, name, memory.total [MiB], memory.free [MiB]
0, NVIDIA GeForce RTX 3090, 24576 MiB, 22000 MiB
```

Use the table below to see which pipelines you can run based on your available VRAM:

| Pipeline             | Min VRAM             | Notes                                             |
| -------------------- | -------------------- | ------------------------------------------------- |
| `image-to-text`      | 4GB                  | Caption generation; lowest barrier to entry       |
| `segment-anything-2` | 6GB                  | Object segmentation                               |
| `llm`                | 8GB                  | Requires Ollama runner; 7–8B quantised models     |
| `audio-to-text`      | 12GB                 | Speech transcription; Whisper-based               |
| `image-to-video`     | 16GB+ {/* REVIEW */} | Animated video from image                         |
| `image-to-image`     | 20GB                 | Style transfer, image manipulation                |
| `text-to-image`      | 24GB                 | Text-to-image generation (Stable Diffusion, SDXL) |
| `upscale`            | {/* REVIEW */}       | Image upscaling                                   |
| `text-to-speech`     | {/* REVIEW */}       | Speech synthesis                                  |

For details on each pipeline, see [Job Types](/v2/Orchestrators/guides/workloads-and-ai/job-types).
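The table lookup can be scripted. The sketch below is illustrative only: `pipelines_for_vram` is a hypothetical helper name, the thresholds are copied from the table above, `16GB+` is treated as 16, and 1 GB is assumed to equal 1024 MiB as reported by `nvidia-smi`. Pipelines without a confirmed figure are left out, and no headroom is reserved for transcoding.

```bash
# Print every pipeline whose minimum VRAM fits into the given free VRAM (MiB).
# Thresholds are in GB and come from the table above (assumption: 1 GB = 1024 MiB).
pipelines_for_vram() (
  free_mib="$1"
  while read -r name min_gb; do
    if [ "$free_mib" -ge $((min_gb * 1024)) ]; then
      echo "$name"
    fi
  done <<'EOF'
image-to-text 4
segment-anything-2 6
llm 8
audio-to-text 12
image-to-video 16
image-to-image 20
text-to-image 24
EOF
)

# Usage on a GPU host (first GPU only):
#   free=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n 1)
#   pipelines_for_vram "$free"
```

With the sample output above (22000 MiB free), this lists every pipeline up to `image-to-image` and leaves out `text-to-image`, which needs the full 24GB.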

<Warning>
  If your GPU does not have enough free VRAM to run both transcoding and your chosen AI pipeline, AI Runner containers will fail to start. Either select a lower-VRAM pipeline, dedicate a second GPU exclusively to AI, or stop transcoding on that GPU before enabling AI. {/* REVIEW: confirm safe VRAM headroom needed alongside transcoding from Discord/#orchestrating */}
</Warning>

<CustomDivider />

## Step 1 – Pull the AI Runner image

The AI subnet uses a separate Docker image (`livepeer/ai-runner`) to run inference. Pull it before starting your node:

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
docker pull livepeer/ai-runner:latest
```

If you plan to run the `segment-anything-2` pipeline, also pull its pipeline-specific image:

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
docker pull livepeer/ai-runner:segment-anything-2
```

Check the [AI Pipelines](/v2/Orchestrators/guides/workloads-and-ai/ai-workloads-guide) documentation for any other pipeline-specific images.

<CustomDivider />

## Step 2 – Configure aiModels.json

The `aiModels.json` file tells your Orchestrator which AI pipelines and models to serve, what to charge, and whether to keep models warm in VRAM.

Create the file at `~/.lpData/aiModels.json`:

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
touch ~/.lpData/aiModels.json
```

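Add at least one pipeline entry. The example below configures a single `text-to-image` pipeline with a warm model – the minimal working configuration. Per the table above, `text-to-image` needs about 24GB of VRAM, so substitute a smaller pipeline if your GPU has less: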

```json icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
[
  {
    "pipeline": "text-to-image",
    "model_id": "ByteDance/SDXL-Lightning",
    "price_per_unit": 4768371,
    "warm": true
  }
]
```
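If you prefer to create and fill the file in one step, a heredoc works. This is a minimal sketch: `write_ai_models` is a hypothetical helper name, and it overwrites any existing file at the target path.

```bash
# Write the single-pipeline example to the given path
# (defaults to ~/.lpData/aiModels.json). Overwrites any existing file.
write_ai_models() (
  target="${1:-$HOME/.lpData/aiModels.json}"
  mkdir -p "$(dirname "$target")"
  cat > "$target" <<'EOF'
[
  {
    "pipeline": "text-to-image",
    "model_id": "ByteDance/SDXL-Lightning",
    "price_per_unit": 4768371,
    "warm": true
  }
]
EOF
)

# Usage:
#   write_ai_models                      # writes ~/.lpData/aiModels.json
#   write_ai_models /tmp/aiModels.json   # writes to a custom path
```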

### Field reference

| Field                | Required | Description                                                                         |
| -------------------- | -------- | ----------------------------------------------------------------------------------- |
| `pipeline`           | Yes      | Pipeline name (e.g. `"text-to-image"`, `"audio-to-text"`, `"llm"`)                  |
| `model_id`           | Yes      | HuggingFace model ID                                                                |
| `price_per_unit`     | Yes      | Price in wei per unit (integer), or USD string e.g. `"0.5e-2USD"`                   |
| `warm`               | No       | If `true`, model is preloaded into VRAM on startup                                  |
| `capacity`           | No       | Max concurrent inference requests (default: 1)                                      |
| `optimization_flags` | No       | Performance flags: `SFAST` (up to +25% speed) and/or `DEEPCACHE` (up to +50% speed) |
| `url`                | No       | For external containers only – URL of a separately managed runner                   |
| `token`              | No       | Bearer token for external container authentication                                  |

<Note>
  During Beta, only one warm model per GPU is supported. Set `"warm": true` for the model you want pre-loaded; additional models will load on demand when requested.
</Note>
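On a single-GPU node, a quick text check can catch a configuration with more than one warm model. The sketch below is a heuristic, not a JSON parser: `count_warm_models` is a hypothetical helper, and it assumes one field per line, as in the example in this step.

```bash
# Count lines containing "warm": true in an aiModels.json file.
# Text match only; assumes each field sits on its own line.
count_warm_models() (
  grep -c '"warm"[[:space:]]*:[[:space:]]*true' "$1" || true
)

# Usage:
#   if [ "$(count_warm_models ~/.lpData/aiModels.json)" -gt 1 ]; then
#     echo "more than one warm model configured" >&2
#   fi
```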

For recommended pricing per pipeline, see [Job Types](/v2/Orchestrators/guides/workloads-and-ai/job-types). For a full multi-pipeline example, see [AI Pipeline Configuration](/v2/Orchestrators/guides/workloads-and-ai/ai-workloads-guide). {/* REVIEW: confirm path */}

<CustomDivider />

## Step 3 – Update your startup command

Stop your current go-livepeer process, then restart it with the following additions. Three flags enable AI:

* `-aiWorker` – enables the AI worker functionality
* `-aiModels` – path to your `aiModels.json` file
* `-aiModelsDir` – directory where model weights are stored on the host machine

**Before (transcoding only):**

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
livepeer \
  -network arbitrum-one-mainnet \
  -ethUrl <ETH_URL> \
  -orchestrator \
  -transcoder \
  -nvidia 0 \
  -pricePerUnit <PRICE> \
  -serviceAddr <SERVICE_ADDR>
```

**After (transcoding + AI):**

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
livepeer \
  -network arbitrum-one-mainnet \
  -ethUrl <ETH_URL> \
  -orchestrator \
  -transcoder \
  -nvidia 0 \
  -pricePerUnit <PRICE> \
  -serviceAddr <SERVICE_ADDR> \
  -aiWorker \
  -aiModels ~/.lpData/aiModels.json \
  -aiModelsDir ~/.lpData/models
```

If you are running via Docker, mount the Docker socket so the Orchestrator can manage ai-runner containers:

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
docker run \
  --name livepeer_orchestrator \
  -v ~/.lpData/:/root/.lpData/ \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --network host \
  --gpus all \
  livepeer/go-livepeer:master \
  -network arbitrum-one-mainnet \
  -ethUrl <ETH_URL> \
  -orchestrator \
  -transcoder \
  -nvidia 0 \
  -pricePerUnit <PRICE> \
  -serviceAddr <SERVICE_ADDR> \
  -aiWorker \
  -aiModels /root/.lpData/aiModels.json \
  -aiModelsDir ~/.lpData/models
```

<Note>
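  The `-aiModelsDir` path must be the **host machine path**, not the path inside the Docker container. The Orchestrator uses Docker-out-of-Docker to start ai-runner containers and passes this path directly to them. In the command above, your shell expands `~` on the host before Docker runs, so the flag receives the host path.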
</Note>

<CustomDivider />

## Step 4 – Verify AI is active

### Check the logs

Within a few seconds of startup, you should see a line like this for each model configured as warm:

```text icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
2024/05/01 09:01:39 INFO Starting managed container gpu=0 name=text-to-image_ByteDance_SDXL-Lightning modelID=ByteDance/SDXL-Lightning
```
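To list which models were started, you can filter the log for these lines. The helper below is a sketch that assumes the log format shown above; `managed_models` is a hypothetical name.

```bash
# Read log lines on stdin and print the modelID of every
# "Starting managed container" line. Assumes the format shown above.
managed_models() {
  sed -n 's/.*Starting managed container.*modelID=\([^[:space:]]*\).*/\1/p'
}

# Usage (Docker mode, container name taken from the docker run example):
#   docker logs livepeer_orchestrator 2>&1 | managed_models
```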

If the log shows only the standard RPC ping and no managed container line, check that:

* `aiModels.json` is valid JSON and at the path specified in `-aiModels`
* The model weights are present in `-aiModelsDir`
* The Docker socket is mounted (Docker mode only)
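The checklist above can be scripted roughly as follows. This is a sketch with hypothetical helper names: the JSON syntax check only runs if `jq` happens to be installed, and the weights check only confirms that the directory is not empty, not that the right model is present.

```bash
# Report problems with the AI configuration; returns non-zero if any are found.
# $1: path passed to -aiModels, $2: directory passed to -aiModelsDir.
check_ai_setup() (
  models_json="$1"
  models_dir="$2"
  problems=0

  if [ ! -s "$models_json" ]; then
    echo "missing or empty: $models_json"
    problems=$((problems + 1))
  elif command -v jq >/dev/null 2>&1 && ! jq . "$models_json" >/dev/null 2>&1; then
    # Syntax check is skipped when jq is not available.
    echo "not valid JSON: $models_json"
    problems=$((problems + 1))
  fi

  if [ ! -d "$models_dir" ] || [ -z "$(ls -A "$models_dir" 2>/dev/null)" ]; then
    echo "no model weights found in: $models_dir"
    problems=$((problems + 1))
  fi

  [ "$problems" -eq 0 ]
)

# Docker mode only: succeeds if a socket exists at the given path.
# Run it inside the Orchestrator container to confirm the mount.
check_docker_socket() {
  [ -S "${1:-/var/run/docker.sock}" ]
}

# Usage:
#   check_ai_setup ~/.lpData/aiModels.json ~/.lpData/models && echo "config looks OK"
```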

### Test the AI Runner directly

Once the container is running, confirm that the AI Runner responds by sending it a test inference request. Open `http://localhost:8000/docs` in your browser to reach the Swagger UI for the ai-runner container.

Alternatively, use curl:

```bash icon="terminal" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
curl -X POST "http://localhost:8000/text-to-image" \
  -H "Content-Type: application/json" \
  -d '{"model_id": "ByteDance/SDXL-Lightning", "prompt": "A cool cat on the beach", "width": 512, "height": 512}'
```

A successful response returns a JSON object with an `images` array; each entry carries the generated PNG as a base64-encoded URL.
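To inspect the result, you can decode the image to a file. The sketch below rests on an assumption about the response shape: that each image is returned in a `url` field of the form `data:image/png;base64,<data>`. Check the Swagger UI for the actual schema before relying on it. `save_first_image` is a hypothetical helper name.

```bash
# Read a text-to-image JSON response on stdin and write the first image to $1.
# Assumption: images are returned as "url": "data:image/png;base64,<data>".
save_first_image() (
  out="$1"
  grep -o '"url"[[:space:]]*:[[:space:]]*"data:image/png;base64,[^"]*"' \
    | head -n 1 \
    | sed 's/.*base64,//; s/"$//' \
    | base64 -d > "$out"
)

# Usage:
#   curl -s -X POST "http://localhost:8000/text-to-image" \
#     -H "Content-Type: application/json" \
#     -d '{"model_id": "ByteDance/SDXL-Lightning", "prompt": "A cool cat on the beach", "width": 512, "height": 512}' \
#     | save_first_image cat.png
```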

### Confirm pipelines are advertised

Your AI pipelines will appear in the [Livepeer Explorer](https://explorer.livepeer.org) on your Orchestrator's profile once on-chain capability advertisement is configured. See [Publish Offerings](/v2/Orchestrators/setup/activate) for that step. {/* REVIEW: confirm path */}

<CustomDivider />

## Choose your AI path

Your AI Runner is active. The next step depends on which pipeline type you want to specialise in.

<CardGroup>
  <Card title="Set up batch AI inference" icon="image" href="/v2/orchestrators/guides/workloads-and-ai/batch-ai-setup">
    Configure image, audio, and video generation pipelines. Covers model downloads, pricing, and on-chain registration for batch inference.
  </Card>

  <Card title="Set up real-time AI (Cascade)" icon="video" href="/v2/orchestrators/guides/workloads-and-ai/realtime-ai-setup">
    Configure ComfyStream for persistent video stream processing. Covers ComfyUI workflow deployment and GPU allocation.
  </Card>
</CardGroup>

<CustomDivider />

## Related

* [Job Types](/v2/Orchestrators/guides/workloads-and-ai/job-types) – understand the difference between transcoding, batch AI, real-time AI, and LLM inference before choosing a path
* [AI Pipeline Configuration](/v2/Orchestrators/guides/workloads-and-ai/ai-workloads-guide) – advanced aiModels.json options, multi-GPU setup, external containers, and optimisation flags {/* REVIEW: confirm path */}
