> ## Documentation Index
> Fetch the complete documentation index at: https://docs.livepeer.org/llms.txt
> Use this file to discover all available pages before exploring further.

# Image-to-Image

## Overview

The `image-to-image` pipeline of the Livepeer AI network enables **advanced
image manipulations** including style transfer, image enhancement, and more.
This pipeline leverages cutting-edge diffusion models from the HuggingFace
[image-to-image](https://huggingface.co/models?pipeline_tag=image-to-image)
pipeline.

<div align="center">
  ```mermaid theme={"theme":{"light":"github-light","dark":"dark-plus"}}
  graph LR
      A[<div style="width: 200px;"><img src="https://mintlify.s3-us-west-1.amazonaws.com/na-36/images/ai/cool-cat.png" alt="Image of cool cat"/></div>] --> B[Gateway]
      E["A hat"] --> B
      B --> C[Orchestrator]
      C --> B
      B --> D[<div style="width: 200px;"><img src="https://mintlify.s3-us-west-1.amazonaws.com/na-36/images/ai/cool-cat-hat.png" alt="Image of cool cat with hat"/></div>]
  ```
</div>

## Models

### Warm Models

The current warm model requested for the `image-to-image` pipeline is:

* [timbrooks/instruct-pix2pix](https://huggingface.co/timbrooks/instruct-pix2pix):
  A powerful diffusion model that edits images to a high-quality standard based
  on human-written instructions.

<Tip>
  For faster responses with different
  [image-to-image](https://huggingface.co/models?pipeline_tag=image-to-image)
  diffusion models, ask Orchestrators to load it on their GPU via the `ai-video`
  channel in [Discord Server](https://discord.gg/livepeer).
</Tip>

### On-Demand Models

The following models have been tested and verified for the `image-to-image`
pipeline:

<Note>
  If a specific model you wish to use is not listed, please submit a [feature
  request](https://github.com/livepeer/ai-worker/issues/new?assignees=\&labels=enhancement%2Cmodel\&projects=\&template=model_request.yml)
  on GitHub to get the model verified and added to the list.
</Note>

<Accordion title="Tested and Verified Diffusion Models">
  * [timbrooks/instruct-pix2pix](https://huggingface.co/timbrooks/instruct-pix2pix): A powerful diffusion model that edits images to a high-quality standard based on human-written instructions.
  * [ByteDance/SDXL-Lightning](https://huggingface.co/ByteDance/SDXL-Lightning): A lightning-fast diffusion model by ByteDance, optimized for high-speed image-to-image transformations.
  * [SG161222/RealVisXL\_V4.0](https://huggingface.co/SG161222/RealVisXL_V4.0): A diffusion model that excels in generating high-quality, photorealistic images.
  * [SG161222/RealVisXL\_V4.0\_Lightning](https://huggingface.co/SG161222/RealVisXL_V4.0_Lightning): A streamlined version of RealVisXL\_V4.0, designed for faster inference while still aiming for photorealism.
  * [stabilityai/sd-turbo](https://huggingface.co/stabilityai/sd-turbo): A robust diffusion model by Stability AI, designed for efficient and high-quality image generation ([limited-commercial use license](https://stability.ai/license)).
  * [stabilityai/sdxl-turbo](https://huggingface.co/stabilityai/sdxl-turbo): An extended version of the sd-turbo model, offering enhanced performance for larger and more complex tasks ([limited-commercial use license](https://stability.ai/license)).
</Accordion>

## Basic Usage Instructions

<Tip>
  For a detailed understanding of the `image-to-image` endpoint and to
  experiment with the API, see the [Livepeer AI API
  Reference](/ai/api-reference/image-to-image).
</Tip>

To generate an image with the `image-to-image` pipeline, send a `POST` request
to the Gateway's `image-to-image` API endpoint:

```bash theme={"theme":{"light":"github-light","dark":"dark-plus"}}
curl -X POST https://<GATEWAY_IP>/image-to-image \
    -F model_id="timbrooks/instruct-pix2pix" \
    -F image=@<PATH_TO_IMAGE>/cool-cat.png \
    -F prompt="a hat"
```

In this command:

* `<GATEWAY_IP>` should be replaced with your AI Gateway's IP address.
* `model_id` is the diffusion model for image generation.
* The `image` field holds the **absolute** path to the image file to be
  transformed.
* `prompt` is the text description for the image.

For additional optional parameters, refer to the
[Livepeer AI API Reference](/ai/api-reference/image-to-image).

After execution, the Orchestrator processes the request and returns the response
to the Gateway:

```json theme={"theme":{"light":"github-light","dark":"dark-plus"}}
{
  "images": [
    {
      "nsfw": false,
      "seed": 3197613440,
      "url": "https://<GATEWAY_IP>/stream/dd5ad78d/7adde483.png"
    }
  ]
}
```

The `url` in the response is the URL of the generated image. Download the image
with:

```bash theme={"theme":{"light":"github-light","dark":"dark-plus"}}
curl -O "https://<STORAGE_ENDPOINT>/stream/dd5ad78d/7adde483.png"
```

## Applying LoRa Models

To apply LoRa filters to an image, include the `loras` field in your request:

```bash theme={"theme":{"light":"github-light","dark":"dark-plus"}}
curl -X POST https://<GATEWAY_IP>/image-to-image \
    -F model_id="ByteDance/SDXL-Lightning" \
    -F image=@<PATH_TO_IMAGE>/cool-cat.png \
    -F prompt="a hat" \
    -F loras='{ "nerijs/pixel-art-xl": 1.2 }'
```

You can find a list of available LoRa models for various models on
[lora-studio](https://huggingface.co/spaces/enzostvs/lora-studio).

## Orchestrator Configuration

To configure your Orchestrator to serve the `image-to-image` pipeline, refer to
the [Orchestrator Configuration](/ai/orchestrators/get-started) guide.

### System Requirements

The following system requirements are recommended for optimal performance:

* [NVIDIA GPU](https://developer.nvidia.com/cuda-gpus) with **at least 20GB** of
  VRAM.

## Recommended Pipeline Pricing

<Note>
  We are planning to simplify the pricing in the future so orchestrators can set
  one AI price per compute unit and have the system automatically scale based on
  the model's compute requirements.
</Note>

The pricing for the `image-to-image` pipeline is based on competitor pricing.
However, we strongly encourage orchestrators to set their own pricing based on
their costs and requirements. Setting a competitive price will help attract more
jobs, as Gateways can set their maximum price for a job. The current recommended
pricing for this pipeline is `1.9073484e-08 USD` per **input pixel**
(`height * width * output images`).

## API Reference

<Card title="API Reference" icon="rectangle-terminal" href="/ai/api-reference/image-to-image">
  Explore the `image-to-image` endpoint and experiment with the API in the
  Livepeer AI API Reference.
</Card>
