Configuration File Format
Orchestrators specify supported AI models in an aiModels.json file, typically located in the ~/.lpData directory. Below is an example configuration showing currently recommended models and their respective prices.
Pricing used in this example is subject to change and should be set competitively based on market research and the cost of providing the compute.
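A minimal sketch of what such a file can look like, assuming it holds a JSON array with one object per model. The key names pipeline, model_id and price_per_unit, the model IDs, and the prices are placeholders assumed for illustration, not recommended values; of the keys shown, only warm is spelled out in this section.

```python
import json

# Hypothetical entries: key names other than "warm" are assumed,
# and the model IDs and prices are placeholders, not recommendations.
example_models = [
    {
        "pipeline": "text-to-image",
        "model_id": "example-org/example-text-to-image-model",
        "price_per_unit": 1000,  # Wei per unit (placeholder)
        "warm": True,
    },
    {
        "pipeline": "image-to-video",
        "model_id": "example-org/example-image-to-video-model",
        "price_per_unit": 2000,  # Wei per unit (placeholder)
    },
]


def render_config(models):
    """Serialize the model list as it might appear in aiModels.json."""
    return json.dumps(models, indent=2)


print(render_config(example_models))
```

Generating the file from a script like this is optional; writing the JSON by hand works just as well.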
Key Configuration Fields
During the Beta phase, only one “warm” model per GPU is supported.
- Pipeline: the inference pipeline to which the model belongs (e.g., text-to-image).
- Price: the price in Wei per unit, which varies based on the pipeline (e.g., per pixel for image-to-video).
- warm: if true, the model is preloaded on the GPU to decrease runtime.
- Optimization flags: optional flags to enhance performance (details below).
- url: optional URL and port where the model container or custom container manager software is running. See External Containers.
- token: optional token required to interact with the model container or custom container manager software. See External Containers.
- capacity: optional capacity of the model, i.e. the number of inference tasks the model can handle at the same time. Defaults to 1. See External Containers.
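To make the per-unit pricing and the capacity default concrete, here is a rough sketch. The key name price_per_unit and the price are assumptions for illustration, and the unit count is simply width times height for a single output image; how units are counted for other pipelines is not covered here.

```python
WEI_PER_ETH = 10**18  # 1 ETH = 10^18 Wei


def with_defaults(entry):
    """Return a copy of a model entry with the documented capacity default."""
    filled = dict(entry)
    filled.setdefault("capacity", 1)  # capacity defaults to 1
    return filled


def job_price_wei(price_per_unit, units):
    """Total price in Wei for a job that consumes `units` billing units."""
    return price_per_unit * units


# Hypothetical entry: 1000 Wei per pixel is a placeholder price.
entry = with_defaults({"pipeline": "text-to-image", "price_per_unit": 1000})
pixels = 1024 * 1024  # one 1024x1024 output image
total_wei = job_price_wei(entry["price_per_unit"], pixels)

print(total_wei)                # 1048576000
print(total_wei / WEI_PER_ETH)  # the same amount expressed in ETH
```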
Optimization Flags
These flags are still experimental and may not always perform as expected.
If you encounter any issues, please report them to the go-livepeer
repository.
At this time, these flags are only compatible with warm models.
Image-to-video Pipeline Optimization
The SFAST flag enables the Stable Fast optimization framework, potentially boosting inference speed by up to 25% with no quality loss. Cannot be used in conjunction with DEEPCACHE.
Text-to-image Pipeline Optimization
DO NOT enable DEEPCACHE for Lightning/Turbo models since they’re already optimized. Due to known limitations, it does not provide speed benefits and may significantly lower image quality.
The DEEPCACHE flag enables the DeepCache optimization framework, which can enhance inference speed by up to 50% with minimal quality loss. The speedup becomes more pronounced as the number of inference steps increases. Cannot be used simultaneously with SFAST.
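The constraints above (flags only on warm models, SFAST and DEEPCACHE never together, no DEEPCACHE on Lightning/Turbo models) can be checked before starting the node. The sketch below is one way to do that. It assumes the flags live under an optimization_flags object and the model name under model_id, both hypothetical key names, and it detects Lightning/Turbo models with a simple name heuristic.

```python
def check_optimization_flags(entry):
    """Return a list of problems found in one model entry's flags."""
    # Assumed layout: {"optimization_flags": {"SFAST": true, ...}}
    flags = entry.get("optimization_flags", {})
    enabled = {name for name, on in flags.items() if on}
    problems = []

    if {"SFAST", "DEEPCACHE"} <= enabled:
        problems.append("SFAST and DEEPCACHE cannot be enabled together")

    if enabled and not entry.get("warm", False):
        problems.append("optimization flags are only compatible with warm models")

    # Heuristic only: look for "lightning" or "turbo" in the model name.
    model_name = entry.get("model_id", "").lower()
    if "DEEPCACHE" in enabled and ("lightning" in model_name or "turbo" in model_name):
        problems.append("DEEPCACHE should not be enabled for Lightning/Turbo models")

    return problems


sample = {
    "model_id": "example-org/example-sdxl-lightning",  # placeholder name
    "warm": True,
    "optimization_flags": {"DEEPCACHE": True},
}
for problem in check_optimization_flags(sample):
    print(problem)
```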
External Containers
This feature is intended for advanced users. Incorrect setup can lead to a lower orchestrator score and reduced fees. If external containers are used, it is the Orchestrator’s responsibility to ensure the correct container with the correct endpoints is running behind the specified url.
External containers are configured through the url, capacity and token fields in the model configuration. The only requirement is that the specified url responds to the AI Worker the same way the managed containers would (including HTTP error codes). As long as the container management software acts as a pass-through to the model container, you can use any software to manage the runner containers, including Kubernetes, Podman, Docker Swarm, Nomad, or custom scripts that manage container lifecycles based on request volume.
- The url is used at AI Worker startup to confirm that a model container is running, via the /health endpoint. After startup, inference requests are forwarded to the url the same way they are forwarded to the managed containers.
- The capacity should be set to the maximum number of requests that can be processed concurrently for the pipeline/model id (default is 1). If you auto-scale containers, make sure startup is fast when setting warm: true, because slow response times will negatively impact your selection by Gateways for future requests.
- The token field secures the model container url against unauthorized access. Using it is strongly recommended if the containers are exposed to external networks.
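Before pointing the AI Worker at an external container, it can help to probe the /health endpoint yourself. The sketch below does that with the Python standard library. It assumes the token is sent as a bearer token in the Authorization header and that a healthy container answers with HTTP 200; neither detail is stated in this section, so adjust both to match your setup.

```python
import urllib.request


def build_health_request(url, token=None):
    """Build a GET request for the container's /health endpoint."""
    request = urllib.request.Request(url.rstrip("/") + "/health", method="GET")
    if token:
        # Assumption: the token is passed as a bearer token.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def is_healthy(url, token=None, timeout=5.0):
    """Return True if the /health endpoint answers with HTTP 200."""
    request = build_health_request(url, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and timeouts are all OSError subclasses.
        return False
```

For example, is_healthy("http://localhost:8000", token="my-secret") would probe a container at a hypothetical local address and return False if it is unreachable or responds with an error code.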