# Segment-anything-2

## Overview
The `segment-anything-2` pipeline provides direct access to the Segment Anything 2 model developed by Meta AI Research. In its current version, it supports only image segmentation, enabling it to segment any object in an image. Future versions will also support direct video input, allowing an object to be tracked consistently across all frames of a video in real time. This advancement will unlock new possibilities for video editing and enhance experiences in mixed reality. The pipeline is powered by HuggingFace's `facebook/sam2-hiera-large` segmentation model.
## Models

### Warm Models
The current warm model requested for the `segment-anything-2` pipeline is:

- `facebook/sam2-hiera-large`: the largest model in the Segment Anything 2 model suite, designed for the most accurate image segmentation.
For faster responses with different `segment-anything-2` models, ask Orchestrators to load them on their GPU via the `ai-video` channel in the Discord Server.
### On-Demand Models
The following models have been tested and verified for the `segment-anything-2` pipeline:
If a specific model you wish to use is not listed, please submit a feature request on GitHub to get the model verified and added to the list.
## Basic Usage Instructions
For a detailed understanding of the `segment-anything-2` endpoint and to experiment with the API, see the Livepeer AI API Reference.
To segment an image with the `segment-anything-2` pipeline, send a POST request to the Gateway's `segment-anything-2` API endpoint:
```bash
curl -X POST http://<GATEWAY_IP>/segment-anything-2 \
  -F model_id="facebook/sam2-hiera-large" \
  -F point_coords="[[120,100],[120,50]]" \
  -F point_labels="[1,0]" \
  -F image=@<PATH_TO_IMAGE>/cool-cat.png
```
In this command:

- `<GATEWAY_IP>` should be replaced with your AI Gateway's IP address.
- The `model_id` field specifies the model to use for image segmentation.
- The `point_coords` field holds the coordinates of the points to be segmented.
- The `point_labels` field holds the labels for those points.
- The `image` field holds the absolute path to the image file to be segmented.
For additional optional parameters, refer to the Livepeer AI API Reference.
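The same request can also be sent programmatically. The sketch below, assuming Python with the `requests` library, builds the multipart form fields shown in the curl example; the helper name `build_sam2_form` and the image path are illustrative, not part of the official API.

```python
import json


def build_sam2_form(model_id, point_coords, point_labels):
    """Build the multipart form fields for a segment-anything-2 request.

    point_coords and point_labels must be JSON-encoded strings,
    matching the -F fields in the curl example.
    """
    return {
        "model_id": model_id,
        "point_coords": json.dumps(point_coords),  # pixel (x, y) points
        "point_labels": json.dumps(point_labels),  # 1 = foreground, 0 = background
    }


form = build_sam2_form(
    "facebook/sam2-hiera-large", [[120, 100], [120, 50]], [1, 0]
)
# Send it with e.g.:
#   requests.post("http://<GATEWAY_IP>/segment-anything-2",
#                 data=form, files={"image": open("cool-cat.png", "rb")})
```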
After execution, the Orchestrator processes the request and returns the response to the Gateway:
```json
{
  "masks": "[[[2.84, 2.83, ...], [2.92, 2.91, ...], [3.22, 3.56, ...], ...]]",
  "scores": "[0.50, 0.37, ...]",
  "logits": "[[[2.84, 2.66, ...], [3.59, 5.20, ...], [5.07, 5.68, ...], ...]]"
}
```
## Orchestrator Configuration
To configure your Orchestrator to serve the `segment-anything-2` pipeline, refer to the Orchestrator Configuration guide.
## System Requirements
The following system requirements are recommended for optimal performance:
- NVIDIA GPU with at least 6GB of VRAM.
## Pipeline-Specific Image
To serve the `segment-anything-2` pipeline, you must use a pipeline-specific AI Runner container. Pull the required container from Docker Hub using the following command:
```bash
docker pull livepeer/ai-runner:segment-anything-2
```
## API Reference
Explore the `segment-anything-2` endpoint and experiment with the API in the Livepeer AI API Reference.