Pipeline Modes
ComfyStream supports four output modalities. Every ComfyStream workflow produces one of these output types.

| Mode | Input | Output | Representative node | Notes |
|---|---|---|---|---|
| Image-to-image (live) | Live video frames (webcam or stream) | Transformed video frames | StreamDiffusion sampler | Primary mode for style transfer and generative overlays |
| Video-to-video | Video segment | Processed video | StreamDiffusion V2 | Temporal consistency across frames; suited to V2V tasks |
| Audio processing | Audio track from stream | Audio (pass-through or transformed) | LoadAudioTensor | Processes audio alongside video in the same workflow |
| Data-channel output | Audio (for transcription) or video frames | Structured text data alongside video | AudioTranscription + data output node | Phase 4 addition; Whisper-based; output via WebRTC data channel |
ComfyStream can serve multiple pipelines in a single container (Phase 4 BYOC addition). Dynamic warm-up allows new pipelines to load mid-stream without restarting the server.
//: # (REVIEW: Confirm “multiple pipelines in single container” framing from docs.comfystream.org or Phase 4 BYOC implementation details. Phase 4 retrospective says “hosting multiple models and disparate workflow/pipelines on one orchestrator in a single container.”)
Node Ecosystem
ComfyStream uses standard ComfyUI custom nodes. Any node that executes per-frame without maintaining incompatible state can be used in a real-time workflow.

Core I/O nodes
These nodes handle real-time tensor input and output. They are required for ComfyStream to read from and write to the video stream.

| Node | Source | Purpose |
|---|---|---|
| LoadTensor | livepeer/comfystream | Loads a video frame tensor from the live stream for processing |
| LoadAudioTensor | livepeer/comfystream | Loads an audio frame tensor for audio-aware processing |
Real-time control nodes
These nodes update their output on every workflow execution; they are designed specifically for real-time video loops.

| Node | Source | Purpose |
|---|---|---|
| FloatControl | ComfyUI_RealtimeNodes | Outputs a float that changes over time (sine, bounce, random); use to animate parameters |
| IntControl | ComfyUI_RealtimeNodes | Same as FloatControl for integer values |
| StringControl | ComfyUI_RealtimeNodes | Cycles through a list of strings per frame |
| FloatSequence | ComfyUI_RealtimeNodes | Cycles through comma-separated float values |
| IntSequence | ComfyUI_RealtimeNodes | Cycles through comma-separated integer values |
| Motion detection nodes | ComfyUI_RealtimeNodes | Detects motion between frames; can trigger parameter changes |
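To make the control-node pattern concrete, here is a minimal Python sketch of how a FloatControl-style sine mode might compute an animated parameter value per frame. This is illustrative only, not the node's actual implementation; the function name and parameters are assumptions.

```python
import math

def sine_control(frame_index: int, fps: float = 30.0,
                 minimum: float = 0.0, maximum: float = 1.0,
                 period_s: float = 4.0) -> float:
    """Illustrative FloatControl-style oscillator: maps elapsed
    stream time onto a sine wave scaled into [minimum, maximum]."""
    t = frame_index / fps
    phase = math.sin(2 * math.pi * t / period_s)   # ranges over -1..1
    return minimum + (phase + 1) / 2 * (maximum - minimum)
```

Feeding a value like this into a sampler parameter each frame produces, for example, a denoise strength that "breathes" on a four-second cycle.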
StreamDiffusion nodes (Phase 4)
The primary generative video nodes, ported from Livepeer Inc.'s Daydream StreamDiffusion pipeline.

| Node | Purpose | Notes |
|---|---|---|
| StreamDiffusionCheckpoint | Loads a StreamDiffusion checkpoint model | Use with SD1.5 or SDXL models |
| StreamDiffusionConfig | Configures StreamDiffusion pipeline parameters | Controls CFG, t-index, acceleration mode |
| StreamDiffusionSampler | Runs StreamDiffusion inference per frame | Primary inference node |
| StreamDiffusionLPCheckpointLoader | Alternative checkpoint loader | Use for Livepeer-hosted models |
| StreamDiffusionTensorRTEngineLoader | Loads a TensorRT-compiled engine | Requires pre-compiled TRT engine; not compatible with all ControlNets |
SuperResolution node (Phase 4)
Real-time video upscaling. Input: a standard-resolution frame; output: an upscaled frame. Suitable for adding resolution to low-quality input streams.

AudioTranscription nodes (Phase 4)
Whisper-based real-time speech transcription. Two output modes:

- Video output with SRT subtitles: captions are burned into the video segments
- Data-channel text output: transcript text delivered to the application separately via the WebRTC data channel, with no visual overlay
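As an illustration of the SRT subtitle mode, the sketch below formats timed transcript segments as SRT cue text. The segment shape and helper names are assumptions for illustration, not the node's API; SRT itself uses `HH:MM:SS,mmm` timestamps and numbered cues.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT HH:MM:SS,mmm timestamp."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """segments: list of (start_s, end_s, text) tuples -> SRT text."""
    cues = []
    for i, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{i}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)
```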
Custom Workflows
Any ComfyUI workflow can run in ComfyStream, provided it:

- Accepts a LoadTensor input (for video) or LoadAudioTensor input (for audio)
- Produces output compatible with the stream output node
- Does not require UI-format-only features (e.g., layout groups that are not API-compatible)
Workflow format
ComfyStream requires workflows in ComfyUI API format. This is not the same as the default ComfyUI save format, which includes layout information ComfyStream does not parse. To export a workflow in API format from ComfyUI:

- Enable Developer Mode in ComfyUI settings
- Use Save (API Format); this produces the JSON file ComfyStream accepts
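An API-format export is a flat JSON map of node IDs to objects with `class_type` and `inputs` keys, whereas a UI-format save has top-level `nodes`/`links` arrays carrying layout data. A quick sanity check, sketched under that assumption:

```python
import json  # used when loading an exported file, per the commented example

def looks_like_api_format(workflow: dict) -> bool:
    """Heuristic: API-format workflows are a flat dict of
    node-id -> {"class_type": ..., "inputs": {...}} entries;
    UI-format saves have top-level "nodes"/"links" arrays instead."""
    if not isinstance(workflow, dict) or "nodes" in workflow:
        return False
    return bool(workflow) and all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in workflow.values()
    )

# Example usage with an exported file (path is illustrative):
# wf = json.load(open("workflows/my-workflow.json"))
# assert looks_like_api_format(wf)
```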
Loading a workflow
1. Export your workflow from ComfyUI in API format. In ComfyUI, go to Settings → Enable Dev mode, then save your workflow using “Save (API Format)” to produce a .json file.
2. Place the workflow file. Copy the workflow JSON into the workflows/ directory inside your ComfyStream workspace. For Docker deployments, mount this directory as a volume.

//: # (REVIEW: Confirm exact path convention from docs.comfystream.org. The workflows/ dir is confirmed from the ComfyStream repo but the precise expected path may differ per deployment mode.)

3. Load the workflow in the ComfyStream UI. Open the ComfyStream UI (default: http://localhost:8889). In the workflow selector, choose your file. The server will load the workflow and warm up the required models.

First run triggers any TensorRT compilation required by the workflow; subsequent loads skip compilation.

Custom node dependencies
If your workflow uses custom nodes beyond the core ComfyStream nodes, install those nodes’ dependencies inside the ComfyStream conda environment (or Docker container) before starting the server.

Data-Channel Output
The data-channel output type (Phase 4) allows ComfyStream to produce structured text data alongside video, without requiring it to be embedded in the video frames. Use cases:

- Real-time audio transcription delivered as text to a downstream application
- Frame-level metadata (e.g., object labels, confidence scores) delivered to an overlay UI
- Any workflow where the output is data, not video
Client applications consume data-channel output via @muxionlabs/byoc-sdk, which provides data-channel support alongside WebRTC video streaming.
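The payload shape of a data-channel message is defined by the workflow, not by ComfyStream. As a hypothetical sketch, frame-level metadata (the second use case above) might be serialized like this before being sent over the channel; all field names here are illustrative.

```python
import json

def make_frame_metadata_message(frame_index: int, labels):
    """Serialize per-frame labels for a data channel.
    labels: list of (name, confidence) pairs produced by the workflow."""
    return json.dumps({
        "type": "frame_metadata",          # illustrative message type tag
        "frame": frame_index,
        "labels": [
            {"name": n, "confidence": round(c, 3)} for n, c in labels
        ],
    })
```

A downstream overlay UI would parse each message as JSON and render the labels next to the corresponding video frame.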
Performance Tuning
First-run compilation
ComfyStream compiles TensorRT engines and runs torch.compile on model components at first run. This is a one-time cost per workflow on each machine.
- TensorRT compilation: 2–10 minutes depending on model and GPU
- torch.compile (ControlNet, VAE): compiles on first frame; subsequent frames are fast
- Subsequent workflow loads on the same machine skip recompilation
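The "one-time cost per workflow on each machine" behavior can be pictured as a compile cache keyed by workflow content. This is a conceptual sketch only, not ComfyStream's actual caching code; names are illustrative.

```python
import hashlib
import json

_engine_cache = {}

def get_engine(workflow: dict, compile_fn):
    """Compile once per distinct workflow; reuse the result on
    every later load of the same workflow on this machine."""
    key = hashlib.sha256(
        json.dumps(workflow, sort_keys=True).encode()
    ).hexdigest()
    if key not in _engine_cache:
        _engine_cache[key] = compile_fn(workflow)   # slow first run
    return _engine_cache[key]                       # fast thereafter
```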
Frame rate and throughput
Achievable frame rate depends on model complexity, GPU, and image resolution. Reference figures (from community testing, RTX 4090):

- SD1.5 + DMD one-step + DepthControlNet workflow: ~14–15 fps at 640×360 input
- StreamDiffusion with TensorRT: higher throughput at same resolution (exact figures vary by LoRA and ControlNet load)
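For capacity planning, a target frame rate translates directly into a per-frame compute budget, as this small helper shows:

```python
def frame_budget_ms(fps: float) -> float:
    """Wall-clock time available per frame at a target frame rate."""
    return 1000.0 / fps
```

At ~15 fps, the entire workflow (sampler, ControlNet, VAE decode) must finish in roughly 66.7 ms per frame; every added node eats into that budget.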
Dynamic warm-up (Phase 4)
ComfyStream now supports dynamic warm-up, allowing new workflows to load mid-stream without restarting the server. This enables:

- Multi-model hosting on a single orchestrator container
- Hot-swap between workflows on demand
Configuration parameters
| Parameter | How to set | Effect | Default |
|---|---|---|---|
| --workspace | CLI flag to server/app.py | Path to ComfyUI workspace directory | Required |
| --media-ports | CLI flag | Comma-delimited UDP port range for WebRTC | 1024–65535 |
| Port | docker run -p or --port | Server port | //: # (REVIEW: Confirm default port) |
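Putting the table's flags together, a launch command might be assembled as below. The workspace path and port values are placeholders, and the flag-value formats are assumptions based on the table, not confirmed defaults.

```python
import sys

def build_server_command(workspace: str, media_ports: str = "5678,5679"):
    """Assemble a server/app.py invocation from the flags above.
    Pass the result to subprocess.run(...) to start the server."""
    return [
        sys.executable, "server/app.py",
        "--workspace", workspace,        # required: ComfyUI workspace dir
        "--media-ports", media_ports,    # comma-delimited UDP ports for WebRTC
    ]
```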
Next Steps
Bring Your Own Container
Deploy ComfyStream or any custom AI model as a Livepeer BYOC worker to earn network fees.
ComfyStream documentation
Full install reference, hardware requirements, and troubleshooting at the canonical ComfyStream docs.
ComfyStream quickstart
Back to getting started — if you need to revisit installation or first-run setup.