Architecture

How the repository turns agent input into ComfyUI execution.

The architecture is intentionally simple: expose workflows, map a small parameter surface, queue the job, poll for completion, and download images back to local disk.

System Model

Core components

SKILL.md: the agent-facing contract that explains how the skill is discovered and called.
scripts/registry.py: lists enabled workflows and the parameters exposed to the agent.
scripts/comfyui_client.py: injects args into a workflow, submits the prompt, waits, and downloads images.
scripts/server_manager.py: manages multi-server configuration from the CLI.
ui/: FastAPI plus the local dashboard for servers, workflows, and mapping edits.

Execution Flow

From natural language to image file

The agent asks the registry which workflows are enabled.
The repository resolves user intent into structured args.
The client maps those args into ComfyUI node fields using schema.json.
The client calls native ComfyUI endpoints such as /prompt, /history/{prompt_id}, and /view.
The output images are downloaded to local storage and returned to the caller.
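
The queue, poll, and download steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in scripts/comfyui_client.py: the helper names build_prompt_request, extract_images, and view_url are hypothetical, the server address is a placeholder, and the payload shapes are assumed from ComfyUI's native API (a JSON body with a "prompt" key for /prompt, and history outputs keyed by node ID with an "images" list).

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_prompt_request(base_url, workflow, client_id=None):
    """Build (but do not send) the POST /prompt request that queues a job."""
    payload = {"prompt": workflow}
    if client_id is not None:
        payload["client_id"] = client_id
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_images(history, prompt_id):
    """Collect image descriptors from a /history/{prompt_id} response.

    Returns an empty list when the prompt ID is absent, which this sketch
    treats as "not finished yet, keep polling".
    """
    entry = history.get(prompt_id)
    if entry is None:
        return []
    images = []
    for node_output in entry.get("outputs", {}).values():
        images.extend(node_output.get("images", []))
    return images


def view_url(base_url, image):
    """Build the /view URL used to download one output image."""
    query = urlencode({
        "filename": image["filename"],
        "subfolder": image.get("subfolder", ""),
        "type": image.get("type", "output"),
    })
    return f"{base_url.rstrip('/')}/view?{query}"


# Hypothetical history payload for a finished job with one saved image.
sample_history = {
    "abc123": {
        "outputs": {
            "9": {"images": [
                {"filename": "out_00001_.png", "subfolder": "", "type": "output"}
            ]}
        }
    }
}
urls = [
    view_url("http://127.0.0.1:8188", image)
    for image in extract_images(sample_history, "abc123")
]
```

A real client would send the request with urllib.request.urlopen, read prompt_id from the JSON response, and call /history/{prompt_id} in a loop with a delay and a timeout until extract_images returns something.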

Storage Model

How workflows are organized on disk

data/
  <server_id>/
    <workflow_id>/
      workflow.json
      schema.json

Each workflow lives in its own directory with exactly two files, so it is portable and easy to inspect. Keying the top level by server ID also gives the repository a clean namespace for multi-server execution.
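
As an illustration of how this layout can be walked, here is a minimal sketch that discovers workflows on disk. The function name list_workflows is hypothetical and is not taken from scripts/registry.py; the sketch only checks that both files exist and does not model how a workflow is marked as enabled.

```python
from pathlib import Path


def list_workflows(data_root):
    """Return sorted (server_id, workflow_id) pairs found under data_root.

    A directory counts as a workflow only if it contains both
    workflow.json and schema.json.
    """
    found = []
    for workflow_file in sorted(Path(data_root).glob("*/*/workflow.json")):
        workflow_dir = workflow_file.parent
        if (workflow_dir / "schema.json").is_file():
            found.append((workflow_dir.parent.name, workflow_dir.name))
    return found
```

Calling list_workflows("data") returns one pair per complete workflow directory, which is enough to address a workflow unambiguously across several servers.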

Why The Schema Layer Matters

ComfyUI graphs are flexible, but agents need contracts

A graph can contain dozens of nodes and many internal fields that should not be exposed directly. The schema layer narrows that surface into a predictable interface with aliases, descriptions, required flags, and types. That is what makes agent calls more reliable and easier to maintain.
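
To make the idea concrete, here is a minimal sketch of a schema-driven mapping. The schema layout shown (alias mapped to node_id, field, type, and required) is an assumption made for illustration, since the actual schema.json format is not described here, and apply_args is a hypothetical helper. The workflow dict follows ComfyUI's API format, where each node ID maps to a class_type and an inputs dict.

```python
import copy

# Assumed mapping from schema type names to Python types.
_TYPES = {"string": str, "int": int, "float": float, "boolean": bool}

# Hypothetical schema: two aliases exposed to the agent.
example_schema = {
    "prompt": {"node_id": "6", "field": "text", "type": "string", "required": True},
    "seed": {"node_id": "3", "field": "seed", "type": "int", "required": False},
}

# Trimmed API-format workflow with default values.
example_workflow = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 0}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "default"}},
}


def apply_args(workflow, schema, args):
    """Return a copy of the workflow with agent args written into node inputs."""
    unknown = set(args) - set(schema)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    patched = copy.deepcopy(workflow)
    for alias, spec in schema.items():
        if alias not in args:
            if spec.get("required", False):
                raise ValueError(f"missing required parameter: {alias}")
            continue  # keep the default stored in the workflow
        value = args[alias]
        expected = _TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly
        # unless the schema asks for a boolean.
        if isinstance(value, bool) and expected is not bool:
            raise TypeError(f"{alias} must be {spec['type']}")
        if not isinstance(value, expected):
            raise TypeError(f"{alias} must be {spec['type']}")
        patched[spec["node_id"]]["inputs"][spec["field"]] = value
    return patched


patched = apply_args(example_workflow, example_schema, {"prompt": "a red fox"})
```

The agent only ever sees the aliases prompt and seed. Node IDs and field names stay inside the schema, so a graph can be rewired without changing the agent-facing interface.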
