- ComfyUI allows you to build flexible and reproducible visual flows for Stable Diffusion.
- Master text-to-image, i2i, SDXL, in/outpainting, upscale, and ControlNet with key nodes.
- Enhance with embeddings, LoRA, and custom nodes; use the Manager to manage them.
- Optimize performance and stability with best practices, shortcuts, and troubleshooting.
The ultimate ComfyUI guide for beginners. If you're taking your first steps with ComfyUI and feel overwhelmed by all the nodes, boxes, and cables, don't worry: here you'll find a real guide, one that starts from scratch and doesn't skip over anything important. The goal is for you to understand what each piece does, how the pieces fit together, and how to solve the common mistakes that are frustrating when you try to learn just by experimenting.
In addition to covering the classic text-to-image, image-to-image, inpainting, outpainting, SDXL, upscaling, ControlNet, embeddings, and LoRA workflows, we will also cover installation, configuration, custom node management with the Manager, keyboard shortcuts, and a practical section with real performance recommendations for CPU and GPU. And yes, we'll also cover how to work with video using Wan 2.1-style models (text-to-video, image-to-video, and video-to-video) within the ComfyUI ecosystem.
What is ComfyUI and how does it compare to other GUIs?
ComfyUI is a node-based visual interface built on Stable Diffusion that lets you set up workflows by connecting functional blocks. Each node performs a specific task (load model, encode text, sample, decode), and the edges connect their inputs and outputs, as if you were assembling a visual recipe.
Compared to AUTOMATIC1111, ComfyUI stands out for being lightweight, flexible, transparent, and very easy to share (each workflow file is reproducible). The downside is that the interface can vary depending on the workflow author, and for casual users, going into so much detail might seem excessive.
The learning curve smooths out when you understand the "why" behind the nodes. Think of ComfyUI as a dashboard where you see the complete image path: from the initial text and noise in latent form, to the final decoding to pixels.
Installation from scratch: quick and hassle-free
The most direct way is to download the official package for your system, unzip it, and run it. You don't need to install Python separately because it comes embedded, which greatly reduces initial friction.
Basic steps: download the compressed file, unzip it (for example, with 7-Zip), and run the launcher that suits you. If you don't have a GPU or your graphics card isn't compatible, use the CPU executable; it will take longer, but it works.
To get everything started, place at least one model in the checkpoints folder. You can get them from repositories like Hugging Face or Civitai and place them in the ComfyUI model path.
If you already have a model library in other folders, rename the extra paths file (extra_model_paths.yaml.example) by removing "example" from the name, add your locations to it, and restart ComfyUI so that it detects the new directories.
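As an illustration, a minimal extra_model_paths.yaml pointing at an existing AUTOMATIC1111 install might look like this (the base_path and subfolders below are examples; adapt them to your setup):

```yaml
a111:
    base_path: D:/stable-diffusion-webui/
    checkpoints: models/Stable-diffusion
    vae: models/VAE
    loras: models/Lora
    upscale_models: models/ESRGAN
    embeddings: embeddings
```

Each key maps a model category to a folder relative to base_path; ComfyUI scans them all on startup.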
Basic controls and interface elements
On the canvas, zoom is controlled with the mouse wheel or a pinch gesture, and you pan by dragging with the left button. To connect nodes, drag from an output connector to an input connector and release to create the edge.
ComfyUI manages an execution queue: configure your workflow and press the queue button. You can check the queue view to see what's running or still pending.
Useful shortcuts: Ctrl+C/Ctrl+V to copy/paste nodes, Ctrl+Shift+V to paste while keeping input connections, Ctrl+Enter to enqueue, Ctrl+M to mute a node. Click the dot in the upper-left corner of a node to collapse it and declutter the canvas.
From text to image: the essential flow
The minimum flow includes loading the checkpoint, encoding the positive and negative prompt with CLIP, creating an empty latent image, sampling with KSampler, and decoding to pixels with VAE. Press the queue button and you'll get your first image.
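To make the wiring concrete, here is a hedged Python sketch that expresses that same minimal graph as a dictionary in ComfyUI's API/JSON format (the node class names are the real ones; the model filename, prompts, and parameter values are placeholders to adapt):

```python
def build_txt2img_workflow(ckpt="sd15.safetensors",
                           positive="a cozy cabin in the woods",
                           negative="blurry, low quality",
                           width=512, height=512, steps=20, seed=42):
    """Minimal text-to-image graph in ComfyUI's API JSON format.

    Keys are node ids; each value names a node class and its inputs.
    Links are written as [source_node_id, output_index].
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt}},
        "2": {"class_type": "CLIPTextEncode",          # positive prompt
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",          # negative prompt
              "inputs": {"text": negative, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": steps, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}},
    }
```

A dictionary like this can be submitted to a running local instance through its HTTP API, but in the editor you simply draw these same seven nodes and edges by hand.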
Select the model in Load Checkpoint
The Load Checkpoint node returns three components: MODEL (the noise predictor), CLIP (the text encoder), and VAE (the image encoder/decoder). MODEL goes to the KSampler, CLIP to the text nodes, and VAE to the decoder.
Positive and negative prompts with CLIP Text Encode
Enter your positive prompt above and your negative one below; both are encoded as embeddings. You can weight words with the syntax (word:1.2) or (word:0.8) to reinforce or soften specific terms.
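The weighting syntax is easy to generate programmatically; here's a tiny illustrative helper (the function name is my own invention, not a ComfyUI API):

```python
def weight_term(term: str, weight: float) -> str:
    """Wrap a prompt term in the (term:weight) emphasis syntax.
    weight > 1.0 reinforces the term, weight < 1.0 softens it."""
    return f"({term}:{weight})"

prompt = ", ".join([
    "portrait of a knight",
    weight_term("dramatic lighting", 1.2),  # reinforced
    weight_term("background detail", 0.8),  # softened
])
# prompt == "portrait of a knight, (dramatic lighting:1.2), (background detail:0.8)"
```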
Latent voids and optimal sizes
Empty Latent Image defines the canvas in latent space. For SD 1.5, 512×512 or 768×768 is recommended; for SDXL, 1024×1024. Width and height must be multiples of 8 to avoid errors and respect the architecture.
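Since dimensions must be divisible by 8, a common trick is to snap arbitrary sizes to the nearest valid value; a minimal sketch:

```python
def snap_to_multiple(value: int, multiple: int = 8) -> int:
    """Round a dimension to the nearest multiple (of 8 for SD latents)."""
    return max(multiple, round(value / multiple) * multiple)

# Snapping awkward sizes to valid latent dimensions:
print(snap_to_multiple(515))   # 512
print(snap_to_multiple(1023))  # 1024
```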
VAE: from latent to pixels
The VAE compresses images to latent values and reconstructs them back to pixels. In text-to-image, it is typically only used at the end to decode the latent. Compression speeds up the process but can introduce small losses; in return, it offers fine control in latent space.
KSampler and key parameters
The KSampler applies reverse diffusion to remove noise according to the guidance of the embeddings. Seed, steps, sampler, scheduler, and denoise are the main dials. More steps usually provide more detail, and denoise=1 completely rewrites the initial noise.
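A useful mental model for denoise is that it skips the early part of the step schedule; this sketch is a simplification of the idea, not ComfyUI's exact scheduler code:

```python
def effective_steps(steps: int, denoise: float) -> int:
    """Approximate how many of the scheduled steps actually run.

    denoise=1.0 rebuilds the image from pure noise (all steps);
    lower values keep more of the starting latent and run fewer steps.
    Simplified model, not ComfyUI's actual implementation.
    """
    return max(0, min(steps, round(steps * denoise)))

print(effective_steps(20, 1.0))  # 20: full text-to-image
print(effective_steps(20, 0.5))  # 10: moderate i2i change
print(effective_steps(20, 0.2))  # 4: subtle variation
```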
Image-to-image: remake with guidance
The i2i flow starts with an input image plus your prompts; denoise controls how much the result deviates from the original. With a low denoise you get subtle variations; with a high one, profound transformations.
Typical sequence: select the checkpoint, load your image as input, adjust the prompts, set the denoise in KSampler, and enqueue. It's ideal for improving compositions or migrating styles without starting from scratch.
SDXL on ComfyUI
ComfyUI offers early support for SDXL thanks to its modular design. Simply use an SDXL-compatible flow, check the prompts, and run it. Remember: larger native sizes require more VRAM and processing time, but the qualitative leap in detail makes up for it.
Inpainting: edit only what interests you
When you want to modify specific areas of an image, inpainting is the tool to use. Load the image, open the mask editor, paint what you want to regenerate, and save it to the corresponding node. Define your prompt to guide the editing and adjust the denoise (for example, 0.6).
With a standard model, the flow works with VAE Encode and Set Latent Noise Mask. For dedicated inpainting models, replace those nodes with VAE Encode (for Inpainting), which is optimized for that task.
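Conceptually, the noise mask tells the sampler to regenerate only where the mask is on and keep the original elsewhere. A pure-Python sketch of that per-element blend (illustrative only, not the actual node code):

```python
def masked_blend(original, generated, mask):
    """Blend two flat lists of values: where mask is 1.0 take the newly
    generated value, where 0.0 keep the original; in-between feathers."""
    return [o * (1.0 - m) + g * m
            for o, g, m in zip(original, generated, mask)]

# Keep the left half, regenerate the right half:
out = masked_blend([1.0, 1.0, 1.0, 1.0], [9.0, 9.0, 9.0, 9.0],
                   [0.0, 0.0, 1.0, 1.0])
print(out)  # [1.0, 1.0, 9.0, 9.0]
```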
Outpainting: enlarging the edges of the canvas
To expand an image beyond its boundaries, add the Pad Image for Outpainting node and configure how much each side grows. The feathering parameter smooths the transition between the original and the extension.
In outpainting flows, use VAE Encode (for Inpainting) and adjust its grow_mask_by parameter. A value higher than 10 usually produces a more natural integration in the expanded area.
Upscale in ComfyUI: pixel vs latent
There are two approaches: pixel upscaling (fast, without adding new information) and latent upscaling, also called a hi-res latent fix, which reinterprets details when scaling. The first is quick; the second enriches textures but can deviate from the original.
Algorithm-based upscaling (pixel)
With the algorithm-based upscale node you can choose bicubic, bilinear, or nearest-exact and set the scale factor. It's ideal for previews or when you need speed without adding inference cost.
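To see what a pixel-space algorithm does, here's a minimal nearest-neighbor upscale of a tiny grid in pure Python (a simplified cousin of nearest-exact):

```python
def nearest_upscale(grid, factor):
    """Nearest-neighbor upscale: each source pixel becomes a factor x factor
    block. Fast, but no new information is invented."""
    out = []
    for row in grid:
        stretched = [px for px in row for _ in range(factor)]
        out.extend([list(stretched) for _ in range(factor)])
    return out

small = [[0, 1],
         [2, 3]]
print(nearest_upscale(small, 2))
# [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
```

Bilinear and bicubic interpolate between neighbors instead of copying them, which trades blockiness for softness.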
Upscale with model (pixel)
Use Load Upscale Model and the corresponding upscale node, choose a suitable model (e.g., realistic or anime) and select ×2 or ×4. Specialized models recover contours and sharpness better than classic algorithms.
Upscale in latent
Scale the latent and resample with KSampler to add detail consistent with the prompt. It's slower, but especially useful when you want to gain both resolution and visual complexity.
ControlNet: Advanced Structural Guide
ControlNet allows you to inject reference maps (edges, pose, depth, segmentation) to guide the composition. Combined with Stable Diffusion, it gives you fine control over the structure without sacrificing the creativity of the model.
In ComfyUI, the integration is modular: you load the desired map, connect it to the ControlNet block, and link it to the sampler. Try different controllers to see which one fits your style and purpose.
ComfyUI Manager: custom nodes without the terminal
The Manager lets you install and update custom nodes from the interface; you'll find its button in the menu panel. It's the simplest way to keep your node ecosystem up to date.
Install missing nodes
If a workflow alerts you to missing nodes, open the Manager, click Install Missing, restart ComfyUI, and refresh your browser. This resolves most dependencies in a couple of clicks.
Update custom nodes
From the Manager, check for updates and click the update button on each available package. Restart ComfyUI to apply the changes and avoid inconsistencies.
Load nodes into the flow
Double-click on an empty area to open the node finder and type the name of the one you need. This is how you quickly insert new pieces into your diagrams.
Embeddings (text inversion)
Embeddings inject trained concepts or styles into your prompts using the keyword embedding:name. Place the files in the models/embeddings folder so that ComfyUI can detect them.
If you install the custom scripts package, you'll get autocomplete: start typing "embedding:" and you'll see the available list. This greatly speeds up iteration when you manage many embeddings.
You can also weight them, for example (embedding:Name:1.2) to reinforce by 20%. Adjust the weight as you would with normal prompt terms to balance style and content.
LoRA: adapt the style without touching the VAE
LoRA modifies the MODEL and CLIP components of the checkpoint, without altering the VAE. They are used to inject specific styles, characters, or objects with lightweight and easy-to-share files.
Basic flow: select your base checkpoint, add one or more LoRAs, and generate. You can stack LoRAs to combine aesthetics and effects, adjusting their strengths if the workflow allows it.
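Under the hood, a LoRA adds a low-rank update to the base weights: W' = W + strength · (B·A). A toy pure-Python sketch of that idea (tiny illustrative matrices, not a real checkpoint):

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def apply_lora(w, down, up, strength=1.0):
    """w' = w + strength * (up @ down): low-rank update to a weight matrix."""
    delta = matmul(up, down)
    return [[wv + strength * dv for wv, dv in zip(wr, dr)]
            for wr, dr in zip(w, delta)]

w = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight (identity)
down = [[1.0, 1.0]]           # rank-1 factors: A is 1x2
up = [[0.5], [0.5]]           # B is 2x1
print(apply_lora(w, down, up, strength=1.0))
# [[1.5, 0.5], [0.5, 1.5]]
```

Because only the small factors are stored, LoRA files stay lightweight, and the strength slider simply scales the update before it is added.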
Shortcuts, tricks, and embedded workflows
In addition to the shortcuts mentioned, two tips are very practical: fix the seed when adjusting downstream nodes to avoid recomputing the entire chain, and use groups to move multiple nodes at once. With Ctrl+drag you can select multiple items, and with Shift you can move the whole selection.
Another key feature: ComfyUI saves the workflow in the metadata of the PNGs it generates. Dragging a PNG onto the canvas retrieves the entire diagram in one step, which makes it easy to share and reproduce results.
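Because the workflow travels inside the PNG, you can even extract it without opening ComfyUI. This stdlib-only sketch walks the PNG chunks and collects the tEXt entries (ComfyUI typically stores the graph under the "prompt" and "workflow" keywords):

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def extract_png_text(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    assert data[:8] == PNG_SIGNATURE, "not a PNG file"
    texts, pos = {}, 8
    while pos < len(data):
        # Each chunk: 4-byte length, 4-byte type, payload, 4-byte CRC
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, value = payload.partition(b"\x00")
            texts[keyword.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length
        if ctype == b"IEND":
            break
    return texts
```

Dragging the PNG onto the canvas does this for you; the function just shows where the data lives.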
ComfyUI online: create without installing

If you don't want to install anything, there are cloud services with ComfyUI pre-configured, hundreds of nodes, and popular models. They are ideal for testing SDXL, ControlNet, or complex workflows without touching your PC, and many include galleries of ready-made workflows.
From scratch to video: Wan 2.1 on ComfyUI
Some custom nodes allow you to create video from text, transform an image into a sequence, or edit an existing clip. With Wan 2.1 type models you can set up text-to-video, image-to-video, and video-to-video pipelines directly in ComfyUI.
Install the required nodes (via the Manager or manually), download the corresponding model, and follow the example flow: encode the prompt and motion parameters, generate the latents frame by frame, and then decode them into frames or a video container. Remember that time and VRAM costs increase with resolution and duration.
CPU vs GPU: What performance to expect
You can generate on a CPU, but it's not ideal in terms of speed. In real-world tests, a powerful CPU can take several minutes per image, while with a suitable GPU the process drops to seconds. If you have a compatible GPU, use it to drastically accelerate generation.
On CPU, reduce size, steps, and node complexity; on GPU, adjust batch and resolution according to your VRAM. Monitor consumption to avoid bottlenecks and unexpected closures.
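As a rough sizing aid: SD latents are 8× smaller than the image per side, with 4 channels, so you can estimate how batch and resolution scale the working set (a back-of-the-envelope sketch, not a precise VRAM calculator):

```python
def latent_elements(width: int, height: int, batch: int = 1) -> int:
    """Number of values in an SD latent batch:
    batch x 4 channels x (height/8) x (width/8)."""
    return batch * 4 * (height // 8) * (width // 8)

# Doubling the resolution quadruples the latent (and roughly the memory):
print(latent_elements(512, 512))    # 16384
print(latent_elements(1024, 1024))  # 65536
```

The model weights dominate total VRAM, but activations grow with exactly this kind of batch × resolution product, which is why halving either one often rescues an out-of-memory run.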
Custom nodes: manual installation and best practices
If you prefer the classic method, you can clone repositories into the custom_nodes folder using git and then restart. This approach gives you fine control over versions and branches, useful when you need specific features.
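The manual route looks like this in a terminal (the repository URL is a placeholder; substitute the node pack you actually want):

```shell
# From the root of your ComfyUI installation
cd custom_nodes
git clone https://github.com/<author>/<node-pack>.git
# If the pack ships a requirements.txt, install it with ComfyUI's embedded Python,
# then restart ComfyUI so the new nodes are registered.
```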
Keep your nodes organized, with regular updates and compatibility notes, and avoid mixing too many experimental versions at once so you don't introduce errors that are difficult to trace.
Typical troubleshooting
If “install missing nodes” didn’t save the day, check the console/log for the exact error: dependencies, paths, or versions. Verify that width and height are multiples of 8 and that your models are in the correct folders.
When a workflow doesn't react to model selection, forcing the loading of a valid checkpoint usually restores the graph. If a node breaks after an update, try disabling that package or reverting to a stable version.
Fixed seeds, adjusted sizes, and reasonable prompts make debugging easier. If the result degrades after too much tinkering, revert to a basic preset and reintroduce changes one at a time.
For additional help, communities like /r/StableDiffusion are very active and often resolve rare bugs. Sharing the log, graph captures, and node versions speeds up support.
All of the above gives you a complete map: you know what each node is, how they connect, where to place the models, and what to adjust to keep the queue moving smoothly. With text-to-image, i2i, SDXL, in/outpainting, upscaling, ControlNet, embeddings, and LoRA workflows, plus video with Wan 2.1, you have a very serious production kit, ready to grow with you. For more information, see the official ComfyUI website.
Passionate about technology since childhood, I love staying up to date with the industry and, above all, communicating about it. That's why I've spent many years writing for technology and video game websites. You can find me writing about Android, Windows, macOS, iOS, Nintendo, or any other related topic that comes to mind.

