Core Foundations

Stable Diffusion Basics

The ultimate primer on Checkpoints, Nodes, KSamplers, and Latent Space for absolute beginners in ComfyUI.

1. The Anatomy of a Generation

Unlike Midjourney, which hides the magic behind a Discord bot, ComfyUI exposes the raw pipeline. Every image generation requires four core components wired together.

Component 01

Load Checkpoint

This loads the main "brain" (the `.safetensors` model file, such as SDXL or FLUX). It holds the model's learned visual knowledge: billions of trained weights rather than a literal database of images.

Component 02

CLIP Text Encode

This acts as a translator. It takes your English prompt and converts it into mathematical vectors the Checkpoint can understand.

Component 03

Empty Latent Image

Think of this as your blank canvas. You specify the width and height (e.g., 1024x1024) here *before* any drawing happens.

Component 04

The KSampler

The engine room. It takes the text vectors, the blank canvas, and the Checkpoint data, and runs the "denoising" steps that gradually form the image.
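The four components above can be sketched as a ComfyUI API-format workflow: a JSON-serializable dict where each node lists its class and its inputs, and a wire is a `["node_id", output_index]` pair. The node class names follow ComfyUI's API export; the checkpoint filename and prompt are placeholders, and a real workflow would typically use a second CLIP Text Encode node for the negative prompt.

```python
# Minimal sketch of the 4-component pipeline in ComfyUI's API format.
# Each value like ["1", 1] means "output #1 of node 1" (Load Checkpoint
# outputs MODEL, CLIP, and VAE as outputs 0, 1, and 2).
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},  # placeholder file
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1],
                     "text": "a cat in a spacesuit"}},             # placeholder prompt
    "3": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "4": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0],
                     "positive": ["2", 0],
                     "negative": ["2", 0],   # normally a separate encode node
                     "latent_image": ["3", 0],
                     "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
}
```

Notice that the KSampler (node 4) is where everything converges: it consumes the model, the conditioning, and the empty latent, exactly as described above.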

2. Understanding Latent Space

Stable Diffusion does not generate pixels directly from thin air. It operates in Latent Space—a compressed, mathematical representation of images.

When the KSampler finishes its job, the result is *not* an image file. It is a cluster of latent data. To actually see it, you must pass it through a VAE Decode node.

The VAE (Variational Auto-Encoder) acts as a decompressor. It takes the latent data and expands it into the visible RGB pixels your monitor can display, which you can then preview or write to disk with a Save Image node.

3. Wiring It All Together

ComfyUI relies on matching the color-coded noodles (connection wires): Model (purple) connects to Model, Conditioning (orange) to Conditioning, and Latent (pink) to Latent.
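The rule those colors enforce can be written down as a toy type check: an output socket can only plug into an input socket of the same data type. The type names mirror ComfyUI's (MODEL, CONDITIONING, LATENT, VAE, IMAGE), but the tables and the `can_connect` helper are purely illustrative, not ComfyUI's actual code.

```python
# What each node produces, in output order.
OUTPUTS = {"Load Checkpoint": ["MODEL", "CLIP", "VAE"],
           "CLIP Text Encode": ["CONDITIONING"],
           "Empty Latent Image": ["LATENT"],
           "KSampler": ["LATENT"],
           "VAE Decode": ["IMAGE"]}

# What each node's input sockets accept.
INPUTS = {"KSampler": {"model": "MODEL", "positive": "CONDITIONING",
                       "negative": "CONDITIONING", "latent_image": "LATENT"},
          "VAE Decode": {"samples": "LATENT", "vae": "VAE"},
          "Save Image": {"images": "IMAGE"}}

def can_connect(src_node, out_idx, dst_node, in_name):
    """True only if the output's type matches the input socket's type."""
    return OUTPUTS[src_node][out_idx] == INPUTS[dst_node][in_name]

print(can_connect("Load Checkpoint", 0, "KSampler", "model"))       # True: MODEL -> MODEL
print(can_connect("Empty Latent Image", 0, "KSampler", "positive"))  # False: LATENT != CONDITIONING
```

This is also why the UI simply refuses to let you drop a pink noodle onto an orange socket: the types do not match, so the connection cannot exist.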

The fastest way to learn is to download a JSON workflow from our database, drag it into your ComfyUI browser tab, and study how the nodes are wired!