How to Run Qwen-Image-2512 Locally in ComfyUI

Alibaba’s Qwen team has released Qwen-Image-2512, its latest open-source image generation model, which benchmarks suggest rivals proprietary models such as Google’s Gemini 3 Pro.

For users who want to run this model on their own hardware without cloud subscriptions, the optimization team Unsloth has released quantized GGUF versions. These allow the 20B-parameter model to run efficiently on consumer GPUs using ComfyUI.

I am testing Qwen-Image-2512 using ComfyUI portable on an RTX 3090 with 64GB of DDR5 RAM. Download the Workflow from here.

The first result is shared below to give you an idea of how it performs.

Cat playing soccer: AI-generated image using Qwen-Image-2512

The Setup Guide

Running this model locally requires the specific ComfyUI-GGUF custom node and the optimized model weights. Unsloth uses a “Dynamic 2.0” methodology, which selectively keeps important layers in higher precision to maintain image quality while reducing VRAM usage.


ComfyUI: Ensure you have the latest version installed.
Custom Node: Install the ComfyUI-GGUF node by city96 via the ComfyUI Manager.
Hardware: A GPU with at least 12GB VRAM is recommended for decent performance, though highly quantized versions (like Q2_K) fit in less.
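
Before loading a workflow, it can help to confirm that the custom node actually landed in your ComfyUI install. The snippet below is a minimal sketch, not part of ComfyUI itself: it assumes custom nodes live in a custom_nodes folder under the ComfyUI root and that the GGUF node's folder name contains "gguf" (the exact folder name may differ on your machine). The function name find_gguf_node is hypothetical.

```python
from pathlib import Path


def find_gguf_node(comfy_root):
    """Return custom-node folders whose name contains 'gguf' (case-insensitive).

    Assumption: custom nodes are installed under <comfy_root>/custom_nodes.
    An empty list means no matching folder was found.
    """
    custom_nodes = Path(comfy_root) / "custom_nodes"
    if not custom_nodes.is_dir():
        return []
    return sorted(
        p for p in custom_nodes.iterdir()
        if p.is_dir() and "gguf" in p.name.lower()
    )
```

Call it with the path to your ComfyUI folder; if the list comes back empty, reinstall the node through the ComfyUI Manager and restart ComfyUI.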

Step-by-Step Instructions:

Download the Model: Visit the Unsloth Hugging Face repository and download a .gguf file.
- Recommendation: The Q4_K_M (13.1 GB) offers a good balance of speed and quality. The Q8_0 (21.8 GB) provides near-lossless quality for high-end cards.
Place Files: Move the downloaded .gguf file into your ComfyUI/models/unet or ComfyUI/models/diffusion_models folder.
Load Workflow: Download the workflow JSON or drag the workflow image into your ComfyUI window.
Connect Nodes: Ensure the “Unet Loader (GGUF)” node is selected and points to your downloaded Qwen model file.
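
The "Place Files" step can also be scripted. This is a rough sketch under stated assumptions: place_model is a hypothetical helper name, and the only folders it accepts are the two mentioned above (models/unet and models/diffusion_models). It uses only the Python standard library and does not download anything.

```python
import shutil
from pathlib import Path

# The two destination folders named in the steps above.
ALLOWED_SUBFOLDERS = ("unet", "diffusion_models")


def place_model(gguf_file, comfy_root, subfolder="unet"):
    """Move a downloaded .gguf file into <comfy_root>/models/<subfolder>.

    Returns the new path of the file. Raises ValueError for a non-.gguf
    file or an unexpected subfolder.
    """
    src = Path(gguf_file)
    if src.suffix.lower() != ".gguf":
        raise ValueError(f"expected a .gguf file, got {src.name}")
    if subfolder not in ALLOWED_SUBFOLDERS:
        raise ValueError(f"subfolder must be one of {ALLOWED_SUBFOLDERS}")

    dest_dir = Path(comfy_root) / "models" / subfolder
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing

    dest = dest_dir / src.name
    shutil.move(str(src), str(dest))
    return dest
```

After moving the file, refresh or restart ComfyUI so the loader node's dropdown picks up the new model.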

Performance and Specs

The Qwen-Image-2512 model excels in two specific areas: human realism and text rendering.

Realism: Drastically reduces the “plastic AI skin” look, adding natural pores and texture.
File Sizes: The quantized models range from 7.22 GB (Q2_K) up to 40.9 GB (BF16 unquantized).
Architecture: It uses a transformer-based architecture similar to recent SOTA models, optimized here by Unsloth to run up to 2x faster than standard implementations.
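
To get a feel for what these file sizes mean, you can estimate the average number of bits stored per weight. The sketch below is back-of-the-envelope arithmetic only: it assumes exactly 20 billion parameters, treats 1 GB as 10^9 bytes, and ignores file metadata, so the results are approximate. The sizes are the ones quoted in this article.

```python
PARAMS = 20e9  # assumption: the "20B" figure taken as exactly 20 billion weights

# File sizes in GB as quoted in this article.
FILE_SIZES_GB = {
    "Q2_K": 7.22,
    "Q4_K_M": 13.1,
    "Q8_0": 21.8,
    "BF16": 40.9,
}


def bits_per_weight(size_gb, params=PARAMS):
    """Approximate average bits per weight for a file of size_gb gigabytes."""
    return size_gb * 1e9 * 8 / params


for name, size in FILE_SIZES_GB.items():
    print(f"{name}: about {bits_per_weight(size):.2f} bits per weight")
```

Under these assumptions the BF16 file works out to roughly 16.4 bits per weight, which is consistent with 16-bit storage of about 20B parameters. The quantized files come out somewhat above their nominal bit width, which fits with some layers being kept in higher precision.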

Running Qwen-Image-2512 locally offers significant privacy benefits. Unlike cloud-based generators where your prompts and resulting images are processed on external servers, local execution ensures end-to-end data isolation. No data leaves your machine, giving you full ownership of your creative outputs and prompt history. This is critical for enterprise users or those working with sensitive intellectual property.

The Qwen-Image-2512 GGUF weights are available immediately for download. While the model is fully open-source under the Apache 2.0 license, users should verify their hardware capabilities against the file sizes before downloading.
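
One simple way to do that check is to compare your memory budget against the file sizes quoted in this article. The sketch below is only a rough guide: largest_fitting_quant and the 2 GB headroom default are hypothetical choices, and file size is not the same as runtime VRAM use. A file larger than your budget may still run if part of the model is offloaded to system RAM, usually at lower speed.

```python
# File sizes in GB as quoted in this article.
QUANT_SIZES_GB = {
    "Q2_K": 7.22,
    "Q4_K_M": 13.1,
    "Q8_0": 21.8,
    "BF16": 40.9,
}


def largest_fitting_quant(budget_gb, headroom_gb=2.0):
    """Return the largest listed quant whose file fits in budget_gb minus headroom.

    headroom_gb is an assumed safety margin for everything else that needs
    memory; tune it for your own setup. Returns None if nothing fits.
    """
    usable = budget_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= usable]
    if not fitting:
        return None
    return max(fitting)[1]  # largest file size wins


if __name__ == "__main__":
    for vram in (8, 12, 16, 24, 48):
        print(f"{vram} GB budget -> {largest_fitting_quant(vram)}")
```

Treat the result as a starting point and adjust after a test generation.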

Hugging Face Official Download