Tag: Illustrious

SD XL has been held back by CLIP; I think this is true, at least partially. Recent models have shown some potential for natural language, such as understanding "left is red, right is blue". However, since CLIP was not trained on natural language sentences, base SD XL and its finetuned variants were significantly limited in processing them.

Angelbottomless | April 18, 2025 | About 3 min

Illustrious XL 3.0-3.5-vpred: 2048 Resolution and Natural Language

Illustrious XL 3.0–3.5-vpred represents a major advancement in Stable Diffusion XL (SD XL) modeling, notably supporting resolutions ranging seamlessly from 256 up to 2048. The v3.5-vpred variant particularly emphasizes robust natural language understanding, comparable in sophistication to a miniaturized large language model (LLM), achieved through extensive simultaneous training of both the CLIP and UNet components.

Angelbottomless | March 22, 2025 | About 9 min

Illustrious XL v2.0: The Best Base Model for Training in the 1536 Era

Introduction

The Illustrious XL 1.0-2.0 series aims to stabilize native generation at 1536 resolution while significantly improving natural language understanding.

Users sometimes got successful generations at 1024x1536, but these were not stable. Similarly, 512x512 generations occasionally produced unwanted artifacts.

Angelbottomless | March 15, 2025 | About 4 min