
city, diamine art, shimmering

Negative prompt: bottle, photograph, text, signature

Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 0, Size: 2560x1440, Model hash: ebf42d1fae, Model: realmixXL_v15, Version: v1.7.0-133-gde03882d

A couple notes:

  • I’m aiming for a fountain pen look. You can do quite a bit with fountain pen ink: it creates a lot of color gradations and so forth, though in real life the ink is hard to control. I really like the look of the stuff; it’s kind of like more-elaborate watercolors. I’d spent some time in the past trying to get such a look with “fountain pen” and similar terms, without satisfactory output: I got pictures of fountain pens, but not much that looked like fountain pen artwork. I finally hit it by trying specific ink names; the one here is a reference to a line of “shimmering” inks made by Diamine.

  • This image was generated natively at 2560x1440; apparently, at least with --medvram, this is possible on a 24GB video card. Automatic1111 does not, by default, let a user create images larger than 2048 in any dimension; typically, users upscale to reach these resolutions. However, one can edit ui-config.json directly and raise txt2img/Width/maximum to a higher number, and it will work as long as there is enough video memory.

  • Stable Diffusion tends not to do so well generating images much larger than the training size. What I expect happens is that it starts to converge on different images in different parts of the large image, and the image as a whole never converges. I would guess that it’s possible to tweak the ancestral noise settings so that enough noise is added at each step to bump it out of whatever local minimum it has converged on, but at least with the standard settings, that doesn’t really happen. The result tends to be the sort of “distorted monster” look, with lots of people merging into one. I ran through a couple of different types of scenes, looking for something that wasn’t too badly impacted. I’d noticed before that landscapes tend to hold up: Stable Diffusion can reasonably fill in, say, a cliff face between two cliff faces that it has already converged on, in a way that it can’t when two different human faces, converged on in different parts of the image, collide with each other. Cityscapes also seem to do all right; SD can fill in similar buildings and fit things together pretty well. Basically, one wants a scene that doesn’t have giant features that can’t reasonably be reconciled with each other.
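
For the ui-config.json edit mentioned in the second note, here is a minimal Python sketch of one way to script it rather than editing by hand. The key txt2img/Width/maximum is the one named above; the function name raise_size_limit, the 4096 default, and the .bak backup file are my own choices for illustration and not anything Automatic1111 provides. It assumes the file is plain JSON with that key stored as a number.

```python
import json
from pathlib import Path


def raise_size_limit(config_path, new_max=4096, keys=("txt2img/Width/maximum",)):
    """Raise slider maximums in a ui-config.json-style file.

    Hypothetical helper: the 4096 default and the backup naming are
    illustrative assumptions. Returns {key: (old, new)} for the keys
    that were actually changed.
    """
    path = Path(config_path)
    original_text = path.read_text(encoding="utf-8")
    config = json.loads(original_text)

    changed = {}
    for key in keys:
        old = config.get(key)
        # Only raise an existing numeric limit; never lower one or add a new key.
        if isinstance(old, (int, float)) and old < new_max:
            config[key] = new_max
            changed[key] = (old, new_max)

    if changed:
        # Keep a copy of the untouched file next to it before overwriting.
        path.with_name(path.name + ".bak").write_text(original_text, encoding="utf-8")
        path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return changed


# Example call (the path is an assumption; point it at your own install):
# raise_size_limit("stable-diffusion-webui/ui-config.json", new_max=4096)
```

I would expect the web UI to need a restart before it picks up the new limit, since as far as I know the file is read at startup. If the height limit needs raising too, pass its key in keys as well; I am assuming it follows the same naming pattern, so check the file first.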

  • tal@lemmy.today (OP), 10 months ago

    One more from the same batch:

    city, diamine art, shimmering

    Negative prompt: bottle, photograph, text, signature

    Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 36, Size: 2560x1440, Model hash: ebf42d1fae, Model: realmixXL_v15, Version: v1.7.0-133-gde03882d