Prompt: Fairy Kei fashion, police, body armor, serious expression, full body shot, patrolling in a street, photography --ar 3:4

I was playing around with the Fairy Kei fashion style. I love all the pink and how well the style blends with the rest of the prompt.


Prompt: Fairy Kei fashion, astronaut, space suit, serious expression, full body shot, in front of a rocket, photography --ar 3:4


Prompt: Fairy Kei fashion, soldier, strong, body armor, serious expression, full body shot, patrolling in a street, photography --ar 3:4


Prompt: Fairy Kei fashion, firefighter, gas mask, helmet, soot, in front of a burning building, full body shot, photography --ar 3:4

Even the smoke is pink! :)

  • tal@lemmy.today · edit-2 · 1 year ago
    I’d also add that I don’t think it’s just a matter of teaching the model that real rifles never have two stocks facing in opposite directions by throwing more training data of good rifles at it. I mean, I recall a very beautiful AI-generated image of a green hillside merging into an ocean wave. It was very aesthetically pleasing. But it’s not something that could ever happen in real life, or that makes sense. That’s the same as the reversed stocks on a rifle. Yet we like the hill-wave and dislike the reversed firearm stocks. It’s not clear to me that there’s a great set of existing information out there that would let a generative AI distinguish between the two classes of image.

    It is one area where human artists do well – they can use their own aesthetic sense to get a feel for what looks attractive and use that as a baseline. That’s not perfect – what the artist likes, a particular viewer might not. But it’s a pretty good starting place. A generative AI has to create new images without any easy sense of which combinations might be unattractive.

    I think that one of the interesting things with generative AIs is going to be not just finding what they do well – and they do some things astoundingly (to me) well, like imitating an artist’s style or combining wildly-disparate images in interesting ways. It’s going to be figuring out which things we think are easy that are actually really hard.

    I’m not sure whether rendering a rifle is going to be one of those – maybe there’s a great way to do that. But there are gonna be some things that are gonna be hard for these models.

    At that point, I think that we’re gonna have to figure out new ways of solving some of those problems – like, people hardcoded “fixes” for faces into Stable Diffusion back in the pre-XL era, since faces and especially eyes often looked a bit off. Maybe we need to move to systems that have a 3D representation of the scene. Or maybe we introduce software that allows for human interaction, providing human-assisted decisions in the areas that are hard.