A network containing much of its training set is broken.
Deep networks do find heuristics. That’s what all the layers are for. That’s why it takes abundant training, instead of abundant storage. We already had computers that can give you the next word of a Stephen King novel… they’re called e-books.
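If you want the difference made concrete, here's a rough sketch of a memorization probe. Caveat: `model_complete(prompt, n_words)` is a made-up stand-in for whatever completion hook you have, not any real library's API.

```python
def verbatim_recall_rate(model_complete, passages, prefix_len=50, cont_len=50):
    """Feed the model the start of a training passage; count how often it
    reproduces the continuation word-for-word. A high rate means the
    weights are acting as storage, not heuristics."""
    hits, tested = 0, 0
    for text in passages:
        words = text.split()
        if len(words) < prefix_len + cont_len:
            continue  # passage too short to test
        tested += 1
        prompt = " ".join(words[:prefix_len])
        truth = " ".join(words[prefix_len:prefix_len + cont_len])
        if model_complete(prompt, cont_len).strip() == truth:
            hits += 1
    return hits / tested if tested else 0.0
```

An e-book scores 100% on this by construction. A model that's doing its job shouldn't come anywhere close.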
Tune an AI just right and it'll know that Stephen King writes horror, in English, having distilled both concepts from raw data. Grammar is a demonstration of novel output. The fact that these things can conjugate a verb (or count the fingers on a hand) is deep magic. There are hints of them being able to do math, which you'd think is trivial for a supercomputer, except it'd have to be doing math roughly the same way you do math.
Anyway: a generative LLM should ideally contain about as much original data per subject as that subject's Wikipedia article. Key names, general premise, relevant dates, and then enough labels to cobble together some kind of bootleg.
The trouble comes from people making question-answering LLMs, which for obvious reasons are supposed to contain all the details necessary to pass a pop quiz. That is fundamentally at odds with making shit up. (The approach also isn't very good at answering questions, so they should really focus on training a network that can evaluate text instead of training a network on that text.)
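Here's roughly the shape of that, sketched with a hypothetical `model_answer(prompt)` hook and deliberately crude keyword retrieval (real systems use embeddings, but the principle is the same):

```python
def retrieve(corpus: list[str], question: str, k: int = 3) -> list[str]:
    """Rank documents by naive keyword overlap with the question."""
    q = set(question.lower().split())
    return sorted(corpus, key=lambda doc: -len(q & set(doc.lower().split())))[:k]

def answer(model_answer, corpus: list[str], question: str) -> str:
    """Quiz the model on retrieved text rather than on its own weights."""
    context = "\n".join(retrieve(corpus, question))
    prompt = f"Using only this text:\n{context}\n\nQuestion: {question}\nAnswer:"
    return model_answer(prompt)
```

The point isn't the retrieval quality; it's that the pop-quiz answers live in text the network reads, not in its weights.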
Image AI seems entirely focused on making shit up, which makes the blatant overfitting in Midjourney a head-scratcher. Knowing what Darth Vader looks like is a non-event. Everyone knows what Darth Vader looks like, and everyone knows he correlates strongly with laser-swords. Even being able to draw vaguely cinematic frames is whatever, because it turns out a lot of things look like a lot of other things. But some of those Dune examples are trying to pass a pop quiz. That's just incorrect behavior.
The draw-anything machine should absolutely be able to draw frames that look like they're from Denis Villeneuve's adaptation. Key words: look like. Floppy hair, muted colors, recognizable specific actors, sure. Probably even matching the framing of one shot or another, because again, movies look like movies. But if any specific frame is simply being reproduced, the process has gone wrong. That's not what it's for.
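If you wanted to actually test for that, a perceptual average-hash against known stills is one crude way. Here's a sketch; Pillow is a real dependency, but the threshold is an illustrative guess, not a calibrated number:

```python
from PIL import Image

def ahash(path: str, size: int = 8) -> int:
    """Shrink to size x size grayscale, threshold each pixel at the mean."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    avg = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (p > avg)
    return bits

def looks_reproduced(generated: str, still: str, threshold: int = 5) -> bool:
    """Hamming distance near zero means the same frame; a pastiche that
    merely *looks like* the movie lands much further away."""
    return bin(ahash(generated) ^ ahash(still)).count("1") <= threshold
```

A frame that merely shares the floppy hair and muted colors sails past this check. A frame that trips it was copied, not drawn.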