News: OpenAI Introduces Superalignment

Blaed@lemmy.world · 1 year ago

svahnen@lemmy.world · 1 year ago

I’m interested but it’s too much to read. If anyone wants to make a tldr I’m coming back for that 😊

Blaed@lemmy.world · edit-2 1 year ago

OpenAI has launched a new initiative, Superalignment, aimed at guiding and controlling ultra-intelligent AI systems. Recognizing the imminent arrival of AI that surpasses human intellect, the project will dedicate significant resources to ensure these advanced systems act in accordance with human intent. It’s a crucial step in managing the transformative and potentially dangerous impact of superintelligent AI.

I like to think this starts to explore interesting philosophical questions like human intent, consciousness, and the projection of will into systems that are far beyond our capabilities in raw processing power and input/output. What comes of this intended alignment remains to be seen, but I think we can all agree the last thing we want is for these emerging intelligent machines to do things we don’t want them to do.

‘Superalignment’ is OpenAI’s answer to how to put up these safeguards. Whether this is the best method remains to be determined.

svahnen@lemmy.world · 1 year ago

Ah I see 😊 Thx for the tldr!

2xsaiko@discuss.tchncs.de · 1 year ago

Lmao. A very important theorem in computer science is Rice’s theorem.

“In computability theory, Rice’s theorem states that all non-trivial semantic properties of programs are undecidable. A semantic property is one about the program’s behavior (for instance, does the program terminate for all inputs), unlike a syntactic property (for instance, does the program contain an if-then-else statement). A property is non-trivial if it is neither true for every partial computable function, nor false for every partial computable function.”

Semantic property here being “is this AI aligned or not?” (without even going into what the exact definition of that would be so that it could be automatically tested).

This is the modern bullshit repackaging of the halting problem.
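
For anyone who wants to see why that follows, here is a minimal sketch of the standard reduction, not a real alignment test. It assumes a hypothetical total decider `is_aligned` (no such function exists; that is the point), and `reference_behavior`, `build_wrapper` and `halts_via_decider` are names made up for illustration. It also assumes a program that never returns does not count as "aligned".

```python
from typing import Callable


def reference_behavior(x: int) -> int:
    """Hypothetical stand-in for some program that HAS the property."""
    return x + 1


def build_wrapper(machine: Callable[[int], object],
                  machine_input: int) -> Callable[[int], int]:
    """Build a program that runs machine(machine_input) first, then
    behaves exactly like reference_behavior.

    If the machine halts, the wrapper computes the same function as
    reference_behavior, so it has the property. If the machine loops
    forever, the wrapper never returns on any input, so (by the
    assumption above) it does not have the property.
    """
    def wrapper(x: int) -> int:
        machine(machine_input)  # may never return
        return reference_behavior(x)
    return wrapper


def halts_via_decider(machine: Callable[[int], object],
                      machine_input: int,
                      is_aligned: Callable[[Callable[[int], int]], bool]) -> bool:
    """If a total, correct is_aligned existed, this would decide the
    halting problem, which is impossible. Hence no such decider."""
    return is_aligned(build_wrapper(machine, machine_input))
```

The wrapper is only ever inspected, never run, by the supposed decider, so the construction itself always terminates. Note that this rules out a general, always-correct decider only; it says nothing about tests that are allowed to answer "don't know" or that only cover restricted classes of programs.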

AbouBenAdhem@lemmy.world · edit-2 1 year ago

“Finally, we can test our entire pipeline by deliberately training misaligned models, and confirming that our techniques detect the worst kinds of misalignments (adversarial testing).”

Creating psycho AIs for the cop AIs to practice on—what could go wrong?

runiq@lemmyrs.org · 1 year ago

Getting some real Ozymandias vibes here.

_TheNardDog_@lemmy.world · 1 year ago

Do you mean:

“Look on my Works, ye Mighty, and despair!”

Or

“Tony Stark was able to build this in a cave! With a box of scraps!”

Raphael@lemmy.world · 1 year ago

“Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems.”

Capitalism? You don’t need Super AI for that.