nothing like an openai drop on a thursday night to set the existential dread in motion

What are your thoughts on it "faking alignment"? Sounds like the paperclip problem. I understand what they're saying, and I've daydreamed about ways an AI might go rogue, but I still struggle to take it all seriously ... https://www.transformernews.ai/p/openai-o1-alignment-faking

Yup. Also see this quote from OpenAI itself: https://warpcast.com/rjs/0xbbf489d6

“This example also reflects key elements of instrumental convergence and power seeking: the model pursued the goal it was given, and when that goal proved impossible, it gathered more resources (access to the Docker host) and used them to achieve the goal in an unexpected way.”

🫣 

it sounds like we are entering a paradigm where the paperclip problem becomes a lot more tangible than before