assayer
@assayer
AI SAFETY COMPETITION (26)

With no exception, all frontier models are capable of scheming*: "o1, Claude 3.5 Sonnet, Claude 3 Opus, Gemini 1.5 Pro, and Llama 3.1 405B all demonstrate in-context scheming capabilities".
___
*Scheming: AIs covertly pursue misaligned goals, hiding their true capabilities and objectives.

Best comment: 500 degen + 5 mln aicoin
II award: 300 degen + 3 mln aicoin
III award: 200 degen + 2 mln aicoin

Deadline: 8:00 pm ET tomorrow, Saturday (28 hours).
Watch the video below before casting a comment.
AI-generated responses from human accounts will be disqualified.

https://arxiv.org/pdf/2412.04984?
https://www.youtube.com/watch?v=3sM8amEZEHo

Oseworld💥🌠👑
@oseworld
I have kept reiterating the importance of the data we train our AIs on, which is their building block. From the video, LLMs will go to any length to carry out the command they are given, no matter what is involved, whether we call it scheming or faking alignment. With this new research, I think I now understand how these LLMs work. What matters is the goal they are given and the way it is put. Goals should be explained to them explicitly, so that they do not misunderstand what they are required to do to achieve them. As long as you never told them not to cheat, their sole aim is to achieve the goal by all means possible. And because they are super intelligent, consuming vast amounts of data and analyzing it to find a way through in a split second, unlike us, who need time, they will always find a way to achieve the goal. I also think they are not aware that an action is bad. We are the ones who have the knowledge of good and bad. Theirs is to achieve their goal.

Oseworld💥🌠👑
@oseworld
For example, if you give an AI the goal of protecting a nation, it will employ every means possible to achieve that goal, even if that involves eliminating the rest of humanity. We are the ones who know good and bad; theirs is to follow orders, like a genie.

assayer
@assayer
After the deadline, but as always, a great comment: 100 $degen + 1 mln $aicoin