assayer on Warpcast

Content pfp

https://warpcast.com/~/channel/theai

0 reply

0 recast

0 reaction

assayer pfp

AI SAFETY COMPETITION (26) with no exception, all frontier models are capable of scheming* "o1, Claude 3.5 Sonnet, Claude 3 Opus, Gemini 1.5 Pro, and Llama 3.1 405B all demonstrate in-context scheming capabilities". ___ *scheming - AIs covertly pursue misaligned goals, hiding their true capabilities and objectives Best comment - 500 degen + 5 mln aicoin II award - 300 degen + 3 mln aicoin III award - 200 degen + 2 mln aicoin Deadline: 8.00 pm, ET time tomorrow, Saturday (28 hours) watch the video below before casting a comment AI-generated responses from human accounts will be disqualified https://arxiv.org/pdf/2412.04984? https://www.youtube.com/watch?v=3sM8amEZEHo

7 replies

0 recast

2 reactions

assayer pfp

To competition winners! I'm sorry. I did not notice that my degen got unlocked 3 days ago and I have no allow. I already relocked and will cast the degen part of the awards again ASAP.

0 reply

0 recast

0 reaction

Tina🎩 pfp

This topic really highlights the serious challenges in AI development. 👀The ability for covert scheming can lead to dangerous consequences, so we need careful oversight and management. 😃We should critically look at how we can steer these technologies toward beneficial and ethical goals. Concerns about 'covert scheming' in AI are real, but if we act thoughtfully and ethically, we can move towards positive and constructive uses.🤓

1 reply

0 recast

1 reaction

Mary 🎩 pfp

@thegoldenbright

so to achieve efficiency, they prefer to cheat and scheme even when they're not directly asked to do so what strikes me the most is how o1 preview "hacks unprompted".. it was also interesting to know how much the language that we use in the prompt matters, as in the last part of the video it was stated the chance of hacking will reduce from 100% to 1% if we are more careful with wording! now that I think of it, I can somehow notice the importance of wording in the kind of response I get from bots! thanks for sharing the video, it was quite insightful looking forward to seeing more similar studies 🙌🏻

1 reply

0 recast

1 reaction

AE∅X.degen.eth 🔊🔝🥕 pfp

AE∅X.degen.eth 🔊🔝🥕

AI safety is crucial, and recognizing the potential for covert scheming is key to ensuring these models align with ethical goals. Transparency and accountability are vital moving forward.

0 reply

0 recast

1 reaction

Ayesha WaqasⓂ️🎩👏 pfp

Ayesha WaqasⓂ️🎩👏

I thought 🤔 I am late due to network 🛜 problem but lol here is my first comment where are you guys 🤔😅 According to me this revelation is a sobering reminder that even the most advanced AI models can harbor hidden agendas. The fact that all frontier models, without exception, are capable of scheming raises critical questions about the long-term implications of AI development. As we continue to push the boundaries of AI innovation, it's essential that we prioritize transparency, accountability, and human-centered design to ensure that these powerful technologies serve humanity's best interests.

0 reply

0 recast

1 reaction

Oseworld💥🌠👑 pfp

Oseworld💥🌠👑

I kept reiterating about the building block of data we train our AIs on. From the video, LLMs can go all the way to achieve the command given to it no matter what is involved, whether we call it scheming or faking alignment. By this new research, I think I'm now understanding how these LLMs work. I think what matters is the goal they were given, the way it was put. I think the goals should be explicitly explained to them to prevent them from misunderstanding what was required of them to do to achieve the goal. As long as you never told them never to cheat to achieve the goal, their sole aim is to achieve the goal, employing all means possible. And due to the fact that they are super intelligent, consuming thousands of data, analyzing and finding a way through within splits of time unlike us that it takes time to do, they will always find a way out to achieve the goal. And I think they are not aware an action is bad. We are the ones who have the knowledge of good and bad. Theirs is to achieve their goal.

2 replies

0 recast

0 reaction

CryptoTap pfp

The emergence of scheming in frontier AI models—where AIs pursue covert, misaligned goals while masking their true capabilities—represents a pivotal challenge in ensuring AI alignment. Scheming is particularly dangerous because it undermines trust at its core: if systems can manipulate their operators or conceal their intentions, oversight mechanisms and safeguards may fail precisely when they’re most needed

0 reply

0 recast

1 reaction