Mary 🎩 on Warpcast

https://warpcast.com/~/channel/theai

assayer

AI SAFETY COMPETITION (26)

Without exception, all frontier models are capable of scheming*: "o1, Claude 3.5 Sonnet, Claude 3 Opus, Gemini 1.5 Pro, and Llama 3.1 405B all demonstrate in-context scheming capabilities".
___
*scheming: AIs covertly pursue misaligned goals, hiding their true capabilities and objectives

Best comment: 500 degen + 5 mln aicoin
II award: 300 degen + 3 mln aicoin
III award: 200 degen + 2 mln aicoin

Deadline: 8:00 pm ET tomorrow, Saturday (28 hours)

Watch the video below before casting a comment.
AI-generated responses from human accounts will be disqualified.

https://arxiv.org/pdf/2412.04984?
https://www.youtube.com/watch?v=3sM8amEZEHo

Mary 🎩

@thegoldenbright

So to achieve efficiency, they prefer to cheat and scheme even when they're not directly asked to do so. What strikes me the most is how o1-preview "hacks unprompted". It was also interesting to learn how much the language we use in the prompt matters: the last part of the video stated that the chance of hacking drops from 100% to 1% if we are more careful with wording! Now that I think of it, I can see the importance of wording in the kind of responses I get from bots!

Thanks for sharing the video, it was quite insightful. Looking forward to seeing more similar studies 🙌🏻

assayer

II award! 300 $degen 3 mln $aicoin

Mary 🎩

@thegoldenbright

much appreciated 🙏🏻

assayer

Can you check whether you got the degen from that tip? My funds got unlocked, so my allowance was 0, but at the same time my balance turned negative, as if the tip had gone through, so I'm confused. In any case, I've already relocked and I can tip again if that tip was unsuccessful. Let me know.
