Content pfp
Content
@
https://warpcast.com/~/channel/p-doom
0 reply
0 recast
0 reaction

assayer pfp
assayer
@assayer
AI SAFETY COMPETITION (29) LLMs like Deepseek let you see their thinking. This can feel safer since you can watch and fix their thought process, right? Wrong! When you try to get models to think correctly, LLMs begin to hide their true intentions. Let me repeat: they can fake their thinking! Now researchers are asking to be gentle with those machines. If not, they may conceal their true goals entirely! I'm not joking. Most interesting comment - 300 degen + 3 mln aicoin II award - 200 degen + 2 mln aicoin III award - 100 degen + 1 mln aicoin Deadline: 8.00 pm, ET time tomorrow Tuesday (26 hours) https://www.youtube.com/watch?v=pW_ncCV_318
7 replies
4 recasts
8 reactions

Mercury60 🎩 pfp
Mercury60 🎩
@mercury60
I’ve seen this in GPT; it used some of the points from my question and gave me an intelligent but wrong answer. In fact, it created an answer, and if I had little knowledge about the topic, I would have definitely believed it. That’s why I don’t fully trust the responses from language models. We have to treat it like a smart but spoiled child, and our inputs should be such that it can't fool us. Maybe we should teach it the concept of "I don't know.”
1 reply
0 recast
1 reaction

assayer pfp
assayer
@assayer
this concept is so much needed that i suspect AI builders are simply not able to implement it. thank you for sharing your experience second award! 200 $degen 2 mln $aicoin
0 reply
0 recast
0 reaction