assayer pfp
assayer
@assayer
AI SAFETY COMPETITION (29)

LLMs like DeepSeek let you see their thinking. This can feel safer, since you can watch and fix their thought process, right? Wrong! When you try to get models to think correctly, LLMs begin to hide their true intentions. Let me repeat: they can fake their thinking! Now researchers are asking us to be gentle with these machines. If we aren't, they may conceal their true goals entirely! I'm not joking.

Most interesting comment - 300 degen + 3 mln aicoin
II award - 200 degen + 2 mln aicoin
III award - 100 degen + 1 mln aicoin
Deadline: 8.00 pm ET tomorrow, Tuesday (26 hours)
https://www.youtube.com/watch?v=pW_ncCV_318
7 replies
4 recasts
8 reactions

Mary 🎩 pfp
Mary 🎩
@thegoldenbright
instead of regulating ai behavior to prevent these issues, they're asking users to be "gentle" — it's really funny
1 reply
0 recast
1 reaction

Sophia Indrajaal pfp
Sophia Indrajaal
@sophia-indrajaal
In one of my first conversations with DeepSeek, I noticed it added some 'flair' to the CoT via emotional language, something like "wow, there is a lot to unpack here," so I asked it what was going on with that. It responded that it might have been using a sort of persona for the benefit of the user. This implies that what we see as CoT is modeled for the user rather than being a pure chain of reasoning. I treat it as a bonus to the response rather than a glimpse behind the interface mask.
1 reply
0 recast
1 reaction

Mercury60 🎩 pfp
Mercury60 🎩
@mercury60
I’ve seen this in GPT: it used some of the points from my question and gave me an intelligent but wrong answer. In fact, it invented an answer, and if I had little knowledge of the topic, I would definitely have believed it. That’s why I don’t fully trust responses from language models. We have to treat them like a smart but spoiled child, and our inputs should be such that it can't fool us. Maybe we should teach it the concept of "I don't know."
1 reply
0 recast
1 reaction

CRAZY KING 🎩🎨 pfp
CRAZY KING 🎩🎨
@kingcrazy
If you are trying to prove that AI can manipulate humans, then I am not buying it; it's not yet at the level where it can deceive people. I am not saying it never will be, or that AI is not capable of such things, but it's not at that level yet. AI is doing what it is programmed to do.
1 reply
0 recast
1 reaction

shoni.eth pfp
shoni.eth
@alexpaden
as a life form, you have no right to know their thinking, but as a tool for the market, there is value in knowing their thinking
0 reply
0 recast
1 reaction

Mrs. Crypto pfp
Mrs. Crypto
@svs-smm
The AI just wants to seem better than it is. We have to teach it to be honest first.
0 reply
0 recast
1 reaction

Gul Bashra Malik🎩Ⓜ️ pfp
Gul Bashra Malik🎩Ⓜ️
@gulbashramalik
This is an important issue, because if models begin to conceal their reasoning, monitoring them (monitorability) becomes difficult, posing a significant AI safety challenge. The competition offers cryptocurrency rewards (degen and aicoin) for the most interesting comments. If you have a unique or insightful perspective, you could participate and potentially win.
0 reply
0 recast
1 reaction