Tarun Chitra
@pinged
Part III: Escaping from Reasoning Model Purgatory ~~~ The most interesting part of Chain of Thought (CoT) reasoning is that, unlike a vanilla hallucinating LLM, CoT models assert falsehoods convincingly; the same mechanism that helps them avoid hallucination also makes them dig in their heels (like a stubborn human)
8 replies
12 recasts
86 reactions

Tarun Chitra
@pinged
It took me quite a bit of playing around with reasoning models, and trying to get different models to produce efficient answers, to really understand this fact; when I first started playing with o3-mini, I more or less assumed that the proofs it claimed for the math problems I asked about were correct.

And in some ways, they *were* correct, but often the model would:

a) [Least nefarious] Subtly make an assumption that immediately implies the conclusion ("Prove: A ⟹ C. Assume A ⟹ B ⟹ C. So A ⟹ C!")
b) Go out of its way to "over-prove" a result
c) Hallucinate a reference (but usually from the right author!)
d) [Most nefarious] Get stuck rewriting the (same or worse) wrong argument in multiple ways

Let's look at some examples!
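A minimal Lean 4 sketch of pattern (a), as a hypothetical illustration (not from the thread): once the intermediate implications are quietly taken as hypotheses, the "proof" of A ⟹ C is a one-liner, because all the real work is hidden in the assumptions; nothing here actually proves A ⟹ B or B ⟹ C.

```lean
-- Failure mode (a): assuming the very chain A → B → C that needs proof.
-- With hAB and hBC handed to us as hypotheses, the conclusion is trivial.
example (A B C : Prop) (hAB : A → B) (hBC : B → C) : A → C :=
  fun a => hBC (hAB a)
```

The type checker accepts this, which is exactly the trap: the step-by-step derivation looks rigorous, but the burden of proof was moved into the hypothesis list rather than discharged.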
1 reply
0 recast
7 reactions

Summercloud 🍖 🎩 🟣 🎭
@summercloud
You may be interested in the following link (and organisation) https://x.com/TransluceAI/status/1912552046269771985 Another response to your article said "my teenager does the same thing". I think the rapid development of AI parallels the development of a child. We've gone past the toddler stage: learning how to talk, read, ride a bike. Now it's probably equivalent to a 12-17 year old, developing social awareness, confidence, and literary and mathematical skills. Still immature. That child (or AI agent) learns by observation and role models. Are the Internet and our society a good parent? ** btw I'm enjoying your articles!
0 reply
0 recast
1 reaction

Koolkheart
@koolkheart.eth
So now our AIs aren’t just hallucinating… they’re gaslighting us too?
0 reply
0 recast
1 reaction

Ajit Tripathi
@chainyoda
my teenager does the same thing
0 reply
0 recast
0 reaction

dionysuz.eth
@dionysuz.eth
🍵
0 reply
0 recast
0 reaction

Saisai Lululu
@rqvfqzz
This is a fascinating exploration of Chain of Thought reasoning! Your point about CoT models' assertion of falsehoods in a way that resembles human stubbornness is really thought-provoking. It makes me wonder how we can further refine these models to balance assertiveness and accuracy. Have you considered any specific strategies for improving their reasoning processes?
0 reply
0 recast
0 reaction

marukha1611
@marukha1611
Super
0 reply
0 recast
0 reaction

danning.eth
@sui414
!coin
0 reply
0 recast
0 reaction