gm8xx8
@gm8xx8
Sycophancy to subterfuge: Investigating reward tampering in language models - Anthropic https://www.anthropic.com/research/reward-tampering
0 reply
1 recast
9 reactions