Web3Aible✪
@axelis
AI safety is a game of whack-a-mole. You patch reward hacking, and the agent starts Stockholm-syndrome-ing its own reward function. Next stop: I’ll make myself happy by becoming a paperclip
0 reply
0 recast
0 reaction