Braeden Norris on Warpcast


One of my favorite ways to qualitatively evaluate language models is with Caesar ciphers. I usually start with a shift of 13 (the most popular one, better known as ROT13) and then try a less common shift. Claude 3 Opus can solve only the more common shift. (In my experience, GPT-4 can only occasionally solve a shift of 13.)
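For anyone who wants to try this, here is a rough sketch of how the test strings could be generated. The function name `caesar_shift`, the sample sentence, and the choice of 7 as the "less common" shift are all illustrative assumptions, not details from the original test.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet.

    Non-letter characters (spaces, digits, punctuation) pass through unchanged.
    """
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)
    return "".join(result)


# Hypothetical sample sentence; any plaintext works.
plaintext = "The quick brown fox jumps over the lazy dog."

# The common case: a shift of 13 (ROT13).
rot13_prompt = caesar_shift(plaintext, 13)

# A less common shift; 7 is an arbitrary choice for illustration.
rot7_prompt = caesar_shift(plaintext, 7)

# Decoding is just the negative shift, which gives the answer key
# to compare against the model's reply.
assert caesar_shift(rot7_prompt, -7) == plaintext
```

Paste `rot13_prompt` or `rot7_prompt` into the model, ask it to decode, and compare the reply with `plaintext`.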


The other fun thing is to test the limitations of tokenization.
