shazow
@shazow.eth
The coding tasks that are easy with AI work for the same reason that LLMs work at all: If we can arrange the context in a way that the most likely next step is a valid one, then we're going to get great results! On the other hand, if we have tasks where it's difficult to manufacture that kind of "path dependence" for valid outputs, AI is still unable to help. There's a lesson about life here as well: Setting ourselves up for success is extremely powerful. Being left to flail in an amorphous space of possibility can be paralyzing.
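A minimal sketch of the "path dependence" point, using a toy bigram count model (illustrative only; the corpus and variable names are made up, not from the post): a vague prefix leaves many plausible next steps, while a prefix that sets things up well concentrates the distribution on a valid continuation.

```python
from collections import Counter, defaultdict

# Toy corpus, purely for illustration.
corpus = [
    "write tests then run the tests",
    "write tests then run the linter",
    "write docs then publish",
    "write whatever",
]

# Count how often each word follows another.
next_counts = defaultdict(Counter)
for line in corpus:
    words = line.split()
    for prev, nxt in zip(words, words[1:]):
        next_counts[prev][nxt] += 1

print(next_counts["write"])  # amorphous: tests / docs / whatever all compete
print(next_counts["run"])    # well set up: "the" is the overwhelmingly likely next step
```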

shazow
@shazow.eth
This is how "attention" works in GPT models: The tokens in the context window get re-weighted to create stronger path dependence around what is relevant. Unlike a traditional Markov Chain, where the relevance of the preceding sequence of tokens is some fixed function, attention effectively highlights/reorders tokens for better results.
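A rough sketch of the contrast, assuming a toy single-head attention computation with random stand-in weights (not the actual GPT implementation): a Markov chain scores the next token from a fixed table keyed on the previous token alone, while attention produces weights over the whole context that depend on the context itself.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
d = 8  # embedding size, arbitrary for the sketch

# Markov chain: relevance of what came before is a fixed, context-independent table.
transition = {"the": {"cat": 0.6, "mat": 0.4}}  # P(next | previous token) only

# Attention: every token in the window is weighted against every other,
# so what counts as "relevant" depends on the whole sequence, not a fixed function.
E = rng.normal(size=(len(vocab), d))   # stand-in token embeddings
W_q = rng.normal(size=(d, d))          # stand-in learned query projection
W_k = rng.normal(size=(d, d))          # stand-in learned key projection

def attention_weights(context_ids):
    X = E[context_ids]                 # (n, d) embedded context
    Q, K = X @ W_q, X @ W_k            # queries and keys
    scores = (Q @ K.T) / np.sqrt(d)    # pairwise relevance scores
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)  # softmax over the context

context = [vocab.index(t) for t in ["the", "cat", "sat", "on"]]
print(transition["the"])               # fixed, regardless of the rest of the context
print(attention_weights(context)[-1])  # how the last position weighs each earlier token
```

Change any token in the context and the printed attention weights shift, which is the "highlighting" described above; the Markov transition row never does.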

Daniel Fernandes
@dfern.eth
Great explanation, I hadn't considered comparing to Markov chains as an intuition pump for why LLMs work.

shazow
@shazow.eth
🙏 I have an affinity for Markov Chains, totally not because Andrey Markov is my namesake lol.