Charlie Harrington
@whatrocks
I know I’m behind on my Karpathy YouTube videos, but I’m thinking about offline “brain-like” memory storage for LLMs to escape token or context-window limits. Each convo can be compressed / distilled and dumped to offline memory with L1/L2/disk-like access. Is this obvious or already happening?
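
A minimal sketch of that tiered idea in Python. Everything here is hypothetical: the class name, the eviction cutoffs, and the placeholder summarize/recall methods stand in for a real distillation model and a real on-disk vector index.

```python
# Hot (L1) turns stay verbatim, warm (L2) memories are distilled summaries,
# and cold memories are demoted to a disk tier fetched by search.
from dataclasses import dataclass, field

@dataclass
class TieredMemory:
    l1_max: int = 8                               # recent turns kept verbatim
    l2_max: int = 32                              # arbitrary cutoff for summaries
    l1: list = field(default_factory=list)        # raw recent turns
    l2: list = field(default_factory=list)        # compressed summaries
    disk: list = field(default_factory=list)      # stand-in for a vector store

    def add_turn(self, turn: str) -> None:
        self.l1.append(turn)
        if len(self.l1) > self.l1_max:
            evicted = self.l1.pop(0)
            self.l2.append(self.summarize(evicted))  # distill on eviction
        if len(self.l2) > self.l2_max:
            self.disk.append(self.l2.pop(0))         # demote to the disk tier

    def summarize(self, text: str) -> str:
        # Placeholder: a real system would call an LLM to compress the turn.
        return text[:100]

    def recall(self, query: str, k: int = 3) -> list:
        # Placeholder retrieval: a real disk tier would use embeddings + ANN search.
        hits = [m for m in self.disk if any(w in m for w in query.split())]
        return self.l1 + self.l2[-k:] + hits[:k]
```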

Brenner
@brenner.eth
RAG, but I think there must be a better way than this

Charlie Harrington
@whatrocks
Interesting. I’d only been thinking about RAG as input, not as storage for output in a loop. Cool