Charlie Harrington
@whatrocks
I know I’m behind on my Karpathy YouTube videos, but I’m thinking about offline, “brain-like” memory storage for LLMs to escape token / context-window limits. Each convo could be compressed / distilled and dumped to offline memory with L1/L2/disk-like access tiers. Is this obvious, or already happening?
1 reply
0 recast
14 reactions
Brenner
@brenner.eth
RAG, but I think there must be a better way than this
1 reply
0 recast
6 reactions
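The tiered-memory idea in the original post could be sketched roughly like this. Everything here is an assumption for illustration: `TieredMemory`, its tier sizes, and the `distill` stand-in (naive first-sentence truncation in place of real LLM summarization) are all hypothetical, not an existing library or Karpathy's design.

```python
from collections import OrderedDict

class TieredMemory:
    """Hypothetical sketch: distill each conversation into a summary,
    then keep summaries in L1/L2/disk-like tiers by recency."""

    def __init__(self, l1_size=2, l2_size=4):
        self.l1_size, self.l2_size = l1_size, l2_size
        self.l1 = OrderedDict()   # hottest summaries (smallest, fastest)
        self.l2 = OrderedDict()   # warm summaries
        self.disk = {}            # cold storage (stand-in for a real store)

    def distill(self, convo: str) -> str:
        # Stand-in for LLM summarization: keep only the first sentence.
        return convo.split(".")[0].strip()

    def store(self, convo_id: str, convo: str) -> None:
        self.l1[convo_id] = self.distill(convo)
        if len(self.l1) > self.l1_size:           # demote oldest L1 -> L2
            k, v = self.l1.popitem(last=False)
            self.l2[k] = v
        if len(self.l2) > self.l2_size:           # demote oldest L2 -> disk
            k, v = self.l2.popitem(last=False)
            self.disk[k] = v

    def recall(self, convo_id: str):
        # Check tiers in order of "access speed", like a cache hierarchy.
        for tier in (self.l1, self.l2, self.disk):
            if convo_id in tier:
                return tier[convo_id]
        return None
```

A real system would pair something like this with semantic retrieval (which is where Brenner's RAG reply comes in) rather than exact-ID lookup.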