Cubacken on Warpcast


𝚐𝔪𝟾𝚡𝚡𝟾

Reaching 1B Context Length With RAG
Zyphra: https://www.zyphra.com/post/reaching-1b-context-length-with-rag

Zyphra's retrieval system enables LLMs to process up to 1 billion tokens efficiently on a standard CPU using a sparse graph-based approach. It outperforms RAG methods that use dense embeddings, as well as long-context transformers. I’m impressed with the work Zyphra has been doing in the SSM space (most recently Zamba2-7B), so I’m eager to see more.

