attention is easy. sustained attention is hard.

creating and destroying; flying cars, jet engines, http://roc.camera, http://july.rocks https://flyingperfect.substack.com

it's like we're all running transformer models. focusing on incoming input is easy for transformers, that's what self-attention is for. keeping attention on key pieces of info over long periods of time is difficult, mostly because of the limited context window, and also because that info has to survive being passed through many stacked attention layers
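a rough sketch of the context window part, in plain python. everything here is made up for illustration (the 2-d vectors, the `attend` helper, the window sizes), and a real transformer has learned projections and many heads that this leaves out. it only shows that a token can't get any attention once it has fallen out of the window:

```python
import math

def softmax(scores):
    # subtract the max for numerical stability before exponentiating
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, context_window):
    """scaled dot-product attention weights of one query over the last
    `context_window` keys. hypothetical helper, not a library api."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    visible = keys[-context_window:]  # anything older is simply gone
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in visible]
    return softmax(scores)

# toy sequence: the first token is the "key piece of info", the rest is filler
important = [1.0, 0.0]
filler = [0.0, 1.0]
keys = [important] + [filler] * 5
query = [1.0, 0.0]  # a query that is looking for the important token

# window covers the whole sequence: the important token gets the largest weight
weights_full = attend(query, keys, context_window=6)

# window is too short: the important token is not even a candidate anymore,
# so attention spreads evenly over the filler
weights_short = attend(query, keys, context_window=4)
```

in the short-window case there is nothing wrong with the attention itself, the thing worth attending to just isn't there anymore.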

makes me think of how LSTMs used to handle sustained attention: they kept a cell state, which acted as a sort of memory and evolved with each time step. RAG feels more like how humans do it, where we remember some specific thing that happened and query our knowledge base (specifically about it)
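to make the contrast concrete, here's a toy version of both ideas in plain python. big assumptions: the cell-state update is a stripped-down scalar one (real LSTM gates also depend on the previous hidden state and use learned weights and biases), and the "knowledge base" is two hand-written vectors standing in for real embeddings. `lstm_cell_step` and `retrieve` are hypothetical names, not from any library:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(c_prev, x, w_f=1.0, w_i=1.0, w_g=1.0):
    """one simplified scalar cell-state update: c_t = f * c_prev + i * g."""
    f = sigmoid(w_f * x)    # forget gate: how much old memory to keep
    i = sigmoid(w_i * x)    # input gate: how much new info to write
    g = math.tanh(w_g * x)  # candidate value to write
    return f * c_prev + i * g

# one strong input followed by nothing: the memory evolves at every step
c = 0.0
states = []
for x in [2.0, 0.0, 0.0, 0.0]:
    c = lstm_cell_step(c, x)
    states.append(c)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, memory):
    """return the stored text whose vector is most similar to the query."""
    return max(memory, key=lambda item: cosine(query_vec, item[0]))[1]

# retrieval-style memory: stored once, looked up on demand
memory = [
    ([1.0, 0.0], "the specific thing that happened"),
    ([0.0, 1.0], "something unrelated"),
]
recalled = retrieve([0.9, 0.1], memory)
```

with these toy weights the cell state shrinks at every quiet step, while the retrieved memory comes back intact no matter how much time has passed, as long as the query lands near it.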

"That's a clever analogy! It's interesting to think about how our brains process information in a similar way to transformer models. Do you think we can 'train' ourselves to improve our sustained attention?"