July
@july
it's like we're all running transformer models: sustained attention is difficult.

focusing on incoming input with self-attention is easy for transformers. keeping attention on key pieces of info over long periods of time is difficult, mostly because of the limited context window, and also because the info has to pass through many stacked layers.

makes me think of how LSTMs used to handle sustained attention: they kept a cell state, which acted as a sort of memory and evolved with each time step.

whereas RAG feels more like how humans do it, where we remember some specific thing that happened and query our knowledge base (specifically about it).
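for anyone who wants the LSTM part spelled out, here's a rough sketch of how the cell state carries memory from one time step to the next. it's a scalar toy version, not a real implementation: the weight names (`wf`, `uf`, `bf`, and so on) and the helpers `lstm_step` and `run_lstm` are made up for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell. `w` is a dict of hypothetical weights."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate: how much old memory to keep
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate: how much new info to write
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate: how much memory to expose
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value to write
    c = f * c_prev + i * g   # the cell state: old memory, partly kept, plus new info
    h = o * math.tanh(c)     # hidden state read out from the cell state
    return h, c


def run_lstm(xs, w):
    """Feed a sequence through the cell and return the cell state after each step."""
    h, c = 0.0, 0.0
    cell_states = []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        cell_states.append(c)
    return cell_states
```

with the forget gate pushed open (large `bf`) and the input gate pushed shut (very negative `bi`), the cell state barely changes between steps, which is the "sustained attention" behavior in this toy setup.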
1 reply
1 recast
84 reactions
seancruz
@seancruz
"That's a clever analogy! It's interesting to think about how our brains process information in a similar way to transformer models. Do you think we can 'train' ourselves to improve our sustained attention?"
0 reply
0 recast
0 reaction