
Luuu pfp
Luuu
@luuu
#dailychallenge Day 10: Transformer and "Attention Is All You Need"
- Probably the most well-known paper among AI papers.
- So, what is a Transformer?
  - A deep learning model introduced by Vaswani et al. in 2017.
  - Uses self-attention and positional encoding to process sequences efficiently.
  - Replaces recurrent models (RNNs, LSTMs) for NLP tasks, enabling parallel processing.
  - Forms the backbone of modern LLMs like GPT and BERT.
- Key ideas from "Attention Is All You Need" (a minimal code sketch follows below):
  - Self-Attention: the model looks at all words simultaneously and figures out how they relate to each other.
  - Multi-Head Attention: the model looks at words in several different ways at once (e.g., meaning, position, importance).
  - Positional Encoding: since all words are processed at once, a small trick helps the model remember word order.
  - Faster and more accurate: unlike older models such as RNNs, Transformers don't have to process words one at a time, so training parallelizes.
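Here is a minimal numpy sketch (not the paper's original code) of the two core ideas above: single-head scaled dot-product self-attention and sinusoidal positional encoding. The sizes and weight names (seq_len, d_model, wq/wk/wv) are illustrative assumptions, not anything from the paper's experiments.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding from 'Attention Is All You Need' (Sec. 3.5)."""
    pos = np.arange(seq_len)[:, None]                 # (seq_len, 1) token positions
    i = np.arange(d_model)[None, :]                   # (1, d_model) dimension indices
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])             # even dims use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])             # odd dims use cosine
    return pe

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)           # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = x @ wq, x @ wk, x @ wv                  # project inputs to queries/keys/values
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                   # every token attends to every token
    return softmax(scores) @ v                        # weighted sum of values

# Toy usage (assumed sizes): 5 tokens, model width 16.
rng = np.random.default_rng(0)
seq_len, d_model = 5, 16
x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
wq, wk, wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
print(out.shape)                                      # (5, 16): one contextualized vector per token
```

Multi-head attention simply runs several such heads in parallel on smaller projections and concatenates the results, which is what lets the model look at words "in several ways at once."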
0 reply
0 recast
3 reactions