Nan Wang's Highlights on 'Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch' | Glasp