Transformers explained | The architecture behind LLMs
AI Coffee Break with Letitia
Lighthearted bite-sized ML videos for your AI Coffee Break! Mostly videos about the latest technical advancements in AI, such as large language models (LLMs), text-to-image models, and everything cool in natural language processing, computer vision, etc. We try to post twice a month! But you know, Letitia has a full-time job, and Ms. Coffee Bean tends to enjoy time off to go out and have fun.
Disclaimer: Opinions expressed are solely my own and do not express the views or opinions of my employer.
Impressum: https://aicoffeebreak.com/impressum.html
Video Description
All you need to know about the transformer architecture: how to structure the inputs, attention (queries, keys, values), positional embeddings, and residual connections. Bonus: an overview of the differences between recurrent neural networks (RNNs) and transformers.

Erratum at 9:19: the order of multiplication should be the opposite: x1 (vector) * Wq (matrix) = q1 (vector). Otherwise we do not get the 1x3 dimensionality at the end. Sorry for messing up the animation! (See the sketch below for a shape check.)

Check this out for a super cool transformer visualisation: https://poloclub.github.io/transformer-explainer/

AI Coffee Break merch: https://aicoffeebreak.creator-spring.com/

Outline:
00:00 Transformers explained
00:47 Text inputs
02:29 Image inputs
03:57 Next word prediction / classification
06:08 The transformer layer: 1. MLP sublayer
06:47 2. Attention explained
07:57 Attention vs. self-attention
08:35 Queries, keys, values
09:19 Erratum: order of multiplication should be the opposite: x1 (vector) * Wq (matrix) = q1 (vector)
11:26 Multi-head attention
13:04 Attention scales quadratically
13:53 Positional embeddings
15:11 Residual connections and normalization layers
17:09 Masked language modelling
17:59 Difference to RNNs

Thanks to our patrons who support us in Tiers 2, 3, and 4: Dres. Trost GbR, Siltax, Vignesh Valliappan, @Mutual_Information, Kshitij

Our old Transformer explained video: https://youtu.be/FWFA4DGuzSc
Tokenization explained: https://youtu.be/D8j1c4NJRfo
Word embeddings: https://youtu.be/YkK5IKgxp-c
Replacing self-attention (playlist): https://www.youtube.com/playlist?list=PLpZBeKTZRGPM8PNRyv6fNMcAW3dMDq_A-
Position embeddings (playlist): https://www.youtube.com/playlist?list=PLpZBeKTZRGPOQtbCIES_0hAvwukcs-y-x
@SerranoAcademy Transformer series: https://www.youtube.com/watch?v=OxCpWwDCDFQ&list=PLs8w1Cdi-zva4fwKkl9EK13siFvL9Wewf

Reference:
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).

----------------------------------------
Optionally, buy us a coffee to help with our Coffee Bean production!
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
----------------------------------------

Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Music: Sunset n Beachz - Ofshane
Video editing: Nils Trost
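To make the erratum concrete, here is a minimal NumPy sketch of single-head self-attention using the corrected row-vector convention, where x1 (1 x d_model) times Wq (d_model x 3) gives q1 (1 x 3). This is not code from the video; the dimensions (4 tokens, d_model = 8, head size 3) and the random weights are illustrative assumptions.

```python
# Minimal single-head self-attention sketch (illustrative, not the video's code).
import numpy as np

n, d_model, d_head = 4, 8, 3             # 4 tokens, model dim 8, head dim 3

rng = np.random.default_rng(0)
X  = rng.normal(size=(n, d_model))       # one row vector x_i per token
Wq = rng.normal(size=(d_model, d_head))  # query projection
Wk = rng.normal(size=(d_model, d_head))  # key projection
Wv = rng.normal(size=(d_model, d_head))  # value projection

# Corrected multiplication order from the erratum:
# x1 (1 x d_model) @ Wq (d_model x 3) -> q1 (1 x 3).
q1 = X[0:1] @ Wq
assert q1.shape == (1, 3)

Q, K, V = X @ Wq, X @ Wk, X @ Wv         # each has shape (n, d_head)

# Scaled dot-product attention; the (n, n) score matrix is why
# attention scales quadratically with sequence length.
scores = Q @ K.T / np.sqrt(d_head)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
out = weights @ V                        # (n, d_head): one output row per token
assert out.shape == (n, d_head)
```

With the column-vector convention used in the animation (Wq @ x1), the shapes would not line up to give the 1x3 query the video describes, which is exactly the point of the correction.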