LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU
Umar Jamil
About
I'm a Machine Learning Engineer from Milan, Italy, teaching complex deep learning and machine learning concepts to my cat, 奥利奥 (Oreo). I also speak Chinese.
Video Description
Full explanation of the LLaMA 1 and LLaMA 2 models from Meta, including Rotary Positional Embeddings, RMS Normalization, Multi-Query Attention, KV-Cache, Grouped Multi-Query Attention (GQA), the SwiGLU activation function, and more! I also review the Transformer concepts needed to understand LLaMA, and everything is explained visually.

As always, the PDF slides are freely available on GitHub: https://github.com/hkproj/pytorch-llama-notes/

Chapters
00:00:00 - Introduction
00:02:20 - Transformer vs LLaMA
00:05:20 - LLaMA 1
00:06:22 - LLaMA 2
00:06:59 - Input Embeddings
00:08:52 - Normalization & RMSNorm
00:24:31 - Rotary Positional Embeddings
00:37:19 - Review of Self-Attention
00:40:22 - KV Cache
00:54:00 - Grouped Multi-Query Attention
01:04:07 - SwiGLU Activation function