747: Technical Intro to Transformers and LLMs — with Kirill Eremenko
Super Data Science: ML & AI Podcast with Jon Krohn
@superdatasciencewithjonkrohn

About
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on Super Data Science, the most listened-to podcast in the industry. In lighthearted conversation with renowned guests, Jon cuts through hype to fuel your professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Video Description
#LLMs #TransformerArchitecture #AttentionMechanism

http://www.superdatascience.com/llmcourse

Attention and transformers in LLMs, the five stages of data processing, and a brand-new Large Language Models A-Z course: Kirill Eremenko joins host @JonKrohnLearns to explore what goes into well-crafted LLMs, what makes Transformers so powerful, and how to succeed as a data scientist in this new age of generative AI.

This episode is brought to you by Intel and HPE Ezmeral Software Solutions (https://bit.ly/hpeyt), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://jonkrohn.com/podcast for sponsorship information.

In this episode you will learn:
• [00:00:00] Introduction
• [00:06:58] Supply and demand in AI recruitment
• [00:14:06] Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z”
• [00:18:14] The learning difficulty in understanding LLMs
• [00:20:28] The basics of LLMs
• [00:34:58] The five building blocks of transformer architecture
• [00:42:38] 1: Input embedding
• [00:49:13] 2: Positional encoding
• [00:52:32] 3: Attention mechanism
• [01:14:44] 4: Feedforward neural network
• [01:17:43] 5: Linear transformation and softmax
• [01:27:39] Inference vs. training time
• [01:47:49] Why transformers are so powerful

Additional materials: https://www.superdatascience.com/747
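The five transformer building blocks listed in the timestamps can be sketched, very roughly, in a few lines of NumPy. This is a toy illustration with random weights and made-up sizes, not code from the episode or course; it only shows how data flows through the five stages in order:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, d_ff, seq_len = 50, 16, 32, 6  # toy sizes, chosen for illustration

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# 1. Input embedding: map each token id to a d_model-dimensional vector.
embedding = rng.normal(size=(vocab_size, d_model))
tokens = rng.integers(0, vocab_size, size=seq_len)
x = embedding[tokens]                                # (seq_len, d_model)

# 2. Positional encoding: add sinusoids so word-order information survives.
pos = np.arange(seq_len)[:, None]
i = np.arange(d_model)[None, :]
angle = pos / 10000 ** (2 * (i // 2) / d_model)
x = x + np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

# 3. Attention mechanism: every position attends to every other position
# via scaled dot-product attention (single head here, for brevity).
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = softmax(q @ k.T / np.sqrt(d_model))         # (seq_len, seq_len)
x = scores @ v

# 4. Feedforward neural network: applied independently at each position.
W1 = rng.normal(size=(d_model, d_ff))
W2 = rng.normal(size=(d_ff, d_model))
x = np.maximum(x @ W1, 0) @ W2                       # ReLU in between

# 5. Linear transformation and softmax: project each position back to a
# probability distribution over the vocabulary (next-token prediction).
Wout = rng.normal(size=(d_model, vocab_size))
probs = softmax(x @ Wout)                            # (seq_len, vocab_size)
```

A real transformer stacks the attention and feedforward steps many times, with multiple heads, residual connections, and layer normalization; the episode walks through each of these stages in depth.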