Reinforcement Learning (RL) for LLMs
Natasha Jaques
About
I'm an assistant professor at the University of Washington and a Staff Research Scientist at Google DeepMind. I give talks about my research on AI, machine learning, and reinforcement learning. If you want to learn more, check out my website https://natashajaques.ai/.
Video Description
Lecture on reinforcement learning (RL) fine-tuning of large language models (LLMs). We are in the RL era of LLM training, but that era didn't start with DeepSeek R1, or even ChatGPT. The talk takes a deep dive through the history of RL training of LLMs, including my own early work on RL from human feedback (RLHF). Then we discuss more recent techniques for personalized RLHF, and the future of RL for LLMs, including using multi-agent RL for adversarial red-teaming.