Reinforcement Learning (RL) for LLMs

Natasha Jaques • March 12, 2025

About

I'm an assistant professor at the University of Washington and a Staff Research Scientist at Google DeepMind. I give talks about my research on AI, machine learning, and reinforcement learning. If you want to learn more, check out my website https://natashajaques.ai/.

Latest Posts

5 - Deep Multi agent RL

4 - Learning from humans beyond LLMs

2 - Deep RL and RL post-training intro

Video Description

A lecture on reinforcement learning (RL) fine-tuning of large language models (LLMs). We may be in the RL era of LLM training, but RL fine-tuning did not start with DeepSeek R1, or even ChatGPT. The talk takes a deep dive through the history of RL training of LLMs, including my own early work on RL from human feedback (RLHF). We then discuss more recent techniques for personalized RLHF, and the future of RL for LLMs, including the use of multi-agent RL for adversarial red-teaming.
