Building makemore Part 5: Building a WaveNet
Andrej Karpathy
@andrejkarpathy
Video Description
We take the 2-layer MLP from the previous video and make it deeper with a tree-like structure, arriving at a convolutional neural network architecture similar to the WaveNet (2016) from DeepMind. In the WaveNet paper, the same hierarchical architecture is implemented more efficiently using causal dilated convolutions (not yet covered). Along the way we get a better sense of torch.nn: what it is, how it works under the hood, and what a typical deep learning development process looks like (a lot of reading of documentation, keeping track of multidimensional tensor shapes, moving between jupyter notebooks and repository code, ...).

Links:
- makemore on github: https://github.com/karpathy/makemore
- jupyter notebook I built in this video: https://github.com/karpathy/nn-zero-to-hero/blob/master/lectures/makemore/makemore_part5_cnn1.ipynb
- colab notebook: https://colab.research.google.com/drive/1CXVEmCO_7r7WYZGb5qnjfyxTvQa13g5X?usp=sharing
- my website: https://karpathy.ai
- my twitter: https://twitter.com/karpathy
- our Discord channel: https://discord.gg/3zy8kqD9Cp

Supplementary links:
- WaveNet 2016 from DeepMind https://arxiv.org/abs/1609.03499
- Bengio et al. 2003 MLP LM https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf

Chapters:
intro
00:00:00 intro
00:01:40 starter code walkthrough
00:06:56 let’s fix the learning rate plot
00:09:16 pytorchifying our code: layers, containers, torch.nn, fun bugs
implementing wavenet
00:17:11 overview: WaveNet
00:19:33 dataset bump the context size to 8
00:19:55 re-running baseline code on block_size 8
00:21:36 implementing WaveNet
00:37:41 training the WaveNet: first pass
00:38:50 fixing batchnorm1d bug
00:45:21 re-training WaveNet with bug fix
00:46:07 scaling up our WaveNet
conclusions
00:46:58 experimental harness
00:47:44 WaveNet but with “dilated causal convolutions”
00:51:34 torch.nn
00:52:28 the development process of building deep neural nets
00:54:17 going forward
00:55:26 improve on my loss! how far can we improve a WaveNet on this data?
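
The tree-like structure mentioned in the description can be illustrated with a small, dependency-free sketch. It assumes a context of 8 characters (the block_size used in the chapters) and a made-up 2-dimensional embedding per character; the helper name `fuse_consecutive` is hypothetical and is not taken from the notebook. The notebook works with PyTorch tensors and puts learned layers between the fusing steps, so there the feature width is set by the hidden size rather than doubling at every level as it does here.

```python
def fuse_consecutive(vectors, n):
    """Concatenate every n consecutive feature vectors into one longer vector."""
    if len(vectors) % n != 0:
        raise ValueError("sequence length must be divisible by n")
    fused = []
    for start in range(0, len(vectors), n):
        group = vectors[start:start + n]
        # Flatten the group of n vectors into a single vector.
        fused.append([x for vec in group for x in vec])
    return fused


# Assumption: 8 context characters, each with a toy 2-dimensional embedding.
context = [[float(i), float(i) + 0.5] for i in range(8)]

# Fuse pairs repeatedly: 8 positions -> 4 -> 2 -> 1, like the levels of a tree.
level = context
shapes = [(len(level), len(level[0]))]
while len(level) > 1:
    level = fuse_consecutive(level, 2)
    shapes.append((len(level), len(level[0])))

print(shapes)  # [(8, 2), (4, 4), (2, 8), (1, 16)]
```

Each level halves the number of positions, so information from neighbouring characters is merged gradually instead of being squashed into a single layer all at once.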










