Fine-Tuning Text Embeddings For Domain-specific Search (w/ Python)

Shaw Talebi • January 22, 2025


Video Description

📈 Transform Your Business with AI: https://aibuilder.academy/yt/hOLBrIjRAj4
🤓 Get the (free) Claude Code Course: https://aibuilder.academy/courses/yt/hOLBrIjRAj4

In this video, I walk through how to fine-tune a text embedding model for domain adaptation using the Sentence Transformers Python library.

Resources:
📰 Blog: https://shawhin.medium.com/fine-tuning-text-embeddings-f913b882b11c?source=friends_link&sk=41468a7c4b3c40d7edb714489889e028
💻 GitHub Repo: https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/fine-tuning-embeddings
🤗 Model: https://huggingface.co/shawhin/distilroberta-ai-job-embeddings
💿 Dataset: https://huggingface.co/datasets/shawhin/ai-job-embedding-finetuning

References:
[1] https://youtu.be/Ylz779Op9Pw
[2] https://youtu.be/sNa_uiqSlJo
[3] https://youtu.be/4QHg8Ix8WWQ
[4] https://sbert.net/docs/sentence_transformer/training_overview.html
[5] https://sbert.net/docs/sentence_transformer/training_overview.html#best-base-embedding-models
[6] https://sbert.net/docs/sentence_transformer/pretrained_models.html#semantic-search-models
[7] https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss

Chapters:
Intro - 0:00
RAG - 0:48
Problem with Vector Search - 2:25
Fine-tuning - 3:49
Why fine-tune? - 4:43
5 Steps for Fine-tuning Embeddings - 6:23
Example: Fine-tuning Embeddings on AI Jobs - 6:55
Step 1: Gather Positive (and Negative) Pairs - 7:53
Step 2: Pick a Pre-trained Model - 12:50
Step 3: Pick a Loss Function - 14:18
Step 4: Fine-tune the Model - 15:57
Step 5: Evaluate the Model - 18:00
What's Next? - 19:13
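The loss function referenced in [7], MultipleNegativesRankingLoss, scores each (query, positive) pair in a batch against every other positive as an in-batch negative, then applies cross-entropy over scaled cosine similarities so the true pair on the diagonal wins. A minimal NumPy sketch of that objective (toy 2-D embeddings for illustration, not the actual model's output):

```python
import numpy as np

def mnr_loss(query_emb, pos_emb, scale=20.0):
    """In-batch-negatives ranking loss: each query's own positive must
    out-score every other positive in the batch."""
    # L2-normalize so dot products become cosine similarities
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pos_emb / np.linalg.norm(pos_emb, axis=1, keepdims=True)
    sims = scale * (q @ p.T)  # (batch, batch) similarity matrix
    # Cross-entropy with the diagonal (the matched pair) as the target class
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: each query embedding is closest to its own positive,
# so the loss is near zero; mismatched pairs would score much higher.
queries = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[0.9, 0.1], [0.1, 0.9]])
print(mnr_loss(queries, positives))
```

In the video's workflow, this loss is supplied to the Sentence Transformers trainer rather than computed by hand; the sketch only illustrates why a dataset of (query, positive) pairs, as gathered in Step 1, is sufficient: negatives come for free from the rest of the batch.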