The Best Open-source OCR model | AI & ML Monthly
Video Description
Quick link: Title refers to the dots.ocr OCR model - https://huggingface.co/rednote-hilab/dots.ocr

Welcome to machine learning & AI monthly for July 2025. This is the video version of the newsletter I write every month, which covers the latest and greatest (but not always the latest) in the world of AI and ML.

Read the issues online:
- AI/ML Monthly July 2025 (this video) - https://zerotomastery.io/blog/ai-and-machine-learning-monthly-newsletter-july-2025/
- AI/ML Monthly June 2025 - https://zerotomastery.io/blog/ai-and-machine-learning-monthly-newsletter-june-2025/
- AI/ML Monthly May 2025 - https://zerotomastery.io/blog/ai-and-machine-learning-monthly-newsletter-may-2025/

My links:
Download Nutrify (my startup) - https://nutrify.app
Download KeepTrack (my other startup) - https://keeptrack.app
Personal website - https://www.mrdbourke.com
My ML blog - https://learnml.io
Read my novel Charlie Walks - https://www.charliewalks.com

Courses I teach:
Learn Hugging Face - https://learnhuggingface.com
Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse
Learn TensorFlow - https://dbourke.link/ZTMTFcourse
Learn PyTorch - https://dbourke.link/ZTMPyTorch

Timestamps:
0:00 - Intro
0:16 - My work: ZTM Object Detection with Hugging Face - https://www.learnhuggingface.com/notebooks/hugging_face_object_detection_tutorial
2:20 - From the Internet
2:21 - My favourite use case for AI is writing logs (Vicki Boykis) - https://newsletter.vickiboykis.com/archive/my-favorite-use-case-for-ai-is-writing-logs/
11:40 - Cloudflare helps creators block AI scrapers - https://blog.cloudflare.com/content-independence-day-no-ai-crawl-without-compensation/
15:03 - Google DeepMind embeds the entire Earth (AlphaEarth Foundations) - https://deepmind.google/discover/blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/
22:05 - Apple: How they built FastVLM - https://machinelearning.apple.com/research/fast-vision-language-models
26:48 - Google’s SensorLM (60M hours of sensor data) - https://research.google/blog/sensorlm-learning-the-language-of-wearable-sensors/
32:29 - Daniel’s Open-Source AI of the Month
32:30 - Mistral releases Voxtral (ASR/translation/understanding) - https://mistral.ai/news/voxtral
40:35 - Allen AI’s FlexOlmo (collaborative MoE training) - https://allenai.org/blog/flexolmo
42:50 - Franca (open data/code/weights vision backbones) - https://github.com/valeoai/Franca
49:00 - Hugging Face SmolLM3 + full training recipe - https://huggingface.co/HuggingFaceTB/SmolLM3-3B
53:30 - MedGemma goes multimodal (text+image) - https://research.google/blog/medgemma-our-most-capable-open-models-for-health-ai-development/
1:00:03 - Roboflow upgrades RF-DETR (real-time detection) - https://github.com/roboflow/rf-detr
1:03:55 - MM-GroundingDINO on Hugging Face (zero-shot OD) - https://huggingface.co/openmmlab-community/mm_grounding_dino_large_all
1:06:40 - Meta Perception Encoders: new variants - https://github.com/facebookresearch/perception_models
1:08:50 - Apple: pixel-level fallback to expand LLM vocab - https://machinelearning.apple.com/research/overcoming-vocabulary-constraints
1:15:10 - Z.ai GLM-4.5 / GLM-4.1V releases - https://z.ai/blog/glm-4.5
1:18:50 - Qwen3 updates + 480B coding model - https://qwenlm.github.io/blog/qwen3-coder/
1:21:20 - dots.ocr (small, mighty OCR VLM) - https://huggingface.co/rednote-hilab/dots.ocr
1:26:30 - Releases
1:27:10 - Google’s genai-processors library - https://github.com/google-gemini/genai-processors
1:28:45 - ChatGPT Study Mode - https://openai.com/index/chatgpt-study-mode/
1:31:29 - Videos
1:31:30 - John Carmack talk (Keen Technologies research) - https://youtu.be/iz9lUMSQBfY
1:34:10 - François Chollet on getting to AGI (ARC v1/v2) - https://youtu.be/5QcCeSsNRks
1:36:00 - Outro