DeepSeek R1 0528 Qwen3 8B - Small Upgraded Student Model - Install and Test Locally
AI Summary
In this video, the presenter discusses the recent minor upgrade to the DeepSeek R1 model (version 0528), focusing on its distilled variant, DeepSeek R1 0528 Qwen3 8B, which has 8 billion parameters. The model is produced by knowledge distillation: reasoning capabilities are transferred from the larger R1 model to a smaller, more efficient student model. The presenter walks through a local installation on an Ubuntu system with an Nvidia RTX A6000 GPU, then tests the model on logical reasoning and programming tasks to showcase its capabilities. The presenter also highlights the architectural improvements over previous models, discusses applications in math and reasoning tasks, and contrasts the distilled model's reasoning approach with that of traditional large language models, emphasizing its analytical capabilities. The video concludes by inviting viewers to subscribe and to visit the video description for more resources and offers.
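
To make the distillation idea concrete, here is a minimal sketch of one common way a student model can learn from a teacher: the teacher's reasoning traces and answers are collected and turned into supervised fine-tuning examples for the student. This is an illustration under stated assumptions, not the actual DeepSeek pipeline. The `TeacherSample` class, the helper names, and the `<think>` tag layout are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class TeacherSample:
    """One output collected from the larger teacher model (hypothetical structure)."""
    prompt: str
    reasoning: str  # chain of thought written by the teacher
    answer: str     # final answer written by the teacher


def to_sft_example(sample: TeacherSample) -> dict:
    """Turn one teacher output into a prompt/completion pair for the student.

    The <think> wrapper is an assumed format for separating the reasoning
    from the final answer; a real pipeline would use the student's own
    chat template.
    """
    completion = (
        f"<think>\n{sample.reasoning.strip()}\n</think>\n\n{sample.answer.strip()}"
    )
    return {"prompt": sample.prompt.strip(), "completion": completion}


def build_dataset(samples: list[TeacherSample]) -> list[dict]:
    """Keep only samples where the teacher produced both reasoning and an answer."""
    return [
        to_sft_example(s)
        for s in samples
        if s.reasoning.strip() and s.answer.strip()
    ]


if __name__ == "__main__":
    samples = [
        TeacherSample("What is 2+2?", "2 plus 2 equals 4.", "4"),
        TeacherSample("Incomplete sample", "", "x"),  # dropped: no reasoning
    ]
    for example in build_dataset(samples):
        print(example)
```

The student would then be fine-tuned on these pairs with an ordinary next-token objective, so it learns to reproduce the teacher's style of step-by-step reasoning at a much smaller parameter count.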