Qwen 3 Is Here… But Where’s DeepSeek-R2?!



AI Summary

Overview of Qwen 3

  • Release Context: Recently launched by Alibaba, and widely anticipated alongside the still-unreleased DeepSeek-R2.
  • Model Name: Qwen3-235B-A22B, indicating:
    • 235 billion total parameters (large, but not the largest)
    • 22 billion active parameters (the subset activated per token; see the sketch below this list).
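To make the total-vs-active distinction concrete, here is a minimal mixture-of-experts routing layer in PyTorch. This is an illustrative toy, not Qwen 3's actual architecture; the layer name, expert count, and dimensions are all assumptions.

```python
# Toy mixture-of-experts (MoE) layer: a router picks the top-k experts per
# token, so only a fraction of the total parameters run on any given input.
# All sizes here are illustrative, not Qwen3-235B-A22B's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)   # scores each expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                             # x: (tokens, d_model)
        scores, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(scores, dim=-1)           # mix the chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts execute: this is why a 235B-parameter
        # MoE model can have only ~22B "active" parameters per token.
        for k in range(self.top_k):
            for e in idx[:, k].unique():
                mask = idx[:, k] == e
                out[mask] += weights[mask, k:k + 1] * self.experts[int(e)](x[mask])
        return out

x = torch.randn(10, 64)            # 10 tokens of width 64
print(ToyMoELayer()(x).shape)      # torch.Size([10, 64])
```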

Key Features

  • Model Architecture: Mixture-of-experts (MoE) model for efficiency, activating only a small set of specialized experts for each task.
  • Performance Benchmarks:
    • 95% on Arena Hard, outperforming DeepSeek-R1 and OpenAI's o1 and o3-mini.
    • 85% and 81% on the AIME 2024 and AIME 2025 math benchmarks, just behind Gemini 2.5 Pro.
    • Outperformed Gemini 2.5 Pro on coding benchmarks (Codeforces).
  • Additional Versions: Includes a dense Qwen3-32B model and other variants ranging from 0.6 billion to 30 billion parameters.
  • Open Source Availability: Open weights broaden access for researchers and startups, with implications for global AI innovation.
  • Hybrid Thinking: A single model supports both reasoning and non-reasoning modes, boosting performance on math and coding (see the usage sketch after this list).
  • Expanded Dataset: Pre-trained on roughly 36 trillion tokens (double the 18 trillion used for Qwen 2.5), covering 119 languages.
  • Agentic Capabilities: Optimized for agentic coding workflows, with improved support for Anthropic's MCP (Model Context Protocol); a sketch of an MCP-style tool call appears after the thinking-mode example below.
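The hybrid thinking mode is exposed as a chat-template flag in the Hugging Face transformers integration. The sketch below follows the usage shown in the Qwen3 model cards; the `enable_thinking` flag and the `Qwen/Qwen3-0.6B` checkpoint name should be verified against the current documentation.

```python
# Toggling Qwen3's hybrid thinking mode via Hugging Face transformers.
# Based on the Qwen3 model-card usage; flag names may change over time.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B"      # smallest open-weight variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "What is 17 * 24?"}]

# enable_thinking=True applies the reasoning chat template, so the model
# emits a chain-of-thought block before its final answer; False skips it.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:]))
```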
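As for MCP: it is Anthropic's open JSON-RPC 2.0 protocol that standardizes how a model-facing client asks an external server to run a named tool. A minimal sketch of such a request follows; the `get_weather` tool and its arguments are hypothetical.

```python
# Minimal sketch of an MCP (Model Context Protocol) tool-call request.
# "tools/call" is the standard MCP method; the tool itself is made up.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",                 # hypothetical server tool
        "arguments": {"city": "Beijing"},
    },
}
print(json.dumps(request, indent=2))
```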

Conclusion

  • Qwen 3 is a formidable addition to the AI landscape, competing closely with existing models like Gemini 2.5 Pro, and DeepSeek's upcoming R2 release promises further developments.