Introducing the Qwen 3 Family



AI Summary

The Quen team has released Quen 3, a comprehensive family of models including:

  • Mixture of experts models: 235 billion parameters (22 billion active) and 30 billion parameters (3 billion active).
  • Dense models ranging from 6 billion to 32 billion parameters.

Key Features:

  • Hybrid Models: Supports adjustable thinking reasons, enhancing performance based on token budget for reasoning.
  • Multilingual Support: Offers 119 languages and dialects, accommodating diverse linguistic communities.
  • Enhanced Tool Use: Improved capability for models to call various tools, increasing their functional versatility.
  • Training Improvements: The training tokens have doubled to approximately 36 trillion, incorporating extensive multilingual training and synthetic data for specific tasks.

Training Process:

  1. Pre-training with 30 trillion tokens on general web data.
  2. Incorporation of knowledge-intensive data (e.g., STEM) with synthetic data.
  3. Supervised fine-tuning and reinforcement learning for varied reasoning abilities.

Testing and Access:

  • Available for testing at chat.quen.ai. Users can experiment with different models and settings.
  • Further updates and features expected in future releases.