Mistral 7B
Overview
Mistral 7B v0.1 is a 7.3-billion-parameter language model released by Mistral AI in September 2023. It was the company's first model and demonstrated that smaller, efficiently designed models can outperform much larger competitors. It is released under the Apache 2.0 license, which permits free and unrestricted use.
Key Specifications
- Parameters: 7.3 billion
- Context Window: 8,192 tokens
- License: Apache 2.0
- Release Date: September 27, 2023
- Architecture Type: Decoder-only Transformer
Architecture
Key Technical Parameters
- Model Dimension: 4,096
- Layers: 32
- Attention Heads: 32
- Key-Value Heads: 8
- Head Dimension: 128
- Hidden (Feed-Forward) Dimension: 14,336
- Vocabulary Size: 32,000
- Sliding Window Size: 4,096
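These hyperparameters map directly onto a model configuration. The sketch below records them as a Python dataclass; the class and field names are illustrative and do not correspond to any particular library's config keys.

```python
from dataclasses import dataclass

@dataclass
class Mistral7BConfig:
    """Mistral 7B v0.1 hyperparameters (illustrative field names)."""
    dim: int = 4096          # model (embedding) dimension
    n_layers: int = 32       # decoder layers
    n_heads: int = 32        # query attention heads
    n_kv_heads: int = 8      # key-value heads (grouped-query attention)
    head_dim: int = 128      # per-head dimension
    hidden_dim: int = 14336  # feed-forward inner dimension
    vocab_size: int = 32000
    window_size: int = 4096  # sliding-window attention span per layer
    context_len: int = 8192  # trained context length

cfg = Mistral7BConfig()
assert cfg.dim == cfg.n_heads * cfg.head_dim  # 32 heads x 128 dims = 4,096
```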
Innovative Features
- Grouped-Query Attention (GQA): Enables faster inference and a smaller key-value cache by using 8 key-value heads instead of 32
- Sliding Window Attention (SWA): Handles sequences of arbitrary length at reduced inference cost
- Each layer attends to the previous 4,096 hidden states
- Stacking layers extends the effective reach, giving a theoretical attention span of approximately 131K tokens (32 layers × 4,096 = 131,072) at the last layer; see the sketch after this list
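The following is a minimal, single-layer sketch of these two mechanisms in PyTorch (an assumed dependency, not the reference implementation): each key-value head is shared by a group of query heads, and the attention mask keeps only the previous `window` positions. Rotary position embeddings, batching, and the rolling KV cache used in practice are omitted.

```python
import torch
import torch.nn.functional as F

def sliding_window_gqa(q, k, v, n_kv_heads=8, window=4096):
    """Toy single-layer attention combining GQA and sliding-window masking.

    q: (seq, n_heads, head_dim); k, v: (seq, n_kv_heads, head_dim).
    """
    seq, n_heads, head_dim = q.shape
    group = n_heads // n_kv_heads                      # queries per KV head (32 / 8 = 4)
    k = k.repeat_interleave(group, dim=1)              # share each KV head across its group
    v = v.repeat_interleave(group, dim=1)

    scores = torch.einsum("qhd,khd->hqk", q, k) / head_dim ** 0.5
    pos = torch.arange(seq)
    causal = pos[None, :] <= pos[:, None]              # no attention to future tokens
    in_window = pos[:, None] - pos[None, :] < window   # only the previous `window` tokens
    scores = scores.masked_fill(~(causal & in_window), float("-inf"))
    return torch.einsum("hqk,khd->qhd", F.softmax(scores, dim=-1), v)

# Toy shapes matching the architecture table, with a tiny window for illustration.
q = torch.randn(16, 32, 128)
k = torch.randn(16, 8, 128)
v = torch.randn(16, 8, 128)
out = sliding_window_gqa(q, k, v, n_kv_heads=8, window=4)
print(out.shape)  # torch.Size([16, 32, 128])
```

Because only 8 of the 32 heads store keys and values, the KV cache is roughly 4× smaller than with full multi-head attention, which is the main source of the memory savings during inference.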
Performance
Benchmark Comparison
vs Llama 2 13B: Mistral 7B outperforms it across all evaluated benchmarks despite having roughly half as many parameters
vs Llama 1 34B: Surpasses it on most benchmarks, particularly in:
- Reasoning tasks
- Mathematics
- Code generation
Specific Capabilities
- Commonsense Reasoning: Excellent performance on HellaSwag, Winogrande, PIQA, SIQA, OpenbookQA, ARC-Easy, ARC-Challenge, CommonsenseQA
- World Knowledge: Competitive on NaturalQuestions and TriviaQA
- Reading Comprehension: Superior performance on BoolQ and QuAC
- Mathematics & Code: Strong performance despite smaller parameter count
Instruction-Following Variant
Mistral 7B Instruct: Fine-tuned version for instruction-following
- Outperforms all 7B models on MT-Bench
- Comparable to 13B chat models
- Optimized for conversational interactions; the expected prompt format is sketched below
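As a hedged illustration (assuming the Hugging Face `transformers` library and the `mistralai/Mistral-7B-Instruct-v0.1` checkpoint), the tokenizer's chat template wraps user turns in the `[INST] ... [/INST]` format that the Instruct fine-tune expects:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
messages = [{"role": "user", "content": "Summarize sliding window attention in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Decoding shows the raw prompt, roughly "<s>[INST] Summarize ... [/INST]"
print(tokenizer.decode(input_ids[0]))
```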
Deployment
Integration Options
- Cloud Platforms: AWS, GCP, Azure via vLLM inference server and SkyPilot
- Local Deployment: Reference implementation for local hosting
- Hugging Face: Streamlined integration with the Hugging Face ecosystem (loading sketch after this list)
- Ollama: Available for local deployment
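As an example of the Hugging Face path, the sketch below loads the base model for local inference. It assumes the `transformers` and `accelerate` packages and a GPU with roughly 16 GB of memory for fp16 weights; these details are assumptions, not part of the official announcement.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # official Hugging Face repository
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to fit on a single GPU
    device_map="auto",           # requires the `accelerate` package
)

inputs = tokenizer("Mistral 7B is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```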
Significance
Mistral 7B demonstrated that:
- Smaller, well-designed models can compete with much larger ones
- Open-source models can achieve state-of-the-art performance
- Efficient architectures (GQA, SWA) enable practical deployment
- European AI companies can compete with American tech giants
The model’s success established Mistral AI as a major player in the LLM space and paved the way for subsequent releases like Mixtral 8x7B.
Resources
- Official Announcement: https://mistral.ai/news/announcing-mistral-7b
- Paper: Mistral 7B (arXiv:2310.06825)
- Developer: Mistral AI
- Related Models: Mixtral 8x7B, Mistral Large