Mistral 7B

Overview

Mistral 7B v0.1 is a 7.3-billion-parameter language model released by Mistral AI in September 2023. It was the company's first model and demonstrated that smaller, efficiently designed models can outperform much larger competitors. It was released under the Apache 2.0 license, which permits free, unrestricted use, modification, and redistribution.

Key Specifications

  • Parameters: 7.3 billion
  • Context Window: 8,192 tokens
  • License: Apache 2.0
  • Release Date: September 27, 2023
  • Architecture Type: Decoder-only Transformer

Architecture

Key Technical Parameters

  • Dimensions: 4,096
  • Layers: 32
  • Attention Heads: 32
  • Key-Value Heads: 8
  • Head Dimension: 128
  • Hidden Dimension: 14,336
  • Vocabulary Size: 32,000
  • Window Size: 4,096
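
For reference, these hyperparameters roughly account for the headline 7.3B figure. Below is a minimal sketch in Python; the field names are illustrative (not the official params.json or Hugging Face schema), and the count assumes a Llama-style SwiGLU feed-forward with untied input/output embeddings, ignoring norms and biases.

    from dataclasses import dataclass

    @dataclass
    class MistralConfig:
        # Values from the list above; field names are illustrative.
        dim: int = 4096          # model (embedding) dimension
        n_layers: int = 32       # transformer blocks
        n_heads: int = 32        # query heads
        n_kv_heads: int = 8      # key/value heads (grouped-query attention)
        head_dim: int = 128      # dim // n_heads
        hidden_dim: int = 14336  # feed-forward inner dimension
        vocab_size: int = 32000
        window_size: int = 4096  # sliding-window attention span per layer

    def approx_param_count(c: MistralConfig) -> int:
        """Rough parameter count; ignores norms, biases, and rotary embeddings."""
        attn = c.dim * c.n_heads * c.head_dim            # Wq
        attn += 2 * c.dim * c.n_kv_heads * c.head_dim    # Wk, Wv (only 8 heads each)
        attn += c.n_heads * c.head_dim * c.dim           # Wo
        mlp = 3 * c.dim * c.hidden_dim                   # SwiGLU: gate, up, down projections
        embeddings = 2 * c.vocab_size * c.dim            # input embeddings + LM head (untied)
        return c.n_layers * (attn + mlp) + embeddings

    print(approx_param_count(MistralConfig()))  # ~7.24 billion, commonly quoted as 7.3B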

Innovative Features

  1. Grouped-Query Attention (GQA): Speeds up inference and shrinks the KV cache by sharing 8 key-value heads across the 32 query heads (see the first sketch after this list)
  2. Sliding Window Attention (SWA): Handles longer sequences at reduced inference cost by restricting each layer's attention to a fixed window (see the second sketch after this list)
    • Each layer attends to the previous 4,096 hidden states
    • Theoretical attention span of approximately 131K tokens at the last layer (window size × number of layers)
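
The effect of GQA can be seen in a toy example: only 8 key/value heads are computed and cached, and each is then shared by a group of 4 query heads. This is an illustrative sketch (random tensors, no causal mask), not the model's actual implementation.

    import torch

    n_heads, n_kv_heads, head_dim, seq = 32, 8, 128, 16
    group = n_heads // n_kv_heads                 # 4 query heads share each KV head

    q = torch.randn(seq, n_heads, head_dim)
    k = torch.randn(seq, n_kv_heads, head_dim)    # only 8 KV heads are computed/cached
    v = torch.randn(seq, n_kv_heads, head_dim)

    # Expand the 8 KV heads to line up with the 32 query heads.
    k = k.repeat_interleave(group, dim=1)         # (seq, 32, head_dim)
    v = v.repeat_interleave(group, dim=1)

    scores = torch.einsum("qhd,khd->hqk", q, k) / head_dim ** 0.5
    out = torch.einsum("hqk,khd->qhd", scores.softmax(dim=-1), v)
    print(out.shape)                              # torch.Size([16, 32, 128])

Because the KV cache stores only the 8 original key/value heads, its memory footprint during decoding is a quarter of what full multi-head attention would require.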

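Sliding window attention can likewise be illustrated with a small mask, and the ~131K figure is simply the window size multiplied by the number of layers. The sketch below uses toy sizes; the real model uses a 4,096-token window across 32 layers.

    import torch

    seq, window = 8, 4                    # toy sizes for readability
    i = torch.arange(seq).unsqueeze(1)    # query positions
    j = torch.arange(seq).unsqueeze(0)    # key positions
    mask = (j <= i) & (j > i - window)    # causal AND within the last `window` positions
    print(mask.int())

    # Information still crosses window boundaries as it moves up the stack,
    # so the theoretical span at the last layer is roughly
    # window_size * n_layers = 4,096 * 32 = 131,072 tokens (~131K).
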
Performance

Benchmark Comparison

vs LLaMA 2 13B: Mistral 7B outperforms it across all evaluated benchmarks despite being almost half the size

vs LLaMA 1 34B: Surpasses it on most benchmarks, particularly in:

  • Reasoning tasks
  • Mathematics
  • Code generation

Specific Capabilities

  • Commonsense Reasoning: Excellent performance on HellaSwag, Winogrande, PIQA, SIQA, OpenbookQA, ARC-Easy, ARC-Challenge, CommonsenseQA
  • World Knowledge: Competitive on NaturalQuestions and TriviaQA
  • Reading Comprehension: Superior performance on BoolQ and QuAC
  • Mathematics & Code: Strong performance despite smaller parameter count

Instruction-Following Variant

Mistral 7B Instruct: A version of the base model fine-tuned for instruction following (a usage sketch follows the list below)

  • Outperforms all 7B models on MT-Bench
  • Comparable to 13B chat models
  • Optimized for conversational interactions
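
A minimal sketch of querying the Instruct variant through Hugging Face transformers is shown below. It assumes a recent transformers release (for apply_chat_template), the accelerate package for device_map="auto", and hardware with enough memory for the weights; the model id is the public v0.1 Instruct checkpoint.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-Instruct-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # The chat template wraps the user turn in Mistral's [INST] ... [/INST] format.
    messages = [{"role": "user", "content": "Explain sliding window attention briefly."}]
    inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

    output = model.generate(inputs, max_new_tokens=200)
    print(tokenizer.decode(output[0], skip_special_tokens=True))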

Deployment

Integration Options

  • Cloud Platforms: AWS, GCP, and Azure via the vLLM inference server and SkyPilot (see the vLLM sketch after this list)
  • Local Deployment: Reference implementation for local hosting
  • Hugging Face: Streamlined integration with Hugging Face ecosystem
  • Ollama: Available for local deployment
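
As one example of the vLLM route mentioned above, the base model can be run locally with vLLM's offline Python API. This is a hedged sketch: it assumes vLLM is installed with a GPU large enough for the half-precision weights, and exact argument names may vary between vLLM versions.

    from vllm import LLM, SamplingParams

    llm = LLM(model="mistralai/Mistral-7B-v0.1")
    params = SamplingParams(temperature=0.7, max_tokens=128)

    for out in llm.generate(["The Eiffel Tower is located in"], params):
        print(out.outputs[0].text)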

Significance

Mistral 7B demonstrated that:

  • Smaller, well-designed models can compete with much larger ones
  • Open-source models can achieve state-of-the-art performance
  • Efficient architectures (GQA, SWA) enable practical deployment
  • European AI companies can compete with American tech giants

The model’s success established Mistral AI as a major player in the LLM space and paved the way for subsequent releases like Mixtral 8x7B.
