Meta’s Llama 4 is a beast (includes 10 million token context)
AI Summary
Llama 4 Overview
- Meta released Llama 4, its new open-source large language model family, in three sizes: Scout, Maverick, and Behemoth.
- It features a dramatic jump in context window size: Scout supports a 10 million token context window, compared with the current common standard of 128,000 tokens.
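A quick back-of-the-envelope comparison puts that jump in perspective. The words-per-token ratio below (~0.75) is a common rough rule of thumb, not a Meta figure:

```python
# Rough comparison of Scout's context window with the common
# 128K-token standard.
scout_ctx = 10_000_000      # tokens (Scout)
typical_ctx = 128_000       # tokens (common current standard)

ratio = scout_ctx / typical_ctx
approx_words = int(scout_ctx * 0.75)  # ~0.75 words/token is a rough heuristic

print(f"Scout holds ~{ratio:.0f}x more context (~{approx_words:,} words)")
```

That is roughly 78 times more context than a 128K-token model, on the order of several novels' worth of text in a single prompt.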
Model Details
- Llama 4 Scout
- Size: Small
- 17 billion active parameters (16 experts), 109 billion total parameters
- Multimodal capabilities (text and images)
- Efficient resource use: can run on a single Nvidia H100 GPU
- Revolutionary context window handling up to 10 million tokens
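The single-H100 claim is plausible with aggressive quantization. A minimal sketch of the memory arithmetic, assuming 4-bit weights and the 80 GB H100 variant, and ignoring KV-cache and activation overhead:

```python
# Back-of-the-envelope memory check for running Scout on one H100.
# Assumes 4-bit (0.5 byte/param) quantized weights; KV cache and
# activations are ignored, so this is a lower bound, not a deployment guide.
total_params = 109e9            # Scout's total parameter count
bytes_per_param = 0.5           # int4 quantization (assumption)
h100_memory_gb = 80             # 80 GB H100 variant

weights_gb = total_params * bytes_per_param / 1e9
print(f"~{weights_gb:.1f} GB of weights vs {h100_memory_gb} GB of HBM")
# At 16-bit (2 bytes/param) the weights alone would be ~218 GB and not fit.
```

So the weights fit at 4 bits (~54.5 GB), but only with quantization; full-precision inference would need multiple GPUs.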
- Llama 4 Maverick
- Size: Medium
- 400 billion total parameters, 128 experts
- Efficient with 17 billion active parameters
- Cost-efficient: 19 cents per million input/output tokens
- Competitive performance in coding and reasoning tasks
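The mixture-of-experts numbers above can be sanity-checked with simple arithmetic: only a small fraction of Maverick's parameters are active per token, which is where the cost efficiency comes from. The 2 million token request size below is a made-up example:

```python
# Active-parameter fraction for Maverick's mixture-of-experts design.
active_params = 17e9
total_params = 400e9
active_fraction = active_params / total_params
print(f"Active fraction per token: {active_fraction:.2%}")  # a few percent

# Cost at the quoted 19 cents per million tokens;
# the 2M-token workload is hypothetical.
price_per_million = 0.19        # USD
tokens_used = 2_000_000         # hypothetical workload
cost = tokens_used / 1_000_000 * price_per_million
print(f"Estimated cost: ${cost:.2f}")
```

Routing each token through only ~4% of the parameters is what lets a 400B-parameter model price like a much smaller one.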
- Llama 4 Behemoth
- Size: Large
- 2 trillion total parameters, 288 billion active parameters
- Outperforms Gemini 2.0 Pro on benchmarks while still in training (available only as a preview)
Advantages of Open Source Models
- Developers have flexibility and control with open-source models over closed-source platforms.
- Opportunities for self-hosting and fine-tuning.
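As one concrete illustration of that self-hosting flexibility: open-model servers such as vLLM typically expose an OpenAI-compatible HTTP API, so a locally hosted Llama 4 can be queried with a plain JSON payload. The endpoint URL is a placeholder, and the model id should be checked against the actual Hugging Face repo name:

```python
import json
import urllib.request

# Hypothetical local endpoint; servers like vLLM expose an
# OpenAI-compatible /v1/chat/completions route when self-hosting.
URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",  # verify exact repo id
    "messages": [{"role": "user", "content": "Summarize Llama 4 in one sentence."}],
    "max_tokens": 128,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request; it is left out here
# since this sketch assumes no server is actually running.
print(json.dumps(payload, indent=2))
```

Because the request shape matches the OpenAI API, existing client code can often be pointed at a self-hosted model by changing only the base URL.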
Resources to Try Llama 4
- Meta AI website: meta.ai
- Model weights can be downloaded from Hugging Face, or requested directly from Meta, if your hardware permits running them.
- Groq (chat.groq.com) platform for testing the models interactively.
- Access through Meta-owned applications like WhatsApp, Messenger, and Instagram.