Brokk: Under the Hood

1. Introduction

  • Brokk is an open-source IDE designed for supervising AI coders.
  • It focuses on managing context for long-form coding driven by English instructions, rather than tab-completion alone.
  • The source code is available in the GitHub repository.

2. Quick Context: JLama, MiniLM-L6-v2, and Gemini 2.0 Flash Lite

  • Quick Context Suggestions
    • Shown as blue suggestions while typing/dictating instructions.
    • Can be added to Workspace via right-click.
  • Latency Minimization
    • Uses Gemini 2.0 Flash Lite for speed.
    • GPT 4.1 nano was tested but found to be slower.
    • A 400 ms debounce is the standard approach, but it does not fit programmers' workflow well.
    • Without one, the high frequency of calls led Gemini to automatically blacklist the requests.
  • JLama Integration
    • Java-based inference engine (like llama.cpp).
    • Uses MiniLM-L6-v2 for semantic embeddings (chosen for speed and size).
    • Before making an LLM request, Brokk uses JLama embeddings to check whether the new instructions are semantically distinct from the previous ones.
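
The semantic-distinctness check can be sketched as a cosine-similarity gate. This is a minimal illustration rather than Brokk's actual code: `toy_embed` is a bag-of-words stand-in for a real sentence-embedding model such as MiniLM-L6-v2, and the `SemanticGate` class and its 0.85 threshold are assumptions made for the example.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a sentence-embedding model: word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticGate:
    """Allow an LLM request only when the instructions have changed enough."""

    def __init__(self, threshold=0.85):
        self.threshold = threshold  # assumed value, not taken from Brokk
        self.last_embedding = None  # embedding of the last text that was sent

    def should_request(self, instructions):
        embedding = toy_embed(instructions)
        if (
            self.last_embedding is not None
            and cosine_similarity(embedding, self.last_embedding) >= self.threshold
        ):
            return False  # too similar to the last request: skip the LLM call
        self.last_embedding = embedding
        return True


gate = SemanticGate()
for text in [
    "add retry logic to the http client",
    "add retry logic to the http client please",
    "rename the parser class",
]:
    print(text, "->", gate.should_request(text))
```

The gate compares against the last text that actually triggered a request, so a series of small edits cannot drift past the threshold unnoticed. Comparing against the most recent keystroke state instead would be an equally plausible design.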

3. Deep Scan and Agentic Search: Brute Force and Tool Calls

  • Deep Scan
    • Suggests additional files needed for a task by brute force: the model is given a summary of the entire project.
    • Uses the smarter, slower Edit model, so it runs only when the user requests it.
    • Provides more accurate recommendations than Quick Context.
    • Recommends whether to edit or summarize files.
  • Quick Context vs Deep Scan
    • Both are single-turn inference (one-shot recommendations).
  • Agentic Search
    • Multi-turn process where LLM uses tools to explore codebase.
    • Useful when single-turn methods are insufficient (e.g., large projects, missing details).
    • Slower but yields high-quality results.
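
The difference between single-turn and multi-turn search can be illustrated with a small tool-calling loop. Everything here is hypothetical: the in-memory `CODEBASE`, the `search_symbol` and `read_file` tools, and `scripted_model`, which stands in for an LLM by returning a fixed sequence of tool calls. It is a sketch of the general pattern, not Brokk's implementation.

```python
# Hypothetical three-file project used only for this example.
CODEBASE = {
    "billing/invoice.py": (
        "from billing.tax import compute_tax\n\n"
        "def total(order):\n"
        "    return order.subtotal + compute_tax(order)\n"
    ),
    "billing/tax.py": "def compute_tax(order):\n    return order.subtotal * 0.2\n",
    "ui/banner.py": "def render():\n    return 'hello'\n",
}


def search_symbol(name):
    """Tool: return the paths of files whose text mentions `name`."""
    return sorted(path for path, src in CODEBASE.items() if name in src)


def read_file(path):
    """Tool: return the full text of one file."""
    return CODEBASE[path]


TOOLS = {"search_symbol": search_symbol, "read_file": read_file}


def scripted_model(task, history):
    """Stand-in for an LLM: picks the next step from the results so far."""
    if len(history) == 0:
        return {"tool": "search_symbol", "args": ["compute_tax"]}
    if len(history) == 1:
        hits = history[0][1]
        return {"tool": "read_file", "args": [hits[-1]]}
    return {"answer": history[0][1]}  # final answer: the relevant files


def agentic_search(task, model, max_turns=5):
    """Run tool calls until the model answers or the turn budget runs out."""
    history = []
    for _ in range(max_turns):
        step = model(task, history)
        if "answer" in step:
            return step["answer"], len(history)
        result = TOOLS[step["tool"]](*step["args"])
        history.append((step, result))
    return [], len(history)


files, tool_calls = agentic_search("change the tax rate", scripted_model)
print(files, tool_calls)
```

Each loop iteration is one round trip to the model. That is why this approach is slower than a one-shot recommendation, and why a turn budget such as `max_turns` is a common safeguard.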

4. Code Intelligence: Joern and Tree-sitter

  • Joern as Code Intelligence Engine
    • Chosen after evaluating alternatives (CodeQL, SciTools, SonarSource, SCIP, Semgrep, Tree-sitter, LSP).
    • Criteria: OSS, no special build integration, type inference, speed on large codebases.
    • Downsides: JVM-based (Scala), less uniform API across languages.
  • Tree-sitter Integration
    • Added in Brokk 0.9 for partial support of non-Java languages (Python, JavaScript, C#).
    • Used mainly for summarization to reduce token usage (~10x reduction).
    • Utilizes tree-sitter-ng Java wrapper.

5. Conclusion

  • Brokk lets users supervise AI coding while the AI handles the editing and syntax details.
  • JLama ensures fast suggestions; Joern provides codebase insight; open-source nature allows extensibility.
  • Designed for large enterprise codebases, not just demos or small projects.
  • Try Brokk.
