ThirdBrAIn.tech

Tag: Model-Evaluation

2 items with this tag.

  • Apr 04, 2025

    Evaluation Agents Exploring the Next Frontier of GenAI Evals

    • LLM
    • ai-agent
    • agentic-evaluations
    • galileo-ai
    • AI-development
    • AI-tools
    • autonomous-agents
    • AI-safety
    • Galileo
    • Critique-of-Value-(COV)
    • Critique-of-Explanation-(COE)
    • Binary-Preference-Signal-(BPS)
    • Self-Augmenting-Agents
    • Single-Token-Probability
    • LLM-as-Judge
    • Agent-Evaluation
    • Agentic-Systems
    • RAG-Evaluation
    • RAG
    • Custom-Metrics
    • AI-Developers
    • Hallucinations-(AI)
    • GenAI-Evals
    • Model-Evaluation
    • Chain-of-Thought
    • Cost-Limit
    • Guardrails-(AI)
    • Luna
    • ChainPoll
    • observability
    • YT/2025/M04
    • YT/2025/W14
  • Feb 22, 2025

    AI Agents, Meet Test Driven Development

    • AI
    • Test-Driven-Development
    • Machine-Learning
    • Artificial-Intelligence
    • AI-Testing
    • Reinforcement-Learning
    • Agentic-Workflows
    • Model-Evaluation
    • Software-Development
    • AI-Solutions
    • YT/2025/M02
    • YT/2025/W08
