1. Evals for AI Agents

  2. 1. What are Agent Evaluations?

    • 1. Introduction to Evals
    • 2. Generic vs. Targeted Evals
    • 3. Regression Testing - Online vs Offline Evals
    • 4. The 3 Data Pillars of Evaluation
    • Wrap Up Quiz
    • Practical 1: Intro to Course Project
  3. 2. Human-in-the-Loop Evals

    • 6. Human-in-the-Loop Evaluation
    • 7. Designing Human Evaluations
    • 8. From Annotations to Patterns
    • Wrap Up Quiz
    • Practical 2: Observing and Annotating Your Traces
  4. 3. LLM-as-Judge

    • 10. LLM-as-a-Judge
    • 11. When to Use LLM-as-Judge
    • 12. Building Effective Judge Prompts
    • Wrap Up Quiz
    • Practical 3: Creating Evaluators from Issues
  5. 4. Programmatic Rules

    • 14. Programmatic Rule Evaluations
    • 15. When to Use Programmatic Rules
    • 16. Designing Effective Programmatic Rules
    • 17. Integrating the 3 Types of Evals
    • Wrap Up Quiz
    • Practical 4: Creating a Golden Dataset
  6. You made it!

    • Get Your Certificate

You made it!

1 Lesson
    • Get Your Certificate