Evals for AI Agents
1. What are Agent Evaluations?
1. Introduction to Evals
2. Generic vs. Targeted Evals
3. Regression Testing - Online vs Offline Evals
4. The 3 Data Pillars of Evaluation
Wrap Up Quiz
Practical 1: Intro to Course Project
2. Human-in-the-Loop Evals
6. Human-in-the-Loop Evaluation
7. Designing Human Evaluations
8. From Annotations to Patterns
Wrap Up Quiz
Practical 2: Observing and Annotating Your Traces
3. LLM-as-Judge
10. LLM-as-a-Judge
11. When to Use LLM-as-Judge
12. Building Effective Judge Prompts
Wrap Up Quiz
Practical 3: Creating Evaluators from Issues
4. Programmatic Rules
14. Programmatic Rule Evaluations
15. When to Use Programmatic Rules
16. Designing Effective Programmatic Rules
17. Integrating the 3 Types of Evals
Wrap Up Quiz
Practical 4: Creating a Golden Dataset
You made it!
Get Your Certificate
1. What are Agent Evaluations?
Learn the fundamentals of AI evals, the three core types, and the data you need to get started.
6 Lessons
1. Introduction to Evals
2. Generic vs. Targeted Evals
3. Regression Testing - Online vs Offline Evals
4. The 3 Data Pillars of Evaluation
Wrap Up Quiz
Practical 1: Intro to Course Project