Evals for AI Agents
1. What are Agent Evaluations?
1. Introduction to Evals
2. Generic vs. Targeted Evals
3. Regression Testing - Online vs Offline Evals
4. The 3 Data Pillars of Evaluation
Wrap Up Quiz
Practical 1: Intro to Course Project
2. Human-in-the-Loop Evals
6. Human-in-the-Loop Evaluation
7. Designing Human Evaluations
8. From Annotations to Patterns
Wrap Up Quiz
Practical 2: Observing and Annotating Your Traces
3. LLM-as-Judge
10. LLM-as-a-Judge
11. When to Use LLM-as-Judge
12. Building Effective Judge Prompts
Wrap Up Quiz
Practical 3: Creating Evaluators from Issues
4. Programmatic Rules
14. Programmatic Rule Evaluations
15. When to Use Programmatic Rules
16. Designing Effective Programmatic Rules
17. Integrating the 3 Types of Evals
Wrap Up Quiz
Practical 4: Creating a Golden Dataset
You made it!
Get Your Certificate
3. Regression Testing - Online vs Offline Evals