Evals for AI Agents
1. What are Agent Evaluations?
1. Introduction to Evals
2. Generic vs. Targeted Evals
3. Regression Testing - Online vs Offline Evals
4. The 3 Data Pillars of Evaluation
Wrap Up Quiz
Practical 1: Intro to Course Project
2. Human-in-the-Loop Evals
6. Human-in-the-Loop Evaluation
7. Designing Human Evaluations
8. From Annotations to Patterns
Wrap Up Quiz
Practical 2: Observing and Annotating Your Traces
3. LLM-as-Judge
10. LLM-as-a-Judge
11. When to Use LLM-as-Judge
12. Building Effective Judge Prompts
Wrap Up Quiz
Practical 3: Creating Evaluators from Issues
4. Programmatic Rules
14. Programmatic Rule Evaluations
15. When to Use Programmatic Rules
16. Designing Effective Programmatic Rules
17. Integrating the 3 Types of Evals
Wrap Up Quiz
Practical 4: Creating a Golden Dataset
14. Programmatic Rule Evaluations