Learn to automate human-like judgment at scale by using a model to score your agent's outputs against criteria you define.