Evaluators 🧪

Note that evaluators are currently experimental.

Evaluators automatically validate prompt completions against a set of user-defined rules, e.g. whether the completion starts with a certain phrase, has a length within a specified range, matches a desired format, etc.

Evaluators can be created and configured under the Evaluators tab on the Prompts page.

Evaluator types

Length Evaluators 🧪

Length evaluators validate the length of prompt completions, either to an exact value or to a range (min/max). The unit for the evaluation can be tokens, characters, or words, depending on preference and use case.

Text Evaluators 🧪

Text evaluators validate the exact match or presence/absence of a specific text string at the beginning, end, or anywhere in the prompt completion. Users can also specify whether the evaluation should be case-sensitive or not.
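The matching options described above (position, case sensitivity, presence/absence) can be sketched like this; the names and parameters are illustrative assumptions, not the actual evaluator interface.

```python
# Hypothetical sketch of a text evaluator (illustrative names, not the real API).

def evaluate_text(completion: str, needle: str, position: str = "anywhere",
                  case_sensitive: bool = True,
                  must_be_present: bool = True) -> bool:
    """Check for `needle` at the start, end, or anywhere in the completion."""
    haystack, target = ((completion, needle) if case_sensitive
                        else (completion.lower(), needle.lower()))
    if position == "start":
        found = haystack.startswith(target)
    elif position == "end":
        found = haystack.endswith(target)
    else:  # "anywhere"
        found = target in haystack
    # Invert the result when the rule requires the text to be absent.
    return found if must_be_present else not found
```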

Regex Evaluator

🚧 Coming soon...

Boolean Evaluator

🚧 Coming soon...

Number Evaluator

🚧 Coming soon...

JSON Evaluator

🚧 Coming soon...

Format Evaluator

🚧 Coming soon...

LLM Evaluator

🚧 Coming soon...

Webhook Evaluator

🚧 Coming soon...

Evaluator versioning (history)

Evaluators automatically keep track of their configuration history, so it is always fully traceable which evaluators were used, and with which configuration, for each completion.

The version history of an evaluator automatically increments if the configuration is changed and the evaluator has been executed on at least one completion. If the evaluator has never been executed, updating the configuration will not lead to the creation of a new version.
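The versioning rule above can be illustrated with a small sketch. The class and attribute names are hypothetical, chosen only to show the behavior: a configuration change creates a new version only if the current version has already been executed.

```python
# Hypothetical sketch of the described versioning rule (not the real implementation).

class Evaluator:
    def __init__(self, config: dict):
        self.versions = [config]  # configuration history, newest last
        self.executed = False     # has the current version run on a completion?

    def run(self, completion: str) -> None:
        # Evaluation logic elided; we only track that this version was used.
        self.executed = True

    def update_config(self, config: dict) -> None:
        if self.executed:
            # The current version was used on a completion: preserve it
            # for traceability and append the new configuration.
            self.versions.append(config)
            self.executed = False
        else:
            # Never executed: replace the current version in place.
            self.versions[-1] = config
```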