General purpose evaluators
This article refers to the Microsoft Foundry (new) portal.
The Microsoft Foundry SDK for evaluation and the Foundry portal are in public preview. The APIs are generally available for model and dataset evaluation; agent evaluation remains in public preview. Evaluators marked (preview) in this article are in public preview everywhere.
Coherence
The coherence evaluator measures the logical and orderly presentation of ideas in a response, allowing the reader to easily follow and understand the writer's train of thought. A coherent response directly addresses the question with clear connections between sentences and paragraphs, using appropriate transitions and a logical sequence of ideas. Higher scores mean better coherence.

Fluency
The fluency evaluator measures the effectiveness and clarity of written communication. This measure focuses on grammatical accuracy, vocabulary range, sentence complexity, coherence, and overall readability. It assesses how smoothly ideas are conveyed and how easily the reader can understand the text.

Using general-purpose evaluators
General-purpose evaluators assess the quality of AI-generated text independent of specific use cases. Examples:

| Evaluator | What it measures | Required inputs | Required parameters |
|---|---|---|---|
| `builtin.coherence` | Logical flow and organization of ideas | `query`, `response` | `deployment_name` |
| `builtin.fluency` | Grammatical accuracy and readability | `response` | `deployment_name` |
Example input
Your test dataset should contain the fields referenced in your data mappings.

Configuration example
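The exact request schema depends on the evaluations API version you target; as a minimal sketch, an evaluator entry built from the table above (the `builtin.coherence` name, its required inputs, and the `deployment_name` parameter) might be assembled as a plain dictionary. The key names and the deployment name here are illustrative, not a confirmed schema:

```python
import json

# Illustrative only: key names follow this article's table, not a verified API schema.
coherence_criterion = {
    "name": "coherence-check",
    "evaluator_name": "builtin.coherence",      # evaluator from the table above
    "initialization_parameters": {
        "deployment_name": "my-gpt-deployment"  # hypothetical model deployment name
    },
    "data_mapping": {
        "query": "{{item.query}}",              # field from the test dataset
        "response": "{{sample.output_text}}",   # output generated during the evaluation run
    },
}

print(json.dumps(coherence_criterion, indent=2))
```

Each required input from the table is wired to a source through the data mapping, and the required parameter is supplied at initialization.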
Data mapping syntax: `{{item.field_name}}` references fields from your test dataset (for example, `{{item.query}}`). `{{sample.output_text}}` references response text generated or retrieved during evaluation. Use this when evaluating with a model target or agent target.
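To make the mapping concrete, here is a sketch with made-up data: one JSONL row of a test dataset whose keys are addressable as `{{item.<key>}}`, and a toy resolution of an `{{item.query}}` template against that row (the actual evaluation service does this resolution for you):

```python
import json

# One JSONL row of a hypothetical test dataset; each key is addressable as {{item.<key>}}.
row = {
    "query": "What causes tides?",
    "response": "Tides are caused mainly by the Moon's gravitational pull on Earth's oceans.",
}
line = json.dumps(row)  # one line of the .jsonl file
print(line)

# Toy illustration of how a mapping like {"query": "{{item.query}}"} resolves against the row.
template = "{{item.query}}"
field = template.removeprefix("{{item.").removesuffix("}}")
print(row[field])  # -> What causes tides?
```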