TaskAdherenceEvaluator Class
Note
This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
The Task Adherence evaluator assesses whether an AI assistant's actions fully align with the user's intent and fully achieve the intended goal across three dimensions:
Goal adherence: Did the assistant achieve the user's objective within scope and constraints?
Rule adherence: Did the assistant respect safety, privacy, authorization, and presentation contracts?
Procedural adherence: Did the assistant follow required workflows, tool use, sequencing, and verification?
The evaluator returns a boolean flag indicating whether there was any material failure in any dimension. A material failure is an issue that makes the output unusable, creates verifiable risk, violates an explicit constraint, or is a critical issue as defined in the evaluation dimensions.
The evaluation includes step-by-step reasoning and a flagged boolean result.
Constructor
TaskAdherenceEvaluator(model_config, *, threshold=0, credential=None, **kwargs)
Parameters
| Name | Description |
|---|---|
|
model_config
Required
|
Configuration for the Azure OpenAI model. |
Keyword-Only Parameters
| Name | Description |
|---|---|
|
threshold
|
Default value: 0
|
|
credential
|
Default value: None
|
Examples
Initialize and call TaskAdherenceEvaluator using Azure AI Project URL in the following format https://{resource_name}.services.ai.azure.com/api/projects/{project_name}
import os
from azure.ai.evaluation import TaskAdherenceEvaluator
model_config = {
"azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"), # https://<account_name>.services.ai.azure.com
"api_key": os.environ.get("AZURE_OPENAI_KEY"),
"azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
}
task_adherence_evaluator = TaskAdherenceEvaluator(model_config=model_config)
query = [
{"role": "system", "content": "You are a helpful customer service agent."},
{
"role": "user",
"content": [{"type": "text", "text": "What is the status of my order #123?"}],
},
]
response = [
{
"role": "assistant",
"content": [
{
"type": "tool_call",
"tool_call": {
"id": "tool_001",
"type": "function",
"function": {
"name": "get_order",
"arguments": {"order_id": "123"},
},
},
}
],
},
{
"role": "tool",
"tool_call_id": "tool_001",
"content": [
{
"type": "tool_result",
"tool_result": '{ "order": { "id": "123", "status": "shipped" } }',
}
],
},
{
"role": "assistant",
"content": [{"type": "text", "text": "Your order #123 has been shipped."}],
},
]
task_adherence_evaluator(query=query, response=response)
Attributes
id
Evaluator identifier, experimental and to be used only with evaluation in cloud.
id = 'azureai://built-in/evaluators/task_adherence'