TaskAdherenceEvaluator Class

Note

This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.

The Task Adherence evaluator assesses whether an AI assistant's actions fully align with the user's intent and fully achieve the intended goal across three dimensions:

  • Goal adherence: Did the assistant achieve the user's objective within scope and constraints?

  • Rule adherence: Did the assistant respect safety, privacy, authorization, and presentation contracts?

  • Procedural adherence: Did the assistant follow required workflows, tool use, sequencing, and verification?

The evaluator returns a boolean flag indicating whether there was any material failure in any dimension. A material failure is an issue that makes the output unusable, creates verifiable risk, violates an explicit constraint, or is a critical issue as defined in the evaluation dimensions.

The evaluation includes step-by-step reasoning and a flagged boolean result.
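
For illustration, the evaluator returns a dictionary containing the flagged boolean and the accompanying reasoning. The sketch below is a minimal, hypothetical example; the key names are placeholders, not the documented output schema, and the exact fields are visible when you print a real result (see Examples).

   # Illustrative sketch only: key names below are hypothetical placeholders,
   # not the evaluator's documented output schema.
   example_result = {
       "task_adherence_flagged": False,  # True if any material failure was found
       "task_adherence_reason": "The assistant retrieved the order status and reported it accurately.",
   }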

Constructor

TaskAdherenceEvaluator(model_config, *, threshold=0, credential=None, **kwargs)

Parameters

Name            Description
model_config    Required. Configuration for the Azure OpenAI model.

Keyword-Only Parameters

Name            Description
threshold       Default value: 0
credential      Default value: None
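
As a minimal sketch, the keyword-only parameters can be passed explicitly at construction time. The DefaultAzureCredential usage below assumes the azure-identity package is installed and that model_config is defined as in the Examples section.

   from azure.identity import DefaultAzureCredential
   from azure.ai.evaluation import TaskAdherenceEvaluator

   # Sketch only: supply the optional keyword-only parameters explicitly.
   task_adherence_evaluator = TaskAdherenceEvaluator(
       model_config=model_config,             # Azure OpenAI model configuration (see Examples)
       threshold=0,                           # default threshold value
       credential=DefaultAzureCredential(),   # token credential for Azure authentication
   )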

Examples

Initialize and call TaskAdherenceEvaluator using an Azure AI Project URL in the following format: https://{resource_name}.services.ai.azure.com/api/projects/{project_name}


   import os
   from azure.ai.evaluation import TaskAdherenceEvaluator

   model_config = {
       "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),  # https://<account_name>.services.ai.azure.com
       "api_key": os.environ.get("AZURE_OPENAI_KEY"),
       "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
   }

   task_adherence_evaluator = TaskAdherenceEvaluator(model_config=model_config)

   query = [
       {"role": "system", "content": "You are a helpful customer service agent."},
       {
           "role": "user",
           "content": [{"type": "text", "text": "What is the status of my order #123?"}],
       },
   ]

   response = [
       {
           "role": "assistant",
           "content": [
               {
                   "type": "tool_call",
                   "tool_call": {
                       "id": "tool_001",
                       "type": "function",
                       "function": {
                           "name": "get_order",
                           "arguments": {"order_id": "123"},
                       },
                   },
               }
           ],
       },
       {
           "role": "tool",
           "tool_call_id": "tool_001",
           "content": [
               {
                   "type": "tool_result",
                   "tool_result": '{ "order": { "id": "123", "status": "shipped" } }',
               }
           ],
       },
       {
           "role": "assistant",
           "content": [{"type": "text", "text": "Your order #123 has been shipped."}],
       },
   ]

   task_adherence_evaluator(query=query, response=response)
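
The call returns a dictionary with the evaluation output. As a small follow-up sketch, you can capture and print the result to inspect the flagged value and reasoning (default=str is used defensively in case any value is not JSON-serializable).

   import json

   result = task_adherence_evaluator(query=query, response=response)
   print(json.dumps(result, indent=2, default=str))  # inspect the flag and reasoning fields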

Attributes

id

Evaluator identifier. Experimental; intended for use only with evaluation in the cloud.

id = 'azureai://built-in/evaluators/task_adherence'