
Azure AI Foundry: ValueError running evaluators (coherence, fluency), caused by unexpected evaluation LLM outputs #39011

Open
khall-ms opened this issue Jan 2, 2025 · 1 comment
Labels
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • Evaluation: Issues related to the client library for Azure AI Evaluation.
  • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team.
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that.
  • Service Attention: Workflow: This issue is responsible by Azure service team.

Comments


khall-ms commented Jan 2, 2025

  • Package Name: azure.ai.evaluation
  • Package Version: Unknown
  • Operating System: Azure AI Foundry
  • Python Version: python3.9

Describe the bug
When completing the "Evaluate the performance of generative AI apps with Azure AI Foundry" course, the automated Coherence and Fluency evaluations fail with ValueErrors.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the course Exercise
  2. Run the Automated evaluations
  3. Depending on the LLM output, some evaluation runs fail.

Expected behavior

The evaluation LLMs should produce consistently formatted output, and the SDK should provide a common util that effectively catches edge cases when parsing it (see the sketch under Additional context below).

Screenshots
(Three screenshots illustrating the errors were attached to the original issue.)

Additional context
Could be improved by updating/re-evaluating the few-shot prompts (possibly using the promptflow style, or even adding Chain of Thought). Alternatively, the parse_quality_evaluator_reason_score function could be changed to pull out just the number for the scoring component.
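
As a sketch of that second suggestion (a hypothetical helper, not the SDK's actual parse_quality_evaluator_reason_score implementation; the function name, regex, and NaN fallback are all assumptions), the score component could be extracted leniently:

```python
import math
import re


def parse_score_leniently(llm_output: str) -> float:
    """Pull the first number out of an evaluator LLM response.

    Hypothetical helper (not the SDK's actual implementation): rather
    than assuming the model returns a rigidly formatted score, scan the
    raw text for the first integer or decimal. Returning NaN instead of
    raising ValueError keeps one malformed response from failing the
    whole evaluation run.
    """
    match = re.search(r"\d+(?:\.\d+)?", llm_output)
    return float(match.group()) if match else math.nan
```

For example, parse_score_leniently("The coherence score is: 4 out of 5.") would return 4.0, while a response containing no number at all would yield NaN rather than raising.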


github-actions bot commented Jan 2, 2025

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @luigiw @needuv @singankit.
