Azure AI Studio offers a feature that evaluates Prompt Flow outputs alongside a Prompt Flow run. However, when we run this evaluation locally through the VS Code extension, the process does not fully incorporate the Prompt Flow output collection into the evaluation result. Specifically, the code generated for the evaluation run in Azure AI Studio appears to omit the step that feeds the Prompt Flow outputs into the evaluation.
Currently, we can only execute the evaluation locally against a pre-existing dataset that already contains the question, the ground-truth answer, and the Prompt Flow output. This prevents us from dynamically specifying a new Prompt Flow to evaluate together with a dataset that contains only the question and ground truth.
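For concreteness, the following is roughly the equivalent of what we do via the extension today, expressed with the local Python SDK (a minimal sketch; the flow path, dataset file, and column names are placeholders for our actual setup):

```python
from promptflow import PFClient

pf = PFClient()

# Today we can only evaluate a dataset that already contains the flow
# output: each record must carry the question, the ground truth, AND the
# pre-computed answer. (Paths and column names below are illustrative.)
eval_run = pf.run(
    flow="./eval_flow",              # the evaluation Prompt Flow
    data="./qa_with_outputs.jsonl",  # dataset with pre-baked flow outputs
    column_mapping={
        "question": "${data.question}",
        "ground_truth": "${data.ground_truth}",
        "answer": "${data.answer}",  # output must already be in the file
    },
)
```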
Additionally, when running the evaluation locally via the Prompt Flow VS Code extension, we appreciate being able to visualize the run results in a well-organized table. Unlike in Azure AI Studio, however, there appears to be no option to export these results to CSV or JSON for further analysis and reporting.
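The closest stopgap we have found is to pull the run details through the Python SDK and write them out ourselves, along these lines (assuming `PFClient.get_details`, which in our tests returns a pandas DataFrame; the run name is a placeholder for the name shown in the extension's run list):

```python
from promptflow import PFClient

pf = PFClient()

# "eval_run_name" is a placeholder for the run name from the extension.
details = pf.get_details("eval_run_name")  # DataFrame of inputs/outputs

details.to_csv("eval_results.csv", index=False)
details.to_json("eval_results.json", orient="records", indent=2)
```

This works, but a built-in export from the results table would clearly be preferable.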
Our primary queries are as follows:
- Is there an option or recommended approach for local execution where we can specify a target Prompt Flow to evaluate, together with the corresponding evaluation Prompt Flow and a dataset comprising the question and ground truth? (See the sketch after this list for the workflow we have in mind.)
- Are there any configuration settings, environment variables, or specific steps that we need to follow to ensure a comprehensive evaluation process locally?
- Is there an export feature for the evaluation results in CSV or JSON format within the Prompt Flow VS Code extension, similar to the functionality available in Azure AI Studio?
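For reference, the kind of local workflow we are hoping for (first question above) looks roughly like the following; all names and paths are placeholders, and we are unsure whether the `run=` linkage below is the intended mechanism:

```python
from promptflow import PFClient

pf = PFClient()

# Step 1: run the target Prompt Flow over a dataset that contains only
# the question and ground truth (no pre-computed answers).
base_run = pf.run(
    flow="./my_chat_flow",
    data="./qa.jsonl",
    column_mapping={"question": "${data.question}"},
)

# Step 2: run the evaluation Prompt Flow against the outputs of step 1,
# so the flow output is pulled in dynamically rather than from the file.
eval_run = pf.run(
    flow="./eval_flow",
    data="./qa.jsonl",
    run=base_run,  # link evaluation to the base run's output collection
    column_mapping={
        "question": "${data.question}",
        "ground_truth": "${data.ground_truth}",
        "answer": "${run.outputs.answer}",  # the linked run's output
    },
)
```

If this (or something like it) is the supported pattern for local runs, a pointer to the relevant documentation would be much appreciated.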