We’re trying to migrate our LangSmith-JS tests from using runOnDataset() to evaluate(). Unfortunately, the documentation does not seem very clear to me, at least not for JavaScript.
Here is what we used to do:
const correctnessEvaluator = LabeledCriteria('correctness', {
  formatEvaluatorInputs: (payload) => ({
    prediction: payload.rawPrediction.output,
    input: JSON.stringify(payload.rawInput),
    reference: payload.rawReferenceOutput.output,
  }),
});
return await runOnDataset(
  getTextOutputFromBot,
  datasetName,
  {
    evaluators: [correctnessEvaluator],
  },
);
The above worked and gave us results in LangSmith in the Correctness column.
Here is what I am trying to run now:
return await evaluate(
  async (inputs) => await getTextOutput({ question: inputs.question }),
  {
    data: dataset,
    evaluators: [correctnessEvaluator],
    experimentPrefix: dataset.substring(0, dataset.indexOf('Dataset')),
    numRepetitions: 1,
  },
);
I get the following error when I run the above:
console.error
Error running evaluator evaluateRun on run bc7b97db-45db-4f30-a66d-a8099dd29228: TypeError: evaluator is not a function
146 | summaryEvaluators?: any;
147 | }) => {
> 148 | return await evaluate(async (inputs) =>
| ^
149 | await getTextOutput({ question: inputs.question }), {
150 | data: dataset,
151 | evaluators: Array.isArray(evaluators) ? evaluators : [evaluators],
at _ExperimentManager._runEvaluators (node_modules/langsmith/dist/evaluation/_runner.cjs:418:25)
at _ExperimentManager._score (node_modules/langsmith/dist/evaluation/_runner.cjs:439:17)
at makeIter (node_modules/langsmith/dist/utils/atee.cjs:9:32)
at node_modules/langsmith/dist/evaluation/_runner.cjs:309:34
at _ExperimentManager.getResults (node_modules/langsmith/dist/evaluation/_runner.cjs:339:30)
at ExperimentResults.processData (node_modules/langsmith/dist/evaluation/_runner.cjs:608:26)
at _evaluate (node_modules/langsmith/dist/evaluation/_runner.cjs:646:5)
Can anyone tell me what I am doing wrong here?
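One possible reading of the error, offered as a guess rather than a confirmed diagnosis: "evaluator is not a function" suggests that evaluate() tries to call each entry in evaluators directly, while LabeledCriteria() appears to return a configuration object intended for runOnDataset() rather than a callable. Below is a minimal sketch of a plain-function evaluator that takes the run and the dataset example and returns a key and a score. The name exactMatchCorrectness is hypothetical, exact string matching is only a stand-in for the LLM-graded correctness criterion, and the `output` field on both sides is an assumption taken from the formatEvaluatorInputs mapping above.

```javascript
// Sketch only: a plain-function evaluator of the shape evaluate() seems to call.
// `exactMatchCorrectness` is a hypothetical name, and exact string matching is a
// stand-in for the LLM-graded "correctness" criterion used with runOnDataset().
// Assumption: the run's outputs and the example's reference outputs both carry
// an `output` field, mirroring the formatEvaluatorInputs mapping in the question.
function exactMatchCorrectness(run, example) {
  const prediction = String(run?.outputs?.output ?? "").trim();
  const reference = String(example?.outputs?.output ?? "").trim();
  return {
    key: "correctness", // feedback key, i.e. the column name shown in LangSmith
    score: prediction !== "" && prediction === reference ? 1 : 0,
  };
}

// Local check with hand-made stand-ins for a run and a dataset example.
const fakeRun = { outputs: { output: "Paris" } };
const fakeExample = { outputs: { output: "Paris" } };
console.log(exactMatchCorrectness(fakeRun, fakeExample));
```

If this reading is right, passing a function like this as evaluators: [exactMatchCorrectness] in the evaluate() call should get past the TypeError. Keeping the LLM-graded criterion would then mean calling the grader from inside such a function.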