Relative Content

Tag Archive for pythonmachine-learningmetricslarge-language-modelcosine-similarity

What is the best metrics?

I want to compare two LLM model that answer a question and verify if the answer is correct. It’s challenging to determine with certainty if an answer is truly correct because the models might use synonyms. Comparing their response to the correct answer is therefore complicated. However, I’d like to know which metric would be best for checking this?