Context:
I am using NDCG (Normalized Discounted Cumulative Gain) to evaluate a Semantic Search system on a ground truth dataset containing relevance scores. I would like to use sklearn’s ndcg_score() for this.
What are the ways to handle the following two cases?
- False Positive Documents: For a given query, those documents which are present in the search system’s response but not in the ground truth data
- False Negative Documents: For a given query, those documents which are present in the ground truth data but not in the search system’s response
One possibility is to assign a predicted score of 0 to False Negatives and to drop False Positives before scoring. But I am not sure whether this is the correct approach.
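
To make the idea concrete, here is a minimal sketch of that approach. It is only an illustration of the convention described above, not a confirmed best practice. The helper name ndcg_for_query, the dictionary layout, and the sample document IDs and scores are all hypothetical, and the sketch assumes the system's scores are positive, so that 0 ranks a missing document below every retrieved one.

```python
import numpy as np
from sklearn.metrics import ndcg_score


def ndcg_for_query(ground_truth, predicted, k=None):
    """NDCG for one query (hypothetical helper).

    ground_truth: {doc_id: relevance} from the labelled dataset
    predicted:    {doc_id: score} returned by the search system
    """
    # Only judged documents are scored, so False Positives are dropped.
    doc_ids = sorted(ground_truth)
    y_true = np.array([[ground_truth[d] for d in doc_ids]], dtype=float)
    # False Negatives get a predicted score of 0 (assumes real scores are > 0).
    y_score = np.array([[predicted.get(d, 0.0) for d in doc_ids]], dtype=float)
    # ndcg_score expects arrays of shape (n_queries, n_documents).
    return ndcg_score(y_true, y_score, k=k)


# Made-up example: d9 is a False Positive, d2 and d4 are False Negatives.
ground_truth = {"d1": 3, "d2": 2, "d3": 0, "d4": 1}
predicted = {"d1": 0.9, "d3": 0.4, "d9": 0.7}

score = ndcg_for_query(ground_truth, predicted)
print(f"NDCG: {score:.4f}")
```

With this convention, all False Negatives are tied at score 0, and ndcg_score averages the gain over tied documents by default (ignore_ties=False), so their relative order does not matter.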