Is it reasonable to compute a score and apply a threshold to decide whether the output of a generative model is good or not?
I want to filter out bad outputs before they are shown to the user.
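My idea is to compute a confidence score as the product of the token probabilities:
confidence = p(x_1) * p(x_2) * ... * p(x_n), for i = 1..n
and then normalize it by the sequence length n, since a raw product shrinks as the sequence gets longer.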
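
To make the idea concrete, here is a rough sketch of what I have in mind. It assumes I can get the per-token probabilities from the model as a plain list of floats greater than 0. The function names and the 0.5 threshold are made up for illustration, and I am reading "normalize by the length" as taking the geometric mean, computed in log space to avoid underflow on long sequences.

```python
import math

def normalized_confidence(token_probs):
    """Geometric mean of per-token probabilities: (p_1 * ... * p_n) ** (1/n)."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    # Sum logs instead of multiplying probabilities to avoid underflow.
    # Assumes every probability is > 0 (math.log raises ValueError on 0).
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / len(token_probs))

def keep_output(token_probs, threshold=0.5):
    """Return True if the output passes the (hypothetical) confidence threshold."""
    return normalized_confidence(token_probs) >= threshold

# Example with made-up probabilities
confident = [0.9, 0.8, 0.95]
uncertain = [0.2, 0.1, 0.3]
print(keep_output(confident), keep_output(uncertain))
```

The threshold would presumably have to be tuned on outputs that have already been judged good or bad, since I do not know of a universal value.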
I have heard of perplexity, but I am not sure whether it can be computed without ground truth.
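
As far as I understand, perplexity is the exponential of the negative average log-probability per token. If that is right, it could be computed on the model's own output from the log-probabilities it assigned to the tokens it generated, with no reference text. A minimal sketch under that assumption (the function name is hypothetical, and the input is assumed to be a list of natural-log token probabilities):

```python
import math

def self_perplexity(token_logprobs):
    """Perplexity of a sequence from its per-token natural-log probabilities.

    Computed as exp(-(1/n) * sum(log p_i)). Lower means the model was
    more confident in the tokens it produced.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-mean_logprob)

# Example with made-up values: every token had probability 0.5
logprobs = [math.log(0.5)] * 4
print(self_perplexity(logprobs))
```

Computed this way it would only measure how confident the model was in its own output, not whether the output is actually correct, which is part of what I am unsure about.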