I want to do research on the content correlation of short sentences. For example, sentence 1 is “The movie xxx is very beautiful” and sentence 2 is “The plot of the TV series xxx is very rich”. These two sentences are semantically inconsistent, but they are both about videos, so they are related. I have a lot of this data, and I want to complete the sentence correlation detection judgment, but the existing models are all about similarity.
I try to use sentence bert but it does not achieve my purpose, because it is to do sentence semantic similarity; Secondly, I think to find the key words of the sentence, and then do the theme mapping. But some key words in short sentences do not express the topic, and some proper nouns do not know how to identify? If the two sentences do not write a movie or a TV series, only write sentence 1 “xxx(the name of the movie) the plot is very rich” sentence 2 “xxx(the name of the TV series) the actor’s acting is very good” semantic is not consistent, but it is a theme “video”. How to map sentence content and topic? Consider methods other than text classification.
user25520304 is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.