I have a set of songs where I want to sort them by a particular quality. In order to do this (by crowdsourcing) I will present users with a comparison between two songs. The user will then choose which one ranks higher.
What algorithm can I use to weight the different comparisons? Given that I’ll have a data set of comparisons between two points such as…
[ A >= B : 2
A <= B : 0
A >= C : 1
A <= C : 4
...
( two dimensional array of size 2 * |songs - 1|^2 )
...
]
2
One possible formalization is as a longest path problem. Construct a directed graph in which nodes are labeled with songs, and for each ordered pair of songs (A, B) there is a weighted edge A -> B with weight equal to the number of votes for A < B. Then the longest simple path is a total order that agrees with the maximum number of votes. If the longest path contains the edge A -> B, then the consensus is that A < B.
Unfortunately, this problem is NP-hard unless the graph is acyclic (ie. no inconsistent rankings in your database). You may still be able to solve it in a reasonable amount of time depending on how many songs there are.
The problem doesn’t require an exact solution. And opinions are not an exact mesure anyway. You could simply count +1 every time a song is preferred and -1 every time it is not preferred.
A B C
A >= B : 2 +2 -2
A <= B : 0 0 0
A >= C : 1 +1 -1
A <= C : 4 -4 +4
----------
-1 -2 +3
This would give the order C > A > B.