I’m looking for the most efficient implementation choice to select the top k nodes with the highest values from an array of size N, which could potentially be very large due to containing PageRank values for each node. Each node is represented by its index in the array, and I need to return the top k highest values along with their corresponding nodes. Using qsort directly on the array to sort it might not be efficient, as it can have a worst-case complexity of O(n^2), although it should be O(NlogN), but I could have a milion nodes. And I always need to know the node associated with each value, so I would need to create a dedicated structure for the node-value pair. Another option I’ve considered is using a min heap, which would be more efficient compared to the O(n^2) complexity of qsort. Or could it make sense to implement sorting using threads, where each thread sorts a portion of the array, followed by a merge to combine the sorted parts?
Could you suggest something? I’m not sure how to improve in terms of efficiency
Maria Ferri is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.