Binary Search, as we all know requires the elements to be sorted. But we have to take care of unsorted elements too, in the worst case. If the input size is very large, is it a good idea to sort the elements everytime? Can we not just check the elements that they are not sorted or not and proceed to sorting and proceed to sorting only if they are unsorted?
1
checking if a list is sorted defeats the purpose of the binary search (using less comparisons for a O(log n)
running time instead of a O(n)
)
you are better of doing a linear search if you can’t be sure the list is already sorted
I had to deal with something similar a long time ago.
Maintain a large sorted file and a small unsorted file. The small one contains items that haven’t been merged in yet. Search the small one first, and, if that fails, then search the large one. “Every so often”, like when the small file gets to be a certain fraction of the size of the large file, sort the small file and then merge it with the large one.
I was facing this problem just yesterday. Here are three bits of info I can offer that may help:
- Try to keep a database view of the data you need to query that is always sorted. For example, I had a huge (500k records), unsorted data table that contained, among other things, a column for
CompanyName
. In my application, I needed to search these records based on a given company name, which is ripe for a binary search. I simply created a sorted SQL view that contains only theCompanyName
and theid
(which is the primary key in the original table`. Now I query that directly from the application and get the result in under 10ms. - If you can’t ensure ahead of time that your data set is sorted, then make good use of hashtables and look into the “Quick Sort” algorithm. This is a O(log n) algorithm so if you run it first and then a binary search against your data it is surely faster than a linear search.
- Also, try to make good use of caching–if you have already sorted a given dataset, there is no reason to do it again for the next query against it.
Good luck!
2