I have a set of numbers say : 1 1 2 8 5 6 6 7 8 8 4 2…
I want to detect the duplicate element in subsets(of given size say k) of the above numbers…
For example :
Consider the increasing subsets(for example consider k=3)
Subset 1 :{1,1,2}
Subset 2 :{1,2,8}
Subset 3 :{2,8,5}
Subset 4 :{8,5,6}
Subset 5 :{5,6,6}
Subset 6 :{6,6,7}
….
….
So my algorithm should detect that subset 1,5,6 contains duplicates..
My approach : 1)Copy the 1st k elements to a temporary array(vector) 2) using #include file in C++ STL…using unique() I would determine if there’s any change in size of vector..
Any other clue how to approach this problem..
3
Take a subset, add each element into a hash table. If the hash table already contains a value – print out saying there is a duplicate.
A hash table will only allocate 1 memory block per entry. Hence its easy to find if the number already exists. Since these are just regular entries your table will be something like this:
struct Hashtable{
int number;
};
static struct Hashtable hashTable[10];
int getHash(int x){ return x; }
void addHash(int number)
{
if(hashTable[number].number == -1)
{ /*OK*/
hashTable[number].number = number;
}else{
printf("DUPLICATE FOUND! BAILING OUT!");
}
}
void initTable(){ for(int i = 0; i < 10; i++){hashTable[i].number = -1;}}
4
If they’re all sets then by definition they don’t contain duplicates. If they’re eg. lists then you could just make sets of the sublists. In Python 2.x:
items = [1, 1, 2, 8, 5, 6, 6, 7, 8, 8, 4, 2]
for i in xrange(len(items) - 2):
sublist = items[i:i + 3]
if len(set(sublist)) < 3:
print sublist, 'contains duplicates.'
else:
print sublist, 'contains no duplicates.'
Result:
[1, 1, 2] contains duplicates.
[1, 2, 8] contains no duplicates.
[2, 8, 5] contains no duplicates.
[8, 5, 6] contains no duplicates.
[5, 6, 6] contains duplicates.
[6, 6, 7] contains duplicates.
[6, 7, 8] contains no duplicates.
[7, 8, 8] contains duplicates.
[8, 8, 4] contains duplicates.
[8, 4, 2] contains no duplicates.
1