My understanding is that hash maps allow us to link, say, a string to a certain memory location. But if every string were linked to a unique place in memory, it would need a huge block of empty memory. I don't get it.
Hash functions are used to convert the input (usually a string) to a smaller, typically fixed-length "hashed value", which is then used somewhat like an array index into the hash table. Ideally these values should be spread evenly throughout their range, without any obvious or significant concentrations.
A hash function is any algorithm or subroutine that maps large data sets of variable length, called keys, to smaller data sets of a fixed length.
You have been misinformed with regard to the "certain memory location". The purpose of such a function is to reduce a large piece of data to a much smaller value that can be used to identify that data. Since you are reducing the size of the data, however, it is likely that more than one input will have the same hashed value. That is what is called a collision.
A valid hash function (not a very good one, but still valid) could be one that sums the ASCII codes of the letters of the string.
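To make that concrete, here is a minimal sketch of such a summing hash. The function name `ascii_sum_hash` is made up for illustration, and it uses Python's `ord`, which matches the ASCII code for plain ASCII text. It also shows why this is a weak hash: any two strings with the same letters in a different order collide.

```python
def ascii_sum_hash(text: str) -> int:
    """Hypothetical toy hash: add up the character codes of the string."""
    return sum(ord(ch) for ch in text)


# Anagrams contain the same letters, so their sums are identical: a collision.
print(ascii_sum_hash("listen"))  # 655
print(ascii_sum_hash("silent"))  # 655
```

In a real table the result would still have to be reduced to the table's size, for example with the modulo operator.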
Hash and map are two different concepts. (And hence two tags hashing and map :-)).
A map (in this context) is a collection of key-value pairs, where the key could be anything and the value could be anything. Once you have created a map, it should be able to retrieve the value for a given key. A simple map could store the key-value pairs in an array or a linked list. A map can be implemented using any storage technique. Popular choices are red-black trees and hashing.
Hashing is converting anything into a compact representation (most of the time as random-looking as possible) that is limited to some range. Hashing is one-way, meaning that once you hash A into B there is no guaranteed way to get A back from B.
A hash function is one that takes A as input and returns B as output. Two properties of hash functions are important.
- It should always return the same B given the same A.
- Ideally it should return a different B for a different A. But this is impossible if the range of B is smaller than the range of A.
Getting a good hash function is really difficult, because maintaining property 2 is hard. One of the simplest hash functions is the modulo operator. There are a good many hash algorithms and libraries.
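As a rough illustration of the modulo idea and of both properties, here is a small sketch. The name `modulo_hash` and the table size of 10 are arbitrary choices for the example, not anything standard.

```python
def modulo_hash(key: int, table_size: int = 10) -> int:
    """Hypothetical toy hash: reduce an integer key to the range 0..table_size-1."""
    return key % table_size


# Property 1: the same input always gives the same output.
assert modulo_hash(42) == modulo_hash(42)

# Property 2 cannot hold in general: there are more possible keys than slots,
# so distinct keys such as 7 and 17 end up with the same hash.
print(modulo_hash(7), modulo_hash(17), modulo_hash(42))  # 7 7 2
```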
Now what do you get when you combine these two? A hash map. The key is converted to a hash, and the value is stored against that hash. So if I have a pair (A, A'), I will convert A to hash B and store (B, A'). Now if I want to get the value stored against key A, I will first convert A to B, then see what is stored against B and return it.
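The store-and-look-up steps above can be sketched roughly as follows. This is only an illustration, not how any particular library does it: the class name `SimpleHashMap` and its methods are invented here, it relies on Python's built-in `hash`, and it assumes collisions are handled by keeping a small list of (key, value) pairs per slot (often called separate chaining). Note that the table has a small, fixed number of slots rather than one memory location per possible string.

```python
class SimpleHashMap:
    """Illustrative hash map: a fixed list of buckets, each holding (key, value) pairs."""

    def __init__(self, bucket_count: int = 8) -> None:
        # A small fixed table; no slot is reserved per possible key.
        self._buckets = [[] for _ in range(bucket_count)]

    def _index(self, key) -> int:
        # Convert the key A to a bucket index B within the table's range.
        return hash(key) % len(self._buckets)

    def put(self, key, value) -> None:
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))  # new key, possibly sharing the bucket

    def get(self, key, default=None):
        # Hash the key again, then search only the one bucket it maps to.
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default


ages = SimpleHashMap(bucket_count=2)  # tiny table, so collisions are guaranteed
ages.put("alice", 30)
ages.put("bob", 25)
ages.put("carol", 41)
print(ages.get("bob"))    # 25
print(ages.get("dave"))   # None
```

Storing the original key next to the value is what lets `get` tell colliding keys apart.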
The advantage of using a hash map is fast retrieval. It is O(1) on average, i.e. with a reasonable hash function I can find in constant time whether a key is present and what value is stored against it. Compare that to a red-black tree, where the time complexity of retrieval is O(log n). (There is also the cost of inserting and deleting, which I am not discussing here.)
So now tell me, what is your question?