HASHING TABLE - Data Structure

Hash Table is a data structure which stores data in an associative manner. In a hash table, data is stored in an array format, where each data value has its own unique index value. Access of data becomes very fast if we know the index of the desired data.
Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of the size of the data. Hash Table uses an array as a storage medium and uses hash technique to generate an index where an element is to be inserted or is to be located from.

Hashing

Hashing is a technique to convert a range of key values into a range of indexes of an array. We're going to use modulo operator to get a range of key values. Consider an example of hash table of size 20, and the following items are to be stored. Item are in the (key,value) format.
Hash Function
  • (1,20)
  • (2,70)
  • (42,80)
  • (4,25)
  • (12,44)
  • (14,32)
  • (17,11)
  • (13,78)
  • (37,98)
Sr.No.KeyHashArray Index
111 % 20 = 11
222 % 20 = 22
34242 % 20 = 22
444 % 20 = 44
51212 % 20 = 1212
61414 % 20 = 1414
71717 % 20 = 1717
81313 % 20 = 1313
93737 % 20 = 1717

Linear Probing

As we can see, it may happen that the hashing technique is used to create an already used index of the array. In such a case, we can search the next empty location in the array by looking into the next cell until we find an empty cell. This technique is called linear probing.
Sr.No.KeyHashArray IndexAfter Linear Probing, Array Index
111 % 20 = 111
222 % 20 = 222
34242 % 20 = 223
444 % 20 = 444
51212 % 20 = 121212
61414 % 20 = 141414
71717 % 20 = 171717
81313 % 20 = 131313
93737 % 20 = 171718

Basic Operations

Following are the basic primary operations of a hash table.
  • Search − Searches an element in a hash table.
  • Insert − inserts an element in a hash table.
  • delete − Deletes an element from a hash table.
  • Hash Function: A function that converts a given big phone number to a small practical integer value. The mapped integer value is used as an index in hash table. In simple terms, a hash function maps a big number or string to a small integer that can be used as index in hash table.
    A good hash function should have following properties
    1) Efficiently computable.
    2) Should uniformly distribute the keys (Each table position equally likely for each key)
    For example for phone numbers a bad hash function is to take first three digits. A better function is consider last three digits. Please note that this may not be the best hash function. There may be better ways.
    Hash Table: An array that stores pointers to records corresponding to a given phone number. An entry in hash table is NIL if no existing phone number has hash function value equal to the index for the entry.
    Collision Handling: Since a hash function gets us a small number for a big key, there is possibility that two keys result in same value. The situation where a newly inserted key maps to an already occupied slot in hash table is called collision and must be handled using some collision handling technique. Following are the ways to handle collisions:
    • Chaining:The idea is to make each cell of hash table point to a linked list of records that have same hash function value. Chaining is simple, but requires additional memory outside the table.
    • Open Addressing: In open addressing, all elements are stored in the hash table itself. Each table entry contains either a record or NIL. When searching for an element, we one by one examine table slots until the desired element is found or it is clear that the element is not in the table.

DataItem

Define a data item having some data and key, based on which the search is to be conducted in a hash table.
struct DataItem {
   int data;
   int key;
};

Hash Method

Define a hashing method to compute the hash code of the key of the data item.
int hashCode(int key){
   return key % SIZE;
}

Search Operation

Whenever an element is to be searched, compute the hash code of the key passed and locate the element using that hash code as index in the array. Use linear probing to get the element ahead if the element is not found at the computed hash code.

Example

struct DataItem *search(int key) {
   //get the hash
   int hashIndex = hashCode(key);
 
   //move in array until an empty
   while(hashArray[hashIndex] != NULL) {
 
      if(hashArray[hashIndex]->key == key)
         return hashArray[hashIndex];
   
      //go to next cell
      ++hashIndex;
  
      //wrap around the table
      hashIndex %= SIZE;
   }

   return NULL;        
}

Insert Operation

Whenever an element is to be inserted, compute the hash code of the key passed and locate the index using that hash code as an index in the array. Use linear probing for empty location, if an element is found at the computed hash code.

Example

void insert(int key,int data) {
   struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem));
   item->data = data;  
   item->key = key;     

   //get the hash 
   int hashIndex = hashCode(key);

   //move in array until an empty or deleted cell
   while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1) {
      //go to next cell
      ++hashIndex;
  
      //wrap around the table
      hashIndex %= SIZE;
   }
 
   hashArray[hashIndex] = item;        
}

Delete Operation

Whenever an element is to be deleted, compute the hash code of the key passed and locate the index using that hash code as an index in the array. Use linear probing to get the element ahead if an element is not found at the computed hash code. When found, store a dummy item there to keep the performance of the hash table intact.

Example

struct DataItem* delete(struct DataItem* item) {
   int key = item->key;

   //get the hash 
   int hashIndex = hashCode(key);

   //move in array until an empty 
   while(hashArray[hashIndex] !=NULL) {
 
      if(hashArray[hashIndex]->key == key) {
         struct DataItem* temp = hashArray[hashIndex]; 
   
         //assign a dummy item at deleted position
         hashArray[hashIndex] = dummyItem; 
         return temp;
      } 
  
      //go to next cell
      ++hashIndex;
  
      //wrap around the table
      hashIndex %= SIZE;
   }  
 
   return NULL;        }
Source: http://student-activity.binus.ac.id/himmat/2019/04/data-stucture/


Comments

Popular Posts