A simple smile: Data Structure

Showing posts with label Data Structure. Show all posts

February 6, 2013

Hash Table 3 -- Hashing

The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets. 
Given a key (e) and an array--hashes with a size of arraySize, the algorithm-hash function (h) computes an entries to map the key to an index (i) between {0, arraySize-1}, which suggests where the entry can be found.

[1]
[1] Image from Wikipedia, A hash function that maps names to integers from 0 to 15.
There is a collision between keys "John Smith" and "Sandra Dee".

Given:

        e: key
        h: hash function
arraySize: the size of the array
        i: integer between {0, arraySize-1}

Then:

       i = h(e, arraySize)

Often above function is deduced from two steps:

hashCode = h(e)
       i = hashCode % arraySize

1.According to the hash function (h), we got the integer type hashCode derived from (e).
2.Then we got the index which is hashCode % arraySize, because the return number must fall in {0, array_size – 1}, by using a remainder operation (%).

Usually, the range of (e) values is much bigger than arraySize. We can image there might be a worse situation – for two different keys (e1) and (e2), if they have h(e1, arraySize) == h(e2, arraySize), then (e1) and (e2) are mapped into the same index of an array. This calls Collision.

Obviously, the most important part in this calculation is hashCode. The better hash function, the better hashCode is and the less collision we got. 

Generally, a good hash function requires,

Easy to calculate, which means time complexity of hash function is Constant Time or Fixed Time.
Uniformly distribute keys to array to reduce the chance of collision.

Hash Table 2

A Hash Table can map key to value for highly efficient lookup. 
A Hash Table uses a  hash function to compute an index into an array of buckets or slots, from which the correct value can be found.

[1]
[1] Image from Wikipedia, A small phone book as a hash table.

The goal of Hash Table is,

Keys are uniformly distributed in it, regardless of the sequence.
It supports insert, remove and find operations.
Each operation has better time complexity than O(log n).
Then in the next post, the method of  Hashing will be introduced to use on the Hash Table.

January 28, 2013

Javarevisited: 10 example of using ArrayList in Java >>> Java...: ArrayList in Java is most frequently used collection class after HashMap in Java . Java ArrayList represents an automatic re-sizable array ...

January 27, 2013

HASHMAP vs HASHTABLE vs HASHSET

Repost from this blog

I was reading about collection framework of Java. And was studying Hashtable, HashMap and HashSet. Its quite interesting to know the differences between them. In this post I will discuss these three with examples.

Hashtable

Hashtable is basically a datastructure to retain values of key-value pair.

It didn’t allow null for both key and value. You will get NullPointerException if you add null value.
It is synchronized. So it comes with its cost. Only one thread can access in one time

Hashtable<Integer,String>; cityTable = new Hashtable<Integer,String>();
cityTable.put(1, "Lahore");
cityTable.put(2, "Karachi");
cityTable.put(3, null); /* NullPointerEcxeption at runtime*/

System.out.println(cityTable.get(1));
System.out.println(cityTable.get(2));
System.out.println(cityTable.get(3));

HashMap

Like Hashtable it also accepts key value pair.

It allows null for both key and value
It is unsynchronized. So come up with better performance

HashMap<Integer,String> productMap = new HashMap<Integer,String>();
productMap.put(1, "Keys");
productMap.put(2, null);

HashSet

HashSet does not allow duplicate values. It provides add method rather put method. You also use its contain method to check whether the object is already available in HashSet. HashSet can be used where you want to maintain a unique list.

HashSet<String> stateSet = new HashSet<String>();
stateSet.add ("CA");
stateSet.add ("WI");
stateSet.add ("NY");

if (stateSet.contains("PB")) /* if CA, it will not add but shows following message*/
     System.out.println("Already found");
else
    stateSet.add("PB");

Hash Table 1

Hash Table is an important data structure in computer science world. I have looked up several documents, books and articles about it. However, some are too abstract to understand and some are too detailed to digest all the concepts. Here, I would like to make it easier for understanding Hash Table. This article is intended for an introduction of Hash Table and is wrote in the way how I learned and understand the Hash Table from the very beginning, so that later you can quickly pick up other Hash Table knowledge further on your own.

Chapter 1 Set

First of all, A good usage of data structure enables you to have the following operations efficiently, such asadd or insert, remove or delete and look-up or find. In fact, we can go further and say that using Hash Table to execute these above operations is faster than Tree Set, which is also faster than Array.

Velocity: Hash Table > Tree Set > Array

So Hash Table can provide better Time Complexity in Big O notation.
The smaller the number between brackets, the better time complexity is.
E.g. O(100) > O(log 100)

It also means how long it should take to find an element in these three data structure.

Array	Tree Set	Hash Table^[¹^]
O(n)	O(log n)	O(1) / O(1+n/k)^[²^]

1. The sharp difference among them is because Tree Set requires maintaining sequence between elements, whiling Hash Table does not. We will have further discussion on the time complexity for Hash Table later. No worries!
2. k: number of buckets (Refer to WikiPedia)