Java Collection Framework

A framework is a collection of classes and interfaces that facilitate development of specific kind of applications by providing an architectural model and reusable components.

Collection framework provides a unified model to work with different types of data structures such as arrays, linked list, BST, Hash Table etc.

At the core of Collection framework is an interface named java.util.Collection. This interface describes the common behaviour of all collections.

Methods of Collection interface –

a) add()             d) removeAll()     g) containsAll()

b) addAll()         e) retainAll()        h) size()

c) remove()       f) contains()         i) clear()

Iterator:- is an interface of Collection framework implementation of which are provided by all Collections. It is used for traversing the elements of Collection in implementation independent fashion.

Methods of iterator interface –

a) hasNext() – returns true if there is an element to be traversed.

b) next() – used to obtain next element from collection.

c) remove() – used to remove last traversed element from the collection.

List 

List represents indexed collection. Some commonly used methods are –

a) add()              d) indexOf()

b) addAll()          e) lastIndexOf()

c) get()               f) listIterator()

ListIterator facilitate traversing element of list in both directions. ListIterator extends iterator and adds following methods –

a) hasPrevious() – returns true if there is an element behind current index.

public boolean hasPrevious();

b) previous() – public Object previous()

c) add() – used to insert an element at the current index in a list.

public void add(Object element)

ArrayList

ArrayList provides implementation of List interface through Abstract List class. An object of array list represents a dynamic array.

Set

Non-indexed collection of unique element. Functionality of set represented by Set Interface which extends Collection interface but doesn’t define any additional method. It simply specifies that implementation of add method must be provided in such a manner that duplicate elements are not added.

HashSet and TreeSet classes provides implementation of Set interface through Abstract Set class.

Difference between HashSet and TreeSet is in their internal data structure. HashSet uses hash table for storing elements and TreeSet uses BST.

Map

It is a collection in which values are stored by associating them with keys which are used to retrieve values.

Functionality of Map is described by Map Interface which contains following methods –

1) put() – is used to store a key value pair in the map.

public boolean put(Object key, Object value)

2) get() – is used to obtain a value from a map.

public Object get(Object key)

3) remove() – is used to remove key-value pair from the map.

public boolean remove(Object key)

4) containsKey() – is used to search a key in the map

public boolean containsKey(Object key)

5) contains() – is used to search a value in the map

public boolean contains(Object value)

6) size() – used to find out number of elements in map.

public int size()

7) keySet() – returns a set for traversing the keys of a map.

public Set keySet();

8) entrySet() – returns a set for all the key value pairs of the Map.

public Set EntrySet();

The set that is returned by the entrySet() method contains objects of type Map.Entry. Entry is a nested interface of Map. All implementation of Map provides implementation of this interface as well.

This interface provides 2 methods –

a) getKey() – public Object getKey()

b) getValue() – public Object getValue()

HashMap and TreeMap classes provides implementation of Map interface through Map interface class.

class MapDemo

{

    public static void main(String[] args)

    {

        HashMap map = new HashMap();

        map.put("A",1);

        map.put("B",2);

        map.put("C",3);

        map.put("D",4);

        map.put("E",5);

        System.out.println("There are" + map.size() + "elements in map");

        System.out.println("Contents of map are");

        Set s = map.entrySet();

        Iterator itr = s.iterator();

        while(itr.hasNext())

        {
            Map.Entry m = (Map.Entry)itr.next();

            System.out.println(m.getKey() + "\t" + m.getValue());
        }
    }
}

Searching and Removing User Defined objects in Collection

Whenever an element is searched in a list using contains() method an exhaustive search is performed using equals() method i.e success or failure of a search operation entirely depends on the implementation of equals method.

Default implementation of equals() method on Object class compares the value of references not the contents of Object. Any two objects in Java cannot have same reference so equals() method returns false.

In case of String, String class have override the equals() method to match the contents.

In order to successfully search object of a class in a list equals() method must be overridden in the class to facilitate content wise comparison of objects.

public boolean equals(Object o)

{
    Emp e = (Emp)o;

    return this.name.equals(e.name) && this.job.equals(e.job) && this.salary==e.salary;   
}

HashSet

HashTable is implemented as an array of fixed size list which are called buckets.

In order to store an element in hashtable hashing function is used which returns index of a bucket in which element is stored. A simple hashing function can be of the following form –

h(e) = e%n

where e is the element to be stored and n is the number of buckets.

Most of the hashing functions are numeric which implies that only numeric elements can be added to a hash table. But we have to store objects. Every object can be represented as String(toString()) and a number(hashCode()).

In order to store object in hashing based collection objects must be represented by numbers. To facilitate numeric representation of object hashCode() method is provided in object class.

In a hashing based collection when a element is searched using contains() method following steps are performed –

a) HashCode of object is obtained.

b) On the hashCode of object hashing function is applied.

c) Object to be searched is compared with the elements of identified bucket using equals() method i.e. in a hashing based collection a selective search is performed.

Success or failure of a search operation depends on 2 factors –

a) Implementation of hashCode() method which is used to identify the bucket.

b) Implementation of equals() method which is used to compare element to be searched with the elements of the identified bucket.

public int hashCode()

{
    return name.hashCode() + job.hashCode() + salary;
}

If hashCode is same then they will be added in same bucket if contents are different then equals method will catch it.

TreeSet

It is BST based collection.

In order to add objects of a class in a tree based collection class must contain sorting logic.

Sorting logic can be added to a class in following 2 ways –

a) java.lang.Comparable interface

b) java.util.Comparator interface

a) Comparable interface is used to make a class comparable i.e if a class is comparable order of two objects of a class is obtained.

This interface provides a single method compareTo().

public int compareTo(Object o);

public int compareTo(Object o)

{
    Emp e = (Emp)o;

    return this.name.compareTo(e.name); // sort on name basis

    //sort on salary basis

    return this.salary-e.salary
}

Using the comparable interface single sorting order can be defined for the objects of a class.

b) Comparator interface is used to define multiple sorting orders for the objects of a class. It contains 2 methods –

public int compare(Object o, Object p)

public boolean equals(Object o)

Multiple sorting order can defined using Comparator because Comparator interface is implemented in a separate class i.e. for defining multiple sorting orders for the objects of a class multiple implementation of comparator interface can be provided.

Note – Comparable is implicitly used by Tree based collection whereas comparator need to be explicitly specified. If Comparator is given then sorting is based on that if not then compiler will check for comparable.

Ex – TreeSet ts1  = new TreeSet(new JobComparator());

class JobComparator implements Comparator
{
    public int compare(Object o, Object p)
    {

        Emp e = (Emp)o;

        Emp f = (Emp)p;

        return e.job.compareTo(f.job);
    }
}

class SalaryComparator implements Comparator

{

    public int compare(Object o, Object p)

    {

        Emp e = (Emp)o;

        Emp f = (Emp)p;

        return e.salary-f.salary;

    }

}

Complexities –

1) ArrayList – O(n)

2) HashSet, HashMap – O(1) atmost k where k is the size of buckets.

3) TreeSet, TreeMap – O(lgn) equals to height of BST.

Load Factor

It is a constant whose value is between 0 to 1. Value of load factor determines when the size of buckets is to be modified. In a hash table performance of insertion/search operation and space utilization are two inversely proportional objectives i.e if space utilization is maximized performance is decreased because optimization of space utilization results in hash collision which degrades performance. Optimization of performance results in under utilization of space.

By default value of load factor is 0.75 which represents that size of buckets is increased when 75% buckets are filled.

Vector

It is a legacy collection that represents dynamic array. After the introduction of collection framework vector is modified to make it compatible to the collection framework.

Difference between Vector and  ArrayList

1) Major difference between vector and array list is of thread safety. Vector is thread safe whereas ArrayList is not. Thread safety degrades performance.

2) Vector contains methods of collection and list interface as well as its own legacy methods.

3) Apart from iterator a vector provides enumeration for traversing its elements.

Enumeration is a legacy interface that provides methods which are used for traversing the elements of legacy collection. It provides following methods –

a) hasMoreElements() – returns true if there is an element to be traversed

public boolean hasMoreElements()

b) nextElement() – returns the reference of next element.

public Object nextElement();

ensureCapacity() – used to increase the capacity of Vector.

public void ensureCapacity(int requiredCapacity);

Major difference between enumerator and iterator is in their implementation. An iterator is designed to fail fast i.e an iterator fails as soon as the modification to the underlying collection is detected by throwing ConcurrentModification Exception.

The reason behind such a implementation is that iterator is mostly used for traversing the contents of non-thread safe collection in which consistency of modification operation is not guaranteed.

Enumeration is designed synchronized. Enumeration is used only with thread safe collection. In thread safe collection there is guarantee of consistency of data in case of concurrent modification hence enumeration never fails.

About neer1304

An INTJ Curious Geek
This entry was posted in Framework, Java and tagged . Bookmark the permalink.

Leave a comment