Friday, November 29, 2013

Fail-Safe vs Fail-Fast Iterator in Java Collections

Fail-Safe Iterator (java.util.concurrent - ConcurrentSkipListSet, CopyOnWriteArrayList, ConcurrentMap)

A fail-safe iterator is "weakly consistent" and does not throw any exception if the collection is modified structurally during iteration. Such an iterator may work on a clone of the collection instead of the original, as in CopyOnWriteArrayList, while ConcurrentHashMap's iterator returns the state of the hashtable at some point at or since the creation of the iterator. Most collections under java.util.concurrent offer fail-safe iterators, and that is by design. Fail-safe collections should be preferred when writing multi-threaded applications to avoid concurrency-related issues.
A fail-safe iterator is guaranteed to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.

Fail-Fast Iterator (java.util package - HashMap, HashSet, TreeSet, etc)
 
A fail-fast iterator fails as soon as it realizes that the structure of the underlying collection has been modified since iteration began. Structural changes mean adding or removing any element from the collection; merely updating the value of an existing element does not count as a structural modification. It is implemented by keeping a modification count: if the iterating thread detects a change in the modification count, it throws ConcurrentModificationException.

Most collections in the java.util package are fail-fast by design; they are not meant for concurrent structural modification during iteration. The Collection.toString() method iterates over the elements of the collection, so it may throw this exception if a parallel thread modifies the underlying collection at the same time. This can happen even while logging a collection via a logger or System.out.println().
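The contrast can be demonstrated with a short sketch (the class and method names below are made up for illustration): iterating an ArrayList while removing from it triggers the fail-fast behaviour, while a CopyOnWriteArrayList iterates over a snapshot and completes normally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class IteratorBehaviour {
    // Returns true if removing during iteration threw ConcurrentModificationException
    public static boolean throwsOnModification(List<Integer> list) {
        try {
            for (Integer i : list) {
                if (i == 2) {
                    list.remove(i); // structural modification during iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> failFast = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> failSafe = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println("ArrayList threw: " + throwsOnModification(failFast));            // true
        System.out.println("CopyOnWriteArrayList threw: " + throwsOnModification(failSafe)); // false
    }
}
```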

Thursday, September 26, 2013

When/How should we create a custom checked and unchecked exception in Java ?

Checked exceptions represent exceptional scenarios which, if they occur, must be dealt with in some way; examples are IOException and FileNotFoundException. We need to declare these exceptions in the signatures of the methods dealing with such scenarios. A custom checked exception can be created by extending java.lang.Exception.

Unchecked/runtime exceptions represent an error in our program's logic which cannot reasonably be recovered from at run time, for example NullPointerException or ArrayIndexOutOfBoundsException. We do not need to declare or catch such exceptions in the method signature because no caller is expected to handle them. A custom unchecked exception can be created by extending RuntimeException.

Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch. A custom error can be created by extending java.lang.Error.
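As a sketch (the class and field names here are invented for illustration), a custom checked exception extends Exception and must appear in the throws clause of any method that can raise it:

```java
// Hypothetical domain exception: callers are forced to handle the shortfall.
public class InsufficientFundsException extends Exception {
    private final long shortfall;

    public InsufficientFundsException(long shortfall) {
        super("Withdrawal exceeds balance by " + shortfall);
        this.shortfall = shortfall;
    }

    public long getShortfall() {
        return shortfall;
    }
}

class Account {
    private long balance = 100;

    // A checked exception must be declared in the method signature
    public void withdraw(long amount) throws InsufficientFundsException {
        if (amount > balance) {
            throw new InsufficientFundsException(amount - balance);
        }
        balance -= amount;
    }
}
```

A custom unchecked exception would look identical except that it extends RuntimeException and needs no throws clause.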

Monday, August 5, 2013

How will you convert DateTime from one TimeZone to another in Java ?

The java.util.Date class is not TimeZone aware; it does not store any time-zone-specific information in its object. This is clearly mentioned in the Java docs for the Date class -

In Date, A milliseconds value represents the number of milliseconds that have passed since January 1, 1970 00:00:00.000 GMT.

The internal representation of the time inside a Date object remains the same for a given instant. When we print the date object using System.out.println(date), the date.toString() method is invoked, which renders the date in the local TimeZone of the JVM.

Custom TimeZone formatting can be achieved using SimpleDateFormat class.


Calendar instance = Calendar.getInstance();
Date date = instance.getTime();
DateFormat formatter = new SimpleDateFormat("MM/dd/yyyy hh:mm:ss Z");
formatter.setTimeZone(TimeZone.getTimeZone("Europe/London"));
System.out.println(formatter.format(date));
formatter.setTimeZone(TimeZone.getTimeZone("Asia/Calcutta"));
System.out.println(formatter.format(date));

Thus a given time in milliseconds can be represented in different TimeZone using different TimeZone specific Date formatters.

Notes


Prefer the Calendar API over Date: Calendar handles TimeZone information and correctly measures the duration of a year in milliseconds, taking leap years into account.

Question : What will be output of the following Java Program ?

Calendar instance = Calendar.getInstance(TimeZone.getTimeZone("Asia/Calcutta"));
Date date = instance.getTime();
System.out.println("date = " + date);
instance.setTimeZone(TimeZone.getTimeZone("GMT"));
Date date2 = instance.getTime();
System.out.println("date2 = " + date2);

Answer
Both System.out calls will print the same date value, because a Date object is always printed in the local TimeZone and changing the TimeZone on the Calendar does not alter the underlying milliseconds value since the epoch (January 1, 1970 00:00:00.000 GMT).

Sunday, June 16, 2013

How would you round a double value to certain decimal Precision and Scale ?

Firstly let us understand the difference between precision and scale.
If the number is 9232.129394, then

Precision represents the total number of significant digits in the number, i.e. 10 in the above case (9232129394)
Scale represents the number of digits to the right of the decimal point, i.e. 6 in the above case (129394)

Some other examples are,
Precision 4, scale 2: 99.99
Precision 10, scale 0: 9999999999
Precision 8, scale 3: 99999.999
Precision 5, scale -3: 99999000
No one wants to lose precision, as that changes the value by a large amount. If you do want to reduce the precision, you can simply divide the number by 10 raised to the number of digits to drop.
There are multiple ways in Java to round a double value to a certain scale, as shown in the example below

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.text.DecimalFormat;
public class RoundDouble {
    public double round1(double input, int scale) {
        // BigDecimal gives explicit control over the rounding mode
        BigDecimal bigDecimal = new BigDecimal(input).setScale(scale, RoundingMode.HALF_EVEN);
        return bigDecimal.doubleValue();
    }
    public double round2(double input) {
        // quick arithmetic rounding, hard-coded to 2 decimal places
        return Math.round(input * 100) / 100.0d;
    }
    public double round3(double input) {
        // DecimalFormat rounds while formatting; "#.00" keeps two decimal places
        DecimalFormat df = new DecimalFormat("#.00");
        return Double.parseDouble(df.format(input));
    }
    public static void main(String[] args) {
        RoundDouble rd = new RoundDouble();
        System.out.println(rd.round1(9232.129394d, 2));
        System.out.println(rd.round2(9232.129394d));
        System.out.println(rd.round3(9232.129394d));
    }
}

The first method of rounding using BigDecimal should be preferred in most scenarios.

Sunday, June 9, 2013

Given a collection of 1 million integers, All ranging between 1 to 9, how would you sort them in Big O(n) time ?

This is a typical integer sorting problem with the constraint that the range of values is very limited despite 1 million total entries. Integer sorting over a limited range is achieved efficiently with counting sort, often described as a special case of bucket sort.


TIP- What does Wiki Says about Sorting ?

Bucket sort, counting sort, radix sort, and van Emde Boas tree sorting all work best when the key size is small; for large enough keys, they become slower than comparison sorting
algorithms…

Integer Sorting Techniques : http://en.wikipedia.org/wiki/Integer_sorting#Algorithms_for_few_items
Sorting Algorithms : http://en.wikipedia.org/wiki/Sorting_algorithm


Algorithm

Create an array of size 9 (one slot per possible value) and at each index store the occurrence count of the respective integer. This achieves the sort with O(n) time complexity while keeping the memory requirements minimal. To print the output, just traverse the count array.

public class BucketSort {
    public int[] sort(int[] array, int min, int max) {
        int range = max - min + 1;
        int[] counts = new int[range];
        for (int i : array) {
            counts[i - min]++; // offset by min so any [min, max] range works
        }
        return counts;
    }
    }
}

public class BucketSortTest {
    @Test
    public void testBucketSortFor1To9() {
        int[] array = {
            2, 1, 5, 1, 2, 3, 4, 3, 5, 6, 7, 8, 5, 6, 7, 0
        };
        int[] sort = new BucketSort().sort(array, 0, 8);

        for (int i = 0; i < sort.length; i++) {
            for (int j = 0; j < sort[i]; j++) {
                System.out.println(i);
            }
        }
    }
}

Program output : 0,1,1,2,2,3,3,4,5,5,5,6,6,7,7,8

Tuesday, June 4, 2013

Removing elements while Iterating over a Collection ?

The intent here is to check whether you are aware of the proper technique for modifying a collection's structure while iterating over it. If we call collection.remove() from within a for-each loop, a ConcurrentModificationException will be thrown at runtime by the collection's iterator.


Let's look at the following code snippet
/***Failing Program, Never call Collection.remove(Object) while iterating***/

import java.util.ArrayList;
import java.util.List;

import static java.util.Arrays.asList;

public class Test {
    public void removeFromCollection(List<Integer> marks) {
        for (Integer mark : marks) {
            if (mark < 40)
                marks.remove(mark); // throws java.util.ConcurrentModificationException
        }
    }

    public static void main(String[] args) {
        Test test = new Test();
        test.removeFromCollection(new ArrayList<>(asList(10, 20, 50, 60)));
    }
}

Actually, the right way to handle such a scenario is to use an Iterator to remove elements from the underlying collection while iterating over it. The ConcurrentModificationException is thrown because the for-each loop internally creates a fail-fast iterator, which throws the exception whenever it detects a structural modification in the underlying data structure (an ArrayList in this case).

The correct implementation of the removal method would look something like,
public void removeFromCollection(List<Integer> marks) {
    for (Iterator<Integer> iterator = marks.iterator(); iterator.hasNext(); ) {
        Integer mark = iterator.next();
        if (mark < 40)
            iterator.remove();
    }
}
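On Java 8 and later, the same removal can be expressed with Collection.removeIf, which performs the iterator-based removal internally (the class name below is illustrative):

```java
import java.util.ArrayList;
import java.util.List;

import static java.util.Arrays.asList;

public class RemoveIfExample {
    // removeIf handles the safe, iterator-based removal for us
    public static List<Integer> removeFailingMarks(List<Integer> marks) {
        marks.removeIf(mark -> mark < 40);
        return marks;
    }

    public static void main(String[] args) {
        System.out.println(removeFailingMarks(new ArrayList<>(asList(10, 20, 50, 60)))); // [50, 60]
    }
}
```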

Saturday, June 1, 2013

What do you understand by Token Bucket Algorithm. What are its applications ?

Token Bucket Algorithm


The token bucket algorithm is used to impose upper limits on the bandwidth and burstiness of data transmission in a software application. It is based on an analogy of a fixed-capacity bucket into which tokens, normally representing a unit of bytes or a single packet of predetermined size, are added at a fixed rate.

Applications

1.) To enforce download bandwidth limits in software applications like torrent clients and download managers.

2.) Used by cellular providers to control download speed on 3G networks.


Implementation

Let's try to create an implementation of this algorithm. We will choose a leaky-bucket-style implementation, where a fixed number of tokens is added to the bucket after a predefined interval. If no one uses those tokens, they do not accumulate indefinitely; they simply overflow once the capacity of the bucket is reached. Let's name this strategy FixedIntervalRefillStrategy.


Our TokenBucket Class will have following properties


1.) Refill strategy

2.) Maximum capacity of tokens - the maximum number of tokens a client can ask for; otherwise an exception is thrown.

3.) Size - the current number of tokens in the bucket, which keeps changing as the bucket is refilled at specific intervals and drained by clients.



TokenBucket's consume() method accepts the number of tokens to consume. It removes that many tokens from the bucket, refilling the bucket if required. It uses the CAS (compare-and-set) operation of AtomicLong to make the size update atomic, so no locking is required; this keeps the class thread-safe when multiple threads demand tokens simultaneously.



import java.util.concurrent.atomic.AtomicLong;

public class TokenBucket {
    private final RefillStrategy refillStrategy;
    private final long capacity;
    private AtomicLong size;

    public TokenBucket(long capacity, RefillStrategy refillStrategy) {
        this.refillStrategy = refillStrategy;
        this.capacity = capacity;
        this.size = new AtomicLong(0L);
    }

    public void consume(long numTokens) throws InterruptedException {
        if (numTokens < 0)
            throw new RuntimeException("Number of tokens to consume must not be negative");
        if (numTokens >= capacity)
            throw new RuntimeException("Number of tokens to consume must be less than the capacity of the bucket");

        long newTokens = Math.max(0, refillStrategy.refill());
        while (!Thread.currentThread().isInterrupted()) {
            long existingSize = size.get();
            long newValue = Math.max(0, Math.min(existingSize + newTokens, capacity));
            if (numTokens <= newValue) {
                newValue -= numTokens;
                if (size.compareAndSet(existingSize, newValue))
                    break;
            } else {
                Thread.sleep(refillStrategy.getIntervalInMillis());
                newTokens = Math.max(0, refillStrategy.refill());
            }
        }
    }


    @Override
    public String toString() {
        return "Capacity : " + capacity + ", Size : " + size;
    }
}


    // declared as a nested interface inside TokenBucket (referenced as TokenBucket.RefillStrategy)
    public static interface RefillStrategy {
        long refill();
        long getIntervalInMillis();
    }


public final class TokenBuckets {

    private TokenBuckets() {}

    public static TokenBucket newFixedIntervalRefill(long capacityTokens, long refillTokens, long period, TimeUnit unit)
    {
        TokenBucket.RefillStrategy strategy = new FixedIntervalRefillStrategy(refillTokens, period, unit);
        return new TokenBucket(capacityTokens, strategy);
    }

}

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class FixedIntervalRefillStrategy implements TokenBucket.RefillStrategy {
    private final long numTokens;
    private final long intervalInMillis;
    private AtomicLong nextRefillTime;

   public FixedIntervalRefillStrategy(long numTokens, long interval, TimeUnit unit) {
        this.numTokens = numTokens;
        this.intervalInMillis = unit.toMillis(interval);
        this.nextRefillTime = new AtomicLong(-1L);
    }

    public long refill() {
        final long now = System.currentTimeMillis();
        final long refillTime = nextRefillTime.get();
        if (now < refillTime) {
            return 0;
        }

        return nextRefillTime.compareAndSet(refillTime, now + intervalInMillis) ? numTokens : 0;
    }

    public long getIntervalInMillis() {
        return intervalInMillis;
    }

}

API Client Usage

TokenBucket bucket = TokenBuckets.newFixedIntervalRefill(1024 * 10, speedLimitKBps, 1, TimeUnit.SECONDS); 


 
References
http://en.wikipedia.org/wiki/Token_bucket

Tuesday, May 28, 2013

Download a file from http URL using Java FileChannel - an efficient NIO implementation using Java 7

Java's Channel should generally be preferred for IO-related work because a Channel can utilize OS-specific optimizations when dealing with files. An InputStream can easily be converted to a ReadableByteChannel using the Channels.newChannel() static factory method.

package org.shunya.power.interview.design14;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;

public class RealHttpDownloader implements HttpDownloader {
    @Override
    public File download(URI uri, String fileName) throws IOException {
        Path path = Paths.get(fileName);
        long totalBytesRead = 0L;
        HttpURLConnection con = (HttpURLConnection) uri.resolve(fileName).toURL().openConnection();
        con.setReadTimeout(10000);
        con.setConnectTimeout(10000);
        try (ReadableByteChannel rbc = Channels.newChannel(con.getInputStream());
             FileChannel fileChannel = FileChannel.open(path, EnumSet.of(StandardOpenOption.CREATE, StandardOpenOption.WRITE))) {
            // transfer at most 4 MB; try-with-resources closes both channels automatically
            totalBytesRead = fileChannel.transferFrom(rbc, 0, 1 << 22);
            System.out.println("totalBytesRead = " + totalBytesRead);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        return path.toFile();
    }
}


package org.shunya.power.interview.design14;

import java.io.File;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;

public interface HttpDownloader {
    public File download(URI uri, String fileName) throws IOException;
}


FileChannel utilizes OS-specific optimizations and hence should generally provide better performance than buffered streams.

Efficient way to calculate Checksum of a file in using Java FileChannel

At times we wish for the speed of C with the syntax of Java when doing IO-intensive work. Calculating a CRC is one such task that requires an efficient implementation in order to give good performance.

public static long calculateCRC(File filename) {
     final int SIZE = 16 * 1024;
     try (FileInputStream in = new FileInputStream(filename)) {
         FileChannel channel = in.getChannel();
         CRC32 crc = new CRC32();
         int length = (int) channel.size();
         MappedByteBuffer mb = channel.map(FileChannel.MapMode.READ_ONLY, 0, length);
         byte[] bytes = new byte[SIZE];
         int nGet;
         while (mb.hasRemaining()) {
             nGet = Math.min(mb.remaining(), SIZE);
             mb.get(bytes, 0, nGet);
             crc.update(bytes, 0, nGet);
         }
         return crc.getValue();
     } catch (FileNotFoundException e) {
         e.printStackTrace();
     } catch (IOException e) {
         e.printStackTrace();
     }
     throw new RuntimeException("unknown IO error occurred ");
 }

Java's FileChannel provides much better performance than the BufferedInputStream and RandomAccessFile classes.

Monday, May 20, 2013

What is Immutable Class. Why would you choose it ? How would you make a class immutable ?


What is Immutable Object

When the state of an object cannot be changed after its construction, the object is called immutable.


Why do we need it

Immutable objects are inherently thread-safe and thus help in writing multi-threaded code without much worry; whenever immutability comes up, a multi-threaded context is usually implied. Immutable classes are easy to reason about, as they possess a single state, which is established by their constructor. Immutable objects are good candidates for hash keys because their hashcode can be cached and reused for better performance.


Which Objects should be Immutable

Immutable classes are ideal for representing values of abstract data types (ADTs).
Joshua Bloch suggests that
"All classes should be designed to be immutable unless there is a specific reason not to do so"


Guidelines for Making a class Immutable

1. All fields should be declared final.
2. The class itself is declared final so that derived classes cannot make it mutable.
3. The this reference should not be allowed to escape during object construction, for example into anonymous inner classes (such as when adding an action listener).
4. Any field that contains a reference to a mutable object (such as an array, collection, StringBuffer, etc.):
i. Is private
ii. Is never returned or otherwise exposed to the caller
iii. Is the only reference to the object it refers to
iv. Does not change the state of the referenced object after construction
v. If mutable fields must be returned to the caller, a defensive copy (or unmodifiable view) should be returned so that changes do not reflect in the inner data structure.

public List<Integer> getList() {
    return Collections.unmodifiableList(list); // unmodifiable view of the mutable field - callers cannot change it
}
vi. If a mutable object is passed in the constructor (like an array), the immutable class should first make a defensive copy of it before storing the reference.
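Putting the guidelines together, a minimal sketch of an immutable class might look like this (the Student class is made up for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class Student {                        // rule 2: final class
    private final String name;                      // rule 1: final fields
    private final List<Integer> marks;              // rule 4: mutable field kept private

    public Student(String name, List<Integer> marks) {
        this.name = name;
        this.marks = new ArrayList<>(marks);        // rule vi: defensive copy on the way in
    }

    public String getName() {
        return name;
    }

    public List<Integer> getMarks() {
        return Collections.unmodifiableList(marks); // rule v: callers cannot mutate our state
    }
}
```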

What is Double Checked Locking Problem in Multi-Threading ?



Double-Checked Locking Problem
In earlier times (prior to JDK 1.5), a simple uncontended synchronized block was expensive, which led many people to write double-checked locking for lazy initialization code. The double-checked locking idiom tries to improve performance by avoiding synchronization over the common code path after the helper is allocated, but DCL never worked reliably because of the limitations of the previous JMM.

This is now fixed by new JMM (JDK 1.5 onwards) using volatile keyword.
public class Singleton {
    private Singleton() {}
    private static Singleton instance_ = null; // global static variable that holds the state
    public static Singleton instance() {
        if (instance_ == null) { // unsynchronized read - may observe a partially constructed object due to instruction reordering by the compiler or the cache
            synchronized (Singleton.class) {
                if (instance_ == null)
                    instance_ = new Singleton();
            }
        }
        return instance_;
    }
}
The old JMM does not guarantee the expected execution of this static singleton.


Why is the above idiom broken under the old JMM?
DCL relies on the unsynchronized read of the instance_ field. This appears harmless, but it is not. Suppose thread A is inside the synchronized block creating the new Singleton instance and assigning it to instance_, while thread B is just entering the instance() method. Consider the effect on memory of this initialization: memory for the new Singleton object is allocated; the constructor for Singleton is called, initializing the member fields of the new object; and instance_ is assigned a reference to the newly created object. There are two problem scenarios:


• Suppose thread A has completed initialization of instance_ and exits the synchronized block as thread B enters instance(). By this time instance_ is fully initialized and thread A has flushed its local memory to main memory (a write barrier); any objects referenced by the singleton's member fields are flushed out as well. Thread B may see a valid reference to the newly created instance_, but because it did not perform a read barrier, it could still see stale values of instance_'s member fields.

• Since thread B is not executing inside a synchronized block, it may see these memory operations in a different order than the one thread A executed them in. It could be the case that B sees the events in the following order (and the compiler is also free to reorder the instructions like this): allocate memory, assign the reference to instance_, call the constructor. If thread B comes along after the reference has been assigned but before the constructor has run, it sees that instance_ is not null, skips the synchronized block, and returns a reference to a partially constructed Singleton! Needless to say, the result is neither expected nor desired.


Fixed double-checked Locking using volatile in new JMM (multi-threaded singleton pattern JDK 1.5)

The following code makes the helper volatile so as to stop the instruction reordering. This code will work with JDK 1.5 onwards only.
class Foo {
    private volatile Helper helper = null;
    public Helper getHelper() {
        if (helper == null) {

            synchronized(this) {
                if (helper == null)
                    helper = new Helper();
            }
        }
        return helper;
    }

}
If Helper is an immutable object, such that all of the fields of Helper are final, then double-checked locking will work without having to use volatile fields. The idea is that a reference to an immutable object (such as a String or an Integer) should behave in much the same way as an int or float; reading and writing references to immutable objects are atomic.


Alternatives to DCL
Nowadays the JVM is much smarter and the relative expense of a synchronized block over a volatile read is very low, so it rarely makes sense to use DCL for performance reasons. The easiest way to avoid DCL is to avoid it: we can make the whole method synchronized instead of synchronizing just the code block.
Another option is to use eager initialization instead of lazy initialization by assigning the instance at class-load time. Here is an example demonstrating eager initialization


class MySingleton {
    public static Resource resource = new Resource();
}


Using Initialization On Demand Holder idiom

Inner classes are not loaded until they are referenced. This fact can be used to utilize inner classes for lazy initialization as shown below

public class Something {
    private Something() {}
    private static class LazyHolder {
        private static final Something INSTANCE = new Something();
    }
    public static Something getInstance() {
        return LazyHolder.INSTANCE;
    }
}

And finally using Enum for Thread-Safe Singleton

public enum Singleton{
    INSTANCE;
}

Friday, May 17, 2013

There is a stream of words which contains Anagrams. How would you print anagrams in a single bucket from that stream ?

Sort each word's characters and then check whether the sorted forms are equal: abba, baab and abab should all go to the same bucket.
Simple method to check if two Strings are anagrams

public boolean isAnagram(String s1, String s2) {
    char[] a1 = s1.toCharArray();
    char[] a2 = s2.toCharArray();
    Arrays.sort(a1);
    Arrays.sort(a2);
    return Arrays.equals(a1, a2);
}
Algorithm
1) Use a HashMap with a String key and List<String> value, where the list contains all anagrams of the key string.
2) For each word in the input, create the key by sorting the word's characters and add the word to the list stored under that key, for example [aakk -> akka, akak]. If the key does not exist yet, create a new list for it in the map.

3) To get all anagrams of a word, print the list whose key is that word's sorted form.
Source Code

import java.util.*;

public class Anagrams {
    private static Map<String,List<String>> anagramsMap = new HashMap<>(100);
    public static void main(String[] args) {

        String[] input = {
            "akka", "akak", "baab", "baba", "bbaa"
        };

        for (String s: input) {
            char[] word = s.toCharArray();
            Arrays.sort(word);
            String key = String.valueOf(word);
            if (!anagramsMap.containsKey(key)) {
                anagramsMap.put(key, new ArrayList<String>());
            }

            anagramsMap.get(key).add(s);
        }

        System.out.println("anagramsMap = " + anagramsMap);
    }
}
Time Complexity
If we ignore the time consumed by sorting each individual string, the above approach takes O(n) time for n words. Counting the sorts, for words of maximum length k the total is O(n · k log k) for sorting plus O(n) for the map operations.

What is difference between Callable and Runnable Interface ?

As per Java documentation 

"Callable interface is similar to Runnable, in that both are designed for classes whose instances are potentially executed by another thread. A Runnable, however, does not return a result and cannot throw a checked exception."


public interface Callable<V> {
    V call() throws Exception;
}
In order to convert a Runnable to a Callable, use the following utility method provided by the Executors class

Callable<Object> callable = Executors.callable(task);

A Callable, however, must be executed using an ExecutorService instead of a Thread, as shown below.

result = exec.submit(aCallable).get();

Submitting a Callable to an ExecutorService returns a Future object, which represents the lifecycle of a task and provides methods to check whether the task has completed or been cancelled, to retrieve the result, and to cancel the task.

Here is the source for Future Interface


public interface Future<V> {
    boolean cancel(boolean mayInterruptIfRunning);
    boolean isCancelled();
    boolean isDone();
    V get() throws InterruptedException, ExecutionException;
    V get(long timeout, TimeUnit unit)
        throws InterruptedException, ExecutionException, TimeoutException;
}
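A minimal end-to-end sketch of submitting a Callable and retrieving its result via a Future (the squaring task itself is invented for illustration):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableExample {
    public static int squareOnAnotherThread(final int n) throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = new Callable<Integer>() {
                @Override
                public Integer call() {
                    return n * n; // a Callable returns a result (and may throw)
                }
            };
            Future<Integer> future = exec.submit(task);
            return future.get(); // blocks until the result is available
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(squareOnAnotherThread(7)); // 49
    }
}
```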

Tuesday, May 14, 2013

What do you understand by Big O Notation, Why is it important in software development ?

What do you understand by Big O Notation?

Big O notation is a mechanism used to measure the relative inefficiency of algorithms in terms of space and time. It tells us how the execution time and memory requirements of an algorithm grow as a function of increasing input size. In this notation, O stands for the order of magnitude.

Constant O(1)
A program whose running time’s order of growth is constant, executes a fixed number of operations to finish the job, thus its running time does not depend on N.


Linear O(N)
A program that spends a constant amount of time processing each piece of input data; its running time is thus proportional to N.


Following are the examples of Big O, in increasing order of their magnitude.
 
# Big O Notation - Name - Example
1. O(1) - Constant-time - Looking up a value in a HashMap; checking whether a number is even or odd
2. O(log n) - Logarithmic - Finding an item inside a sorted array using binary search
3. O(n) - Linear - Printing all elements of an array
4. O(n log n) - Log-linear - Sorting using merge sort
5. O(n^2) - Quadratic - Bubble sort
6. O(2^n) - Exponential - Enumerating all subsets of a set
7. O(n!) - Factorial - Solving the Travelling Salesman Problem by brute force
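As a quick illustration of the logarithmic row, binary search over a sorted array discards half of the remaining range on every probe:

```java
import java.util.Arrays;

public class BinarySearchExample {
    public static void main(String[] args) {
        int[] sorted = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91};
        // O(log n): each comparison halves the remaining search range
        int index = Arrays.binarySearch(sorted, 23);
        System.out.println("index = " + index); // index = 5
    }
}
```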

Importance of Big O

We should always keep time efficiency in mind while designing an algorithm on top of existing data structures; otherwise there can be severe performance penalties for using the wrong data structure in a given scenario.

Base of Logarithm is irrelevant in Big O Notation

The base of the logarithm is irrelevant with respect to the order of growth, since logarithms with different constant bases differ only by a constant factor, so plain log N is used when referring to the order of growth. Note, however, that the base does matter in an exponent: O(2^n) and O(3^n) grow very differently.

Time efficiency in Big O notation for few Java Collections

ArrayList (ignoring the time taken by array resize operation)
O(1) for add, size and get
O(n) for toString() method


PriorityQueue
O(1) for peek, element and size
O(log n) for offer, poll, remove() and add
O(n) for remove(Object) & contains(Object)


HashMap (with no collisions)
O(1) for get operation
O(1) for put operation


LinkedList
O(1) for removal at the head or tail (or via an iterator positioned at the element)
O(1) for add & poll methods
O(n) for toString() and for indexed access via get(i)


References
http://en.wikipedia.org/wiki/Big_O_notation

Monday, May 13, 2013

Is it possible to write a method in Java which swaps two int or Integer ?

The answer is No.

For knowing the exact answer you must be knowing how Parameter Passing works in Java.
In case of a primitive int
Parameters are passed to methods by value in Java. For primitive data types, a copy of the value is passed to the method, so any changes inside the method will not reflect in the calling code.


In case of the Integer wrapper class
For objects, the reference to the object is copied by value into the called method. If we reassign these reference copies, the changes are not reflected back in the code that called swap(x, y).
/** This code will never work as intended  **/
public void swap(Integer x, Integer y) {
    Integer tmp = x;
    x = y;
    y = tmp;
}


The only way this could work would be some kind of setter on the Integer class that modified the underlying value. But Java declares all wrapper classes immutable (partly from a thread-safety perspective), so there is no way to write such a swap for Integers in Java.


TIP - The called method can't change the caller's variable, although for object reference variables, the called method can change the state of the object the variable refers to.
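The point above can be demonstrated from the caller's side with a small sketch (class and variable names are illustrative): the reassignments inside swap affect only the method's local reference copies.

```java
public class SwapDemo {
    static void swap(Integer x, Integer y) {
        // x and y are copies of the caller's references; reassigning them
        // changes only these local copies, never the caller's variables
        Integer tmp = x;
        x = y;
        y = tmp;
    }

    public static void main(String[] args) {
        Integer a = 1, b = 2;
        swap(a, b);
        System.out.println(a + " " + b);  // still "1 2": the swap had no effect
    }
}
```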

What are Inheritance Strategies in JPA ?

JPA defines three inheritance strategies namely, SINGLE_TABLE, TABLE_PER_CLASS and JOINED.

Single table inheritance is the default, and table per class is optional, so not all JPA vendors support it. JPA also defines the mapped superclass concept through the @MappedSuperclass annotation. A mapped superclass is not itself a persistent class, but allows common persistable mappings to be defined for its subclasses.

1. Single Table Inheritance
In this strategy, a single table stores all instances of the entire inheritance hierarchy. The table has a column for every attribute of every class in the hierarchy, and a discriminator column identifies which class a particular row belongs to.

2. Table Per Class Inheritance
A table is defined for each concrete class in the inheritance hierarchy to store all the attributes of that class and all its superclasses.

3. Joined Table
This strategy replicates the object model into the data model: a table is created for each class in the hierarchy to store only the local attributes of that class.

Notes
Question - We want to extract common behavior in a super class in JPA entities but we do not want to have table for that super class. How would you achieve this ?
Answer - If we use a plain class as the superclass then, as per the JPA specification, the fields of that class are not persisted in the database tables. We need to extract the common fields into a superclass and annotate it with @MappedSuperclass in order to persist those fields in the subclass tables. A mapped superclass has no separate table defined for it.


References
http://en.wikibooks.org/wiki/Java_Persistence/Inheritance

What does Collections.unmodifiableCollection() do ? Is it safe to use the collection returned by this method in a multi-threading environment ?

Collections.unmodifiableCollection() returns an unmodifiable dynamic view of the underlying collection. Any attempt, direct or via an iterator, to modify this view throws UnsupportedOperationException, but any changes made to the underlying collection are reflected in the view.
This method is no substitute for other thread-safety techniques, because iterating over the collection using this view may throw ConcurrentModificationException if the original collection is structurally modified during the iteration.



For example, the following code will throw ConcurrentModificationException in the for loop.


import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class UnModifiableCollection {
    private List<String> names = new ArrayList<>();

    public void testConcurrency() {
        names.add("1");
        names.add("2");
        names.add("3");
        names.add("4");
        Collection<String> dynamicView = Collections.unmodifiableCollection(names);
        for (String s : dynamicView) {   // throws ConcurrentModificationException on the 2nd iteration
            System.out.println("s = " + s);
            names.remove(0);             // the culprit line, modifying the underlying collection
        }
    }

    public static void main(String[] args) {
        UnModifiableCollection test = new UnModifiableCollection();
        test.testConcurrency();
    }
}

 

Hence, external synchronization is a must if we are going to modify the underlying collection.
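To illustrate both properties of the view, the dynamic read-through and the write protection, here is a minimal sketch (names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class UnmodifiableViewDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("a");
        Collection<String> view = Collections.unmodifiableCollection(names);

        names.add("b");                  // change the backing list...
        System.out.println(view.size()); // ...and the view reflects it -> 2

        try {
            view.add("c");               // any write through the view fails
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```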

Saturday, May 11, 2013

A Simple Collection of Complex Objects or the Complex Collection of Simple Objects, which one is better ?

This question seems quite abstract at first glance, and it is a little difficult to find the interviewer's intent here.

The interviewer is definitely asking something about data structures and algorithms here.

"A Simple Collection of Complex Objects"
As the phrase states, very complex objects with all kinds of business logic inside them are stored in a simple collection, such as a plain array.

"Complex Collection of Simple Objects"
Objects are simple in this case, but the collection holding them is complex in nature, such as a self-balancing binary tree (for example a red-black tree) holding a collection of employee objects (each of which is just a POJO with minimal or no logic in it).

Complex collections are built to solve problems of performance and scalability. Using a red-black tree we can expect logarithmic time for locating the next smaller sibling, which is not possible using a simple array of complex objects. Thus it is an open question to discuss with the interviewer whether they are looking for performance and scalability or something else.

Sunday, May 5, 2013

Things to Take care while landing on the next dream Job

Some day or other we face a situation where the current relationship with the employer cannot work anymore for various reasons, and then we look for a change. But we must carefully consider the following facts while we are on the move -

  1. Many big IT companies sell their configuration and support work in the name of development by offering huge remuneration benefits. If you are really looking for quality work then prefer a first-hand recommendation for the project and company; that might give better insight into the actual work. If that is not feasible, schedule an extra round with your new employer to learn what your typical day will look like after joining the company.
  2. In today's connected world, most companies work across geographies and the work might require you to stay in the office at odd hours (any time outside 8 AM-5 PM). Working odd hours has long-term negative effects on our health and thus must be carefully evaluated. Ultimately humans are not on earth just for work (Hindi has a better word for this kind of work: Naukri).
  3. As our work experience grows, we expect some kind of growth in our work environment. Some people want to grow on the managerial side, while others prefer a technical or individual role. So we must be firm in our decision while shifting jobs.
  4. A few projects are a real mess, full of chaos and bugs. It is always better to start your new job with work that is not in such a late, messy phase; otherwise we will lose the fun and creativity in the daily work. Reverse engineering is not a good way to learn a project's business, and moreover no one loves to clean up the mess of others.
  5. Good practices and methodologies tend to reduce the stress in a project, thus making our life easier. If a project is too old in technology then there will be loads of boilerplate code, which might be a concern for you. So make sure you inquire enough about your new project.

Saturday, April 6, 2013

How does Session handling works in Servlet environment ?


There are multiple ways for a servlet framework to handle sessions. For example, the following methods can be used:


  1. Storing Cookies on the client side
  2. URL Rewriting
  3. Hidden form fields


Servlets use cookies as the default mechanism for session tracking, but if cookies are disabled on the client, the server can use URL rewriting to achieve the same.

When the server calls request.getSession(true), it generates a JSESSIONID and sends it back to the client for all future session references. The JSESSIONID is then stored by the client and sent back to the server using any of the above-mentioned mechanisms.

To ensure that your servlets support servers that use URL rewriting to track sessions, you must pass all the URLs used in your servlet through the HttpServletResponse.encodeURL() method, like:

out.println("<form action='" + res.encodeURL("/example/htmlpage") + "'>");

This will append the session ID to the form's action.

Discuss internal's of a Concurrent Hashmap (CHM) in Java 7


In Java 1.7, ConcurrentHashMap is a hash map supporting full concurrency of retrievals, via volatile reads of segments and tables without locking, and adjustable expected concurrency for updates. All the operations in this class are thread-safe, although retrieval operations do not depend on a locking mechanism (they are non-blocking). There is no support for locking the entire table in a way that prevents all access. The allowed concurrency among update operations is guided by the optional concurrencyLevel constructor argument (default is 16), which is used as a hint for internal sizing.
Fig 1. Internal implementation diagram for HashMap




ConcurrentHashMap is similar in implementation to HashMap, with a resizable array of hash buckets, each consisting of a list of HashEntry elements. Instead of a single collection-wide lock, ConcurrentHashMap uses a fixed pool of locks that form a partition over the collection of buckets.

Here is the code snippet showing HashEntry class

static final class HashEntry<K,V> {
    final int hash;
    final K key;
    volatile V value;
    volatile HashEntry<K,V> next;
...
 
The HashEntry class takes advantage of final and volatile fields to make changes visible to other threads without acquiring an expensive lock for read operations.
The table inside ConcurrentHashMap is divided among Segments (each of which extends ReentrantLock), and each segment is itself a concurrently readable hash table. Each segment uses a single lock to consistently update its elements, flushing all the changes to main memory.

The put() method holds the segment lock for the duration of its execution, but doesn't block other threads from calling get() on the map. It first searches the appropriate hash chain for the given key and, if found, simply updates the volatile value field. Otherwise it creates a new HashEntry object and inserts it at the head of the list.
The iterator returned by ConcurrentHashMap is fail-safe but weakly consistent. keySet().iterator() returns an iterator over the set of keys, backed by the original map. It is a "weakly consistent" iterator that will never throw ConcurrentModificationException; it is guaranteed to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
Resizing happens dynamically inside the map whenever required, in order to maintain an upper bound on hash collisions. An increase in the number of buckets leads to rehashing of the existing values; this is achieved by acquiring the lock over each segment and rehashing the elements from each bucket into the new, larger hash table.
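The weakly consistent behavior can be sketched as follows; unlike a plain HashMap, a structural modification during iteration does not throw (the class name is illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentIteration {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        // For a HashMap this put() during iteration would throw
        // ConcurrentModificationException; the CHM iterator is weakly
        // consistent and never does (it may or may not visit "c").
        for (String key : map.keySet()) {
            map.put("c", 3);   // structural modification during iteration
            System.out.println("visited " + key);
        }
        System.out.println("size = " + map.size());  // 3
    }
}
```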


Question : Is this possible for 2 threads to update the ConcurrentHashMap at the same moment ?

Answer : Yes, it is possible for two parallel threads to write to the CHM at the same time; in fact, in the default implementation of CHM, at most 16 threads can write in parallel. In the worst case, if the two objects lie in the same segment, parallel writes are not possible. On the other hand, reads are non-blocking, so any number of threads can read the data at the same time, and reads can overlap with writes.

Question : Can multiple threads read from a given Hashtable concurrently ?

Answer : No. The get() method of Hashtable is synchronized (the same holds for a synchronized HashMap), so only one thread can get a value from it at any given point in time. Full concurrency for reads is possible only in ConcurrentHashMap, via the use of volatile.

What are four principles of OOP, How aggregation is different than Composition ?


There are four major principles that make a language object-oriented. These are Encapsulation, Data Abstraction, Polymorphism and Inheritance.

Encapsulation


Encapsulation is the mechanism of hiding a class's data implementation by restricting access to it through public methods.

Abstraction


Abstract means a concept or an idea not associated with any particular instance. Using an abstract class or interface we express the intent of the class rather than the actual implementation. One class should not need to know the inner details of another in order to use it; knowing the interfaces should be good enough.

Inheritance


Inheritance expresses an "is a" relationship between two objects. With proper inheritance, derived classes can reuse the code of their superclasses.

Polymorphism


It means one name, many forms. It is of two types: static and dynamic. Static polymorphism is achieved using method overloading, and dynamic polymorphism using method overriding.

What is aggregation, how is it different from composition ? 


Both of these are special types of association and differ only in the strength of the relationship.
Composition is a stronger "is part of" relationship, compared to aggregation's "has a".
In composition, the member object cannot exist outside the enclosing class, while the same is not true for aggregation.
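A minimal sketch of the difference (all class names are illustrative): in aggregation the Engine is created outside the Car and can outlive it, while in composition the Room is created and owned by the House and has no independent existence.

```java
// Aggregation ("has a"): Engine is created outside Car and can outlive it.
class Engine {
    final String type;
    Engine(String type) { this.type = type; }
}

class Car {
    private final Engine engine;   // Car merely holds a reference it was given
    Car(Engine engine) { this.engine = engine; }
    String engineType() { return engine.type; }
}

// Composition ("is part of"): Room is created and owned by House,
// and only makes sense as part of its enclosing House.
class House {
    private final Room kitchen = new Room("kitchen");
    String kitchenName() { return kitchen.name; }

    static class Room {
        final String name;
        Room(String name) { this.name = name; }
    }
}

public class AggregationVsComposition {
    public static void main(String[] args) {
        Engine v8 = new Engine("V8");
        Car car = new Car(v8);
        System.out.println(car.engineType());  // V8
        car = null;                            // the Car is gone, but v8 lives on
        System.out.println(v8.type);           // V8

        System.out.println(new House().kitchenName());  // kitchen
    }
}
```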

What are the key principles when designing a scalable software



  1. Stateless design using REST can help achieve scalability wherever possible. In such an application, minimal session state needs to be replicated while distributing the application over multiple hosts. Users can save their favorite URLs, and with REST there should be no need for page flow.
  2. Logging can be done asynchronously to save precious time of a method call.
  3. More processes vs. more threads can be configured based on the demands of the target application. It is generally advised to keep a JVM heap of up to 2 GB, because growing the heap beyond 2 GB incurs heavy GC pauses; if we require more processing then we prefer a separate JVM process altogether. Multiple independent tasks should be run in parallel, and tasks can be partitioned to improve performance.
  4. If we improve the concurrency of the software, then we can increase its scalability. This can be achieved by reducing dependency on shared resources. We should try to utilize the latest hardware optimizations through Java as much as possible, for example the atomic utilities provided in the java.util.concurrent.atomic package, or Fork/Join to achieve higher throughput in concurrent applications. We should hold shared locks for as little time as possible.
  5. Resource pooling and caching can be used to improve the processing time. Executing jobs in batches can further improve the performance.
  6. Picking up appropriate algorithm and data structure for a given scenario can help optimize the processing.
  7. If we are using SQL in our application then we should tune the SQL, use batching wherever possible and create indexes on the essential table columns for faster retrieval.
  8. We should tune our JVM for optimum memory settings (heap, PermGen, etc.) and garbage collection settings. For example, if we do a lot of text processing with big temporary objects being created, then we should define a larger young generation so that frequent GC runs do not happen.
  9. Keep up to date with new technologies for performance benefits.


Good resources for a Java Developer


Essential Tool kit for Java Developer

IntelliJ IDE
Java DB
Twitter Bootstrap CSS library
jQuery
Freemarker
Struts 2
Servlets
Tortoise SVN
Apache Web Server
Jetty Server
HTML 5
Firebug extension for Mozilla Firefox
Cygwin for Unix simulation

Books 

Head First Design Patterns
Java Concurrency in Practice by Brian Goetz
Effective Java 2nd Edition by Joshua Bloch
Algorithms 4th edition : http://algs4.cs.princeton.edu/home/
Cracking the coding interview at CareerCup

Technology Forums

http://www.geeksforgeeks.org/fundamentals-of-algorithms/
http://www.careercup.com
http://www.stackoverflow.com

Great Tutorials Articles

Few articles on Java 6 at IBM website
http://www.ibm.com/developerworks/views/java/libraryview.jsp?search_by=5+things+you+did
Concurrency In Practice by Brian Goetz - http://www.briangoetz.com/pubs.html
Java Articles  : http://www.oracle.com/technetwork/articles/java/index.html
Java SE Tutorial : http://docs.oracle.com/javase/tutorial/index.html
Java 7 docs: http://docs.oracle.com/javase/7/docs/

Video Tutorials

http://www.youtube.com/user/nptelhrd

What is AtomicInteger class and how it's functioning is different from using a volatile or synchronized?

AtomicInteger uses a combination of volatile and CAS (compare-and-swap) to achieve thread safety for an integer counter. It is non-blocking in nature and thus highly useful in writing high-throughput concurrent data structures that perform well under low to moderate thread contention.

Compare-And-Swap

In computer science, compare-and-swap (CAS) is an atomic instruction used in multi-threading to achieve synchronization. It compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value. This is done as a single atomic operation; the atomicity guarantees that the new value is calculated based on up-to-date information.
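AtomicInteger exposes this instruction directly through compareAndSet(expected, newValue), which is an easy way to see the semantics described above (the class name below is illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(5);

        // succeeds: the current value matches the expected value 5
        System.out.println(value.compareAndSet(5, 6));  // true
        // fails: the value is now 6, so the expected value 5 is stale
        System.out.println(value.compareAndSet(5, 7));  // false
        System.out.println(value.get());                // 6
    }
}
```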

Thread Contention

Essentially, thread contention is a condition where one thread is waiting for a lock/object that is currently held by another thread. The waiting thread cannot use that object until the other thread has unlocked it.

Reads and writes of volatile variables have the same memory semantics as acquiring and releasing a monitor using a synchronized block. So the visibility of a volatile field is guaranteed by the JMM (Java Memory Model).

The AtomicInteger class stores its value in a volatile field, so it is effectively a decorator over a traditional volatile variable, but it additionally provides a unique non-blocking mechanism for updating the value using hardware-level support for CAS (compare-and-set/swap). Under low to moderate thread contention, atomic updates provide higher throughput than a synchronized, blocking increment operation.
Here is the implementation for getAndIncrement() method of AtomicInteger Class (as of Java 7).

public final int getAndIncrement() {
    for (;;) {
        int current = get();
        int next = current + 1;
        if (compareAndSet(current, next))
            return current;
    }
}

You can see that no lock is acquired to increment the value; rather, CAS is retried inside an infinite loop until the new value is installed. That's why it can be used to write scalable applications where thread contention is low to moderate.
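A small sketch of this in practice (thread and iteration counts are arbitrary): several threads increment one AtomicInteger without any lock, and no update is ever lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    counter.getAndIncrement();   // lock-free CAS retry loop
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // always 4000; a plain int++ under the same contention could lose updates
        System.out.println(counter.get());
    }
}
```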

Discuss the effects of a volatile keyword in Java


Volatile is a lighter-weight synchronization mechanism in which memory visibility of the protected state is guaranteed to all threads that subsequently read it.

A write to a volatile variable flushes not only the volatile variable itself but also all non-volatile writes made before the volatile write. Thus a simple volatile flag can provide the memory-visibility guarantee for all the other variables changed before it. The following figure explains it in its entirety.

Non-atomic treatment of long and double


For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.

Writes and reads of volatile long and double values are always atomic.

Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.

It is safe to perform read-modify-write operations on a shared volatile variable only if you ensure that the variable is written from a single thread. In general, volatile variables are liable to race conditions, because volatile guarantees visibility but not the atomicity needed to make compound operations such as count++ safe.
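The visibility guarantee, and the safe single-writer pattern described above, can be sketched with the classic volatile shutdown flag (the class name is illustrative); the flag is written by only one thread, and volatile makes that write visible to the reader:

```java
public class VolatileFlagDemo {
    // volatile guarantees the writer thread's update becomes visible to the
    // reader; without it, the reader could spin forever on a stale cached value
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait until another thread clears the flag
            }
            System.out.println("worker observed running=false and exited");
        });
        worker.start();

        Thread.sleep(100);
        running = false;        // single writer; visible because the flag is volatile
        worker.join(2000);
        System.out.println("worker alive? " + worker.isAlive());  // false
    }
}
```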