Java Multithreading Steeplechase: Cancelling Tasks In Executors

In the previous post on stopping threads, we explored thread design strategies to safely stop threads in Java. In this post, let’s look at various ways we can stop or cancel tasks handled by Executors (and ExecutorServices).

Standard Cancellation:

When we submit a task (Callable or its older cousin Runnable) for execution to an ExecutorService (e.g. ThreadPoolExecutor), we get a Future back which wraps the task. It has a method called cancel(…) which can be called to cancel the task the Future wraps. Calling this method has different effects depending on the stage the task is in. A task could be in one of three possible stages after being submitted:

  1. The task hasn’t started executing yet – it is waiting in the work queue for a thread to start executing it.
  2. A thread is executing the task.
  3. The task has finished executing.

Cancellation is trivial if the task hasn’t started executing: it is marked as cancelled and will never run. Similarly, if the task has finished executing, cancelling it has no effect.

It is a little tricky when the task is executing in a thread. Recall from my previous post: to stop threads in Java, we rely on a co-operative mechanism called Interruption. The idea is very simple. To stop a thread, all we can do is deliver it a signal, aka interrupt it, requesting that the thread stop itself at the next available opportunity. If the thread cooperates, it will clean up after itself and stop. Non-cooperating threads ignore the request and cancellation will have no effect.

From Javadocs:

boolean cancel(boolean mayInterruptIfRunning)
Attempts to cancel execution of this task. This attempt will fail if the task has already completed, has already been cancelled, or could not be cancelled for some other reason. If successful, and this task has not started when cancel is called, this task should never run. If the task has already started, then the mayInterruptIfRunning parameter determines whether the thread executing this task should be interrupted in an attempt to stop the task.

So when the task is already executing and we call cancel(true) on it, an interrupt is delivered to the thread executing the task. For this to work properly, your tasks must be designed to handle interruption. Refer to the previous post on stopping threads for more info.
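Here is a minimal sketch of cancelling a task that is already running. The class and method names (CancelRunningTaskDemo, cancelRunningTask) are made up for illustration; the task cooperates by blocking in Thread.sleep(), which responds to interruption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class CancelRunningTaskDemo {

    /** Returns true if the running task received the interrupt and stopped. */
    static boolean cancelRunningTask() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(1);

        Future<?> future = executor.submit(() -> {
            started.countDown();
            try {
                Thread.sleep(Long.MAX_VALUE); // interruptible blocking call
            } catch (InterruptedException e) {
                stopped.countDown();          // cooperative exit
            }
        });

        try {
            started.await();                          // the task is now executing
            boolean cancelled = future.cancel(true);  // delivers the interrupt
            boolean exited = stopped.await(5, TimeUnit.SECONDS);
            return cancelled && future.isCancelled() && exited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("task stopped: " + cancelRunningTask());
    }
}
```

Had the task ignored the interrupt, cancel(true) would still return true and the Future would report itself as cancelled, but the thread would keep running.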

Non-Standard Cancellation:

Sometimes it becomes necessary to support non-standard task cancellation, especially when the task relies on blocking or long-running methods that are oblivious to interruption. E.g. when you call ServerSocket.accept(), it starts waiting for a client connection. The catch is that it ignores all interruption requests, so if this method is called in a thread, you cannot stop that thread using interrupts. There are two ways to support non-standard cancellation where interrupts won’t work. Please remember that in both of them, you still have to do something to cancel the method that ignores interrupts. E.g. in the case of ServerSocket, close the underlying socket, which forces the accept() method to throw an exception.

1. Overriding Thread.interrupt():

Provide a custom ThreadFactory to the ExecutorService. Return custom Threads which override the interrupt method. For example:
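The sketch below is one possible shape for this approach, not a definitive implementation. ResourceClosingThread, its register(…) method and InterruptOverrideDemo are hypothetical names: the task hands the thread the resource it is about to block on, and the overridden interrupt() closes it before doing the standard interrupt.

```java
import java.io.Closeable;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical thread whose interrupt() also closes a registered resource. */
class ResourceClosingThread extends Thread {
    private volatile Closeable resource;

    ResourceClosingThread(Runnable target) {
        super(target);
    }

    /** Called by the task to hand over the resource it is about to block on. */
    void register(Closeable resource) {
        this.resource = resource;
    }

    @Override
    public void interrupt() {
        Closeable current = resource;
        try {
            if (current != null) {
                current.close(); // forces a blocked accept() to throw
            }
        } catch (IOException ignored) {
            // nothing sensible to do while cancelling
        } finally {
            super.interrupt();   // keep the standard behaviour as well
        }
    }
}

class InterruptOverrideDemo {
    /** Returns true if cancel(true) unblocked a task stuck in accept(). */
    static boolean cancelBlockedAccept() {
        // The constructor reference serves as the custom ThreadFactory.
        ExecutorService executor = Executors.newSingleThreadExecutor(ResourceClosingThread::new);
        CountDownLatch registered = new CountDownLatch(1);
        CountDownLatch unblocked = new CountDownLatch(1);
        Future<?> future = executor.submit(() -> {
            try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
                ((ResourceClosingThread) Thread.currentThread()).register(server);
                registered.countDown();
                server.accept();       // ignores interrupts
            } catch (IOException closed) {
                unblocked.countDown(); // reached once the socket is closed
            }
        });
        try {
            if (!registered.await(5, TimeUnit.SECONDS)) {
                return false;          // the socket could not be opened
            }
            future.cancel(true);       // ends up calling our interrupt()
            return unblocked.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note how the task has to know about the thread class that runs it. That coupling is one reason this approach is awkward.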

Overriding the interrupt() method is not recommended.

2. Overriding Future.cancel(…):

In your tasks (e.g. Callable) provide a method for non-standard cancellation, such as cancelTask(). Then override Future.cancel(…) to call cancelTask().

But think about it: We do not normally create a Future ourselves and specify what the cancel(…) method does: we get a Future when we submit a task to an ExecutorService via ExecutorService.submit(…).

Luckily, ExecutorService calls on a method called newTaskFor(Callable c) that returns the Future (or rather RunnableFuture) representing the task. Hence we need to override newTaskFor(…) to return a custom Future which overrides the cancel(…) method. This is shown below with an example:

Whereas the IdentifiableCallable is shown below:

Next we need to define our own FutureTaskWrapper so we can override the cancel(…) method:

Now we need to define our task as follows:

That’s all. Now when we call FutureTaskWrapper.cancel(…), it will in turn call cancelTask(), where we can do our non-standard cancellation.
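Putting the pieces together, the whole arrangement might look like the sketch below. IdentifiableCallable, FutureTaskWrapper and cancelTask() are the names used in this post; CancellingExecutor and FlagTask are hypothetical names chosen for this sketch, and the task sets a flag where a real task would close a socket.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RunnableFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** A task that knows how to cancel itself by non-standard means. */
interface IdentifiableCallable<T> extends Callable<T> {
    void cancelTask();
}

/** Future whose cancel(...) first asks the task to cancel itself. */
class FutureTaskWrapper<T> extends FutureTask<T> {
    private final IdentifiableCallable<T> task;

    FutureTaskWrapper(IdentifiableCallable<T> task) {
        super(task);
        this.task = task;
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
        task.cancelTask();                          // non-standard cancellation
        return super.cancel(mayInterruptIfRunning); // standard cancellation
    }
}

/** Executor that hands out a FutureTaskWrapper for IdentifiableCallable tasks. */
class CancellingExecutor extends ThreadPoolExecutor {
    CancellingExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected <T> RunnableFuture<T> newTaskFor(Callable<T> callable) {
        if (callable instanceof IdentifiableCallable) {
            return new FutureTaskWrapper<>((IdentifiableCallable<T>) callable);
        }
        return super.newTaskFor(callable);
    }
}

/** Example task: a real cancelTask() might close a socket; this one sets a flag. */
class FlagTask implements IdentifiableCallable<String> {
    final AtomicBoolean cancelRequested = new AtomicBoolean();

    @Override
    public String call() throws InterruptedException {
        while (!cancelRequested.get()) {
            Thread.sleep(10);
        }
        return "stopped";
    }

    @Override
    public void cancelTask() {
        cancelRequested.set(true);
    }
}
```

With this in place, the Future returned by new CancellingExecutor(1).submit(new FlagTask()) is a FutureTaskWrapper, so calling cancel(…) on it reaches cancelTask().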

The entire code used in this post is available on GitHub.

Command Line Flags for the JVM

This post attempts to demystify command-line options of the Java Virtual Machine (JVM). I’m talking about those strange characters you often have to type when starting Java to run a program. The options are often used to specify environment settings (class path) and to configure performance characteristics of the JVM (garbage collection frequency), amongst many other things.

For example, some valid command-line options are shown below. The options are everything between java and the class name.

java -Xmx6g AClass
java -version
java -Xms4g -Xmx6g SomeClass
java -DSomeVal="foo" MyProgram
java -DSomeVal="foo" -cp MyProgram.jar -Xmx6g -XX:+UseCompressedOops -XX:+UseG1GC com.foo.Bar 

The JVM also supports many command-line options which allow users to specify:

  • Minimum and maximum size of the heap memory
  • Type of garbage collector
  • Type of JIT compiler (Client or Server)
  • Whether to display the JVM version
  • etc.

The JVM documentation breaks the command-line options into three categories based on their maturity level. The most mature options belong to a category called “Standard Options” and are expected to be supported by all JVMs. Less mature options are called “Non-Standard”. These are specific to a JVM (e.g. HotSpot) and are subject to change between releases. There is also a third category of options called “Developer Options”. Developer options are a shade of Non-Standard.

Quick Recap: JVM command-line options are used to specify configuration settings to control execution of the Virtual Machine. These options are broken into three categories as shown below:

  1. Standard Options
  2. Non-Standard Options
  3. Developer Options (Experimental)

Let’s look at the categories in detail.

1. Standard Options

The Standard Options are documented as part of the Java launcher and are expected to be supported by all JVM implementations. Standard Options are stable and rarely change between releases. Standard Options begin with a - followed by the name of the option, e.g. -version.

Some standard options are shown below:

-version:  java -version

-cp:  java -cp <PATH>

-jar:  java -jar MyProgram.jar

-Dproperty=value:  java -DSomeVal="foo"
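A value passed with -D ends up as a system property that the program can read. A minimal sketch, assuming the property name SomeVal from the example above (the class name SystemPropertyDemo and the fallback text are made up):

```java
class SystemPropertyDemo {
    /** Reads the value passed as -DSomeVal=..., or a fallback when the flag is absent. */
    static String someVal() {
        return System.getProperty("SomeVal", "<not set>");
    }

    public static void main(String[] args) {
        // Run as: java -DSomeVal="foo" SystemPropertyDemo
        System.out.println("SomeVal = " + someVal());
    }
}
```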

2. Non-Standard Options

Non-Standard Options always begin with -X. The fact that they are not guaranteed to be supported in all implementations of the JVM, or even between versions, did little to hurt their popularity. Non-Standard Options remain popular and are widely used. These options often specify an integer value with a suffix of k, m, or g to specify kilo, mega or giga. To get a list of all Non-Standard Options supported by your JVM, you can invoke the launcher with -X, e.g. `java -X`. Examples:

-Xms:  java -Xms2g 

-X:  java -X

3. Developer Options

Developer Options always begin with -XX. They are also Non-Standard. Boolean options use the following format: -XX: followed by either + or - to indicate true or false, followed by the name of the option.

-XX:+UseCompressedOops (Indicates that the Option UseCompressedOops must be used)

From the Java documentation:

  • Boolean options are turned on with -XX:+<option> and turned off with -XX:-<option>.

  • Numeric options are set with -XX:<option>=<number>. Numbers can include ‘m’ or ‘M’ for megabytes, ‘k’ or ‘K’ for kilobytes, and ‘g’ or ‘G’ for gigabytes (for example, 32k is the same as 32768).

  • String options are set with -XX:<option>=<string>, are usually used to specify a file, a path, or a list of commands

Examples:

java -XX:+PrintCompilation

java -XX:ParallelGCThreads=10
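To see which options a running JVM was actually started with, you can ask the runtime MXBean. The sketch below also sorts each option into the three categories from this post by its prefix. JvmOptionInspector and categoryOf are hypothetical names; note that the launcher may not report every standard option (such as -cp) in this list.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmOptionInspector {
    /** Classifies an option by prefix, using the three categories from this post. */
    static String categoryOf(String option) {
        if (option.startsWith("-XX:")) {
            return "Developer";
        }
        if (option.startsWith("-X")) {
            return "Non-Standard";
        }
        return "Standard";
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the program's main method).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String arg : jvmArgs) {
            System.out.println(categoryOf(arg) + ": " + arg);
        }
        // -Xmx influences the maximum amount of memory the JVM will attempt to use.
        System.out.println("Max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```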

Click here for a complete list of Options from Oracle.

Object Naming Conventions in JMX

Every MBean must have a name or, more accurately, an ObjectName. Although an MBean could be named anything, e.g. “DogBean” or “SunnyDay”, it is important to choose consistent and well-defined names to avoid inflicting mental torture on the poor soul who is interacting with your application via JMX.

Fortunately, MBean names follow some standard conventions and names can determine how Clients display MBeans. MBean names look like this:

domain:key=property

Remember this convention. Notice that there are two parts with a : separating them. The first part is called the domain and the second part is called the key-property list.

Here’s an example MBean name with both domain and properties: com.somecompany.app:type=ThreadPool

Domain Naming Conventions

The domain could be any arbitrary string, but it cannot contain a : since that is used as the separator. A slash (/) isn’t allowed either. If the domain name is not provided, then the MBean shows up under “DefaultDomain”. As mentioned earlier, domain names should be predictable. According to Oracle Technet:

….if you know that there is going to be an MBean representing a certain object, then you should be able to know what its name will be.

It then adds:

The domain part of an Object Name should start with a Java package name. This prevents collisions between MBeans coming from different subsystems. There might be additional text after the package name. Examples:

com.sun.someapp:type=Whatsit,name=5
com.sun.appserv.Domain1:type=Whatever

Key=Property Naming Conventions

The property list is optional and is in the following format:

name=property

If you wish to specify multiple properties, separate them with commas (,). For example:

com.somecompany.app:type=ThreadPool,poolname=Parser,scope=internal

The key=property pairs should be used to uniquely identify MBean objects: objects in the same domain are told apart by their properties. For example:

com.somecompany.app:type=ThreadPool,name=Parser

com.somecompany.app:type=ThreadPool,name=Generator
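In code, these names are represented by javax.management.ObjectName, which parses the string into its domain and key properties. A small sketch using the names from above (ObjectNameDemo and parse are made-up names for illustration):

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

class ObjectNameDemo {
    /** Parses an MBean name; returns null if the string is not a valid ObjectName. */
    static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ObjectName parser = parse("com.somecompany.app:type=ThreadPool,name=Parser");
        System.out.println(parser.getDomain());            // the part before the colon
        System.out.println(parser.getKeyProperty("type")); // value of the "type" key
        System.out.println(parser.getKeyProperty("name")); // value of the "name" key
    }
}
```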

That’s the MBean naming convention from 1000 feet. If you want to get more information, please read here.

Things every Java developer must know about Exception handling

Exceptions are one of the most misunderstood (and misused) features of the Java programming language. This article describes the absolute minimum every Java developer must know about exceptions. It assumes that the reader is somewhat familiar with Java.

Historical Perspective

Back in the heyday of the “C” programming language, it was customary to return values such as -1 or NULL from functions to indicate errors. You can easily see why this isn’t a great idea – developers had to check and track possible return values and their meanings: a return value of 2 might indicate “host is down” error in library A, whereas in library B, it could mean “illegal filename”.

Attempts were made to standardize error checking by expecting functions to set a global variable with a defined value.

James Gosling and other designers of the language felt that this approach would go against the design goals of Java. They wanted:

  1. a cleaner, robust and portable approach
  2. built in language support for error checking and handling.

Luckily, they didn’t have to look too far. The inspiration for handling errors came from a very fine language of the 60’s: LISP.

Exception Handling

So what is exception handling? It is an unconventional but simple concept: if an error is encountered in a program, halt the normal execution and transfer control to a section specified by the programmer. Let’s look at an example:

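// Requires: import java.io.*;
try {
   BufferedReader in = new BufferedReader(new FileReader("list.txt")); // Throws FileNotFoundException if the file is not found...
   String firstItem = in.readLine();
   in.close();
   FileWriter out = new FileWriter("list.txt", true); // true = append to the file
   out.write("another item for the list");
   out.close();
} catch (FileNotFoundException fnfe) { // ... and control is transferred to this section on error.
   // Do something with the error: notify user or try reading another location, etc
} catch (IOException ioe) {
   // Reading, writing or closing failed for some other reason
}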

Exceptions are exceptional conditions that violate some kind of a “contract” during program execution. They can be thrown by the language itself (e.g. using a null reference where an object is required) or by the developers of a program or API (e.g. passing a date in British format instead of American). Some examples of exceptions are:

  • Accessing index outside the bounds of an array
  • Divide by 0
  • Programmer defined contract: Invalid SQL or JSON format

Exceptions disrupt the normal program flow. Instead of executing the next instruction in the sequence, the control is transferred to the Java Virtual Machine (JVM) which tries to find an appropriate exception handler in the program and transfer control to it (hence disrupting the normal program flow).

Checked and Unchecked Exceptions

Before we look at the exception classes in Java, let’s understand the two categories of exceptions in Java:

Checked exceptions – You must check and handle these in your program. For example, if you are using an API that has a method which declares that it could throw a checked exception, you must either catch the exception or declare it in the throws clause of your own method each time you call that method. If you don’t, the compiler will notice and your program will not compile. The designers of Java wanted to encourage developers to use checked exceptions in situations from which programs may wish to recover: for example, if the host is down, the program may wish to try another address.

Unchecked exceptions on the other hand are not required to be handled or caught in the program. For example, if a method could throw unchecked exceptions, the caller of the method is not required to handle or catch the exceptions.

Remember: Checked exceptions are mild and normally programs wish to recover. They must be caught (or declared in a throws clause) and this rule is enforced by the compiler. The compiler doesn’t care whether you do or do not catch unchecked exceptions.
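The difference is easiest to see side by side. In the sketch below (the class and method names are made up for illustration), the first method will not compile without its catch blocks, while the second compiles even though it may throw ArrayIndexOutOfBoundsException.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

class CheckedVsUnchecked {
    /** Checked: the compiler insists that we catch (or declare) these exceptions. */
    static boolean canOpen(String path) {
        try (FileReader reader = new FileReader(path)) {
            return true;
        } catch (FileNotFoundException e) {
            return false; // recover: e.g. ask the user for another file name
        } catch (IOException e) {
            return false; // closing the reader may also fail
        }
    }

    /** Unchecked: nothing forces the caller to catch ArrayIndexOutOfBoundsException. */
    static int elementAt(int[] values, int index) {
        return values[index];
    }
}
```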

Many people find the dichotomy between checked and unchecked exceptions confusing and counter-intuitive. Discussing the arguments from both sides is beyond the scope of this post.

Parent of all exception classes: Throwable

All exceptions in Java descend (subclass) from Throwable . It has two direct children:

  1. Exception
  2. Error

Error and its sub-classes are used for serious errors from which programs are not expected to recover. They are unchecked.

Exception and its sub-classes are used for mild errors from which programs may wish to recover, i.e. checked exceptions. Right? Well, there is a twist. There is just one sub-class which is different: unlike its parent, the Exception class, it (along with its own sub-classes) is unchecked. It’s called RuntimeException.

Checked exception classes (mostly): Exception

Exception and its sub-classes must be caught or declared, and as such they force the programmer to think about (and hopefully deal with) the situation. A checked exception is a signal that something didn’t go as intended, along with some information about what went wrong, and that “someone” should do something about it (e.g. a car’s dashboard indicating that the battery needs service).

According to official documentation:

These are exceptional conditions that a well-written application should anticipate and recover from. For example, suppose an application prompts a user for an input file name,  [..] But sometimes the user supplies the name of a nonexistent file, and the constructor throws java.io.FileNotFoundException. A well-written program will catch this exception and notify the user of the mistake, possibly prompting for a corrected file name.

Source: The Java Tutorials

RuntimeException

RuntimeExceptions are used to indicate programming errors, most commonly violation of some established contract. They make it impossible to continue further execution.

For example, the contract says that the array index mustn’t go past [array_length – 1]. If you do it, bam, you get a RuntimeException. A real world analogy would be pumping diesel into a gasoline car: the unwritten contract says that you must not do it. There are no signals, just the white smoke before the car comes to a grinding halt after a while. The message: it was your fault and could’ve been prevented by being smarter in the first place.

These are exceptional conditions that are internal to the application, and that the application usually cannot anticipate or recover from. These usually indicate programming bugs, such as logic errors or improper use of an API.

Source: The Java Tutorials

Error

These exceptional circumstances are like “act-of-god” events. Going back to our previous analogy, if a large scale alien invasion were to happen, there is nothing you could do to protect your car, or yourself (unless your last name is Ripley). In the software world, this amounts to the disk dying while you are in the process of reading a file from it. The bottom line is that you should not design your program to handle Errors, since something has gone wrong in the grand scheme of things that is beyond your control.

These are exceptional conditions that are external to the application, and that the application usually cannot anticipate or recover from. For example, suppose that an application successfully opens a file for input, but is unable to read the file because of a hardware or system malfunction.

Source: The Java Tutorials

It’s not so black and white

Checked exceptions are often abused in Java. While Java forces developers to catch checked exceptions, it cannot force them to handle these exceptions. It’s not hard to find statements like this even in well written programs:

try {
   Object obj = ...
   Set<String> set = ...
   // perform set operations
} catch (Exception e) {
   // do nothing
}

Should you ever catch Runtime Exceptions?

What’s the point of catching RuntimeExceptions if the condition is irrecoverable? After all, if you were to catch every possible runtime exception, your program would be cluttered with exception handling code everywhere.

RuntimeExceptions are usually errors that could be prevented by fixing your code in the first place. For example, dividing an integer by 0 will generate a runtime exception, ArithmeticException. But rather than catching the error, you could modify your program to check the arguments of the division function and make sure that the denominator is not 0. If it is, we can halt further execution or even dare to throw an exception of our own: IllegalArgumentException.

In this case, the program got away by verifying the input parameters instead of catching RuntimeExceptions.
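As a rough sketch, such a division function could look like this (SafeDivision and divide are hypothetical names):

```java
class SafeDivision {
    /** Validates the input up front instead of letting ArithmeticException surface. */
    static int divide(int numerator, int denominator) {
        if (denominator == 0) {
            throw new IllegalArgumentException("denominator must not be 0");
        }
        return numerator / denominator;
    }
}
```

The caller still gets an unchecked exception for bad input, but one that names the violated contract instead of a generic arithmetic failure.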

So when is it OK for an application to catch RuntimeExceptions?

A while back, I architected a high-performance traffic director with the goal of operating in the proximity of 10,000 transactions per second (TPS). The project had very high availability criteria and one of the requirements was that it “must-never-exit”.

The director performs a minimum amount of processing on each transaction before passing it further. Transactions came in two flavours, call them A and B. We were only interested in transactions of type A. We had a transaction handler to process type A. Naturally, it “choked” with runtime exceptions when we passed in transactions of type B. The solution? Create a function and pass it every single transaction. If it returned true, we continued to further processing. Otherwise, we simply ignored the transaction, and continued onto the next one.

boolean checkFormat(Transaction t) {
//return true if the t is of type A. false otherwise.
}

This worked well, except…..

… the analysis showed that this function returned false only once a year. The reason: 99.99999999999999% of transactions were of type A. Yet, we were subjecting every single transaction to the check. This does not sound so bad, but due to the nature of the transactions, the only way to differentiate them was by doing expensive String comparisons on various fields.

When this finding was brought to my attention, I immediately had the `checkFormat(…)` function removed and instead let the handler run its course and throw a RuntimeException upon encountering a transaction of type B. When the exception gets thrown once a year, we catch it, log it and move onto the next transaction. The result: improvement in performance, and room to squeeze in additional calculations.
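The resulting loop can be sketched roughly as follows. This is not the original project code: TransactionLoop, processAll and the Consumer-based handler are stand-ins chosen for illustration.

```java
import java.util.List;
import java.util.function.Consumer;

class TransactionLoop {
    /** Passes every transaction to the handler; returns how many were rejected. */
    static <T> int processAll(List<T> transactions, Consumer<T> handler) {
        int rejected = 0;
        for (T transaction : transactions) {
            try {
                handler.accept(transaction);  // no up-front format check
            } catch (RuntimeException e) {
                rejected++;                   // the rare transaction the handler cannot process
                System.err.println("Skipping transaction: " + e);
            }
        }
        return rejected;
    }
}
```

The try block costs next to nothing on the common path, so the price of the rare bad transaction is only paid when it actually shows up.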

Summary

Exceptions in Java are either checked or unchecked. Checked exceptions must be caught or declared in the program, otherwise the compiler will complain. While Java encourages developers to follow certain guidelines when it comes to exception handling, there aren’t any hard and fast rules and the rules are often bent.

Java Multithreading Steeplechase: Executors

Historical Perspective on Tasks & Threads

Tasks are activities that perform some action or do calculations. For example, a task could calculate prime numbers up to some upper limit. Good tasks do not depend on other tasks: they are independent. In this post, when I refer to tasks, I mean tasks that are independent.

Tasks in Java can be represented by a very simple interface called Runnable that has only one method: run(). This single method neither returns a value nor can it throw checked exceptions.

public interface Runnable {
    void run();
}

Many newcomers to Java presume Threads to be the primary abstraction for running tasks. This means that a task can be submitted to a thread which then runs the task. In fact, the Thread class has constructors which take a Runnable for execution:

Thread(Runnable target)
Thread(Runnable target, String name)
Thread(ThreadGroup group, Runnable target)
...

There are obvious benefits in segregating tasks and threads.

A task, defined by implementing Runnable, is submitted to a Thread for execution. The Thread class doesn’t know anything about the task, and the same Thread class can run many different kinds of tasks.

Enter Executor:

Executor was introduced in Java 1.5 as a clean abstraction for executing tasks. Mantle was passed to Executor from Thread. According to the Java API, an Executor:

“… executes submitted Runnable tasks.  This interface provides a way of decoupling task submission from the mechanics of how each task will be run, including details of thread use, scheduling, etc.” In essence, Executor is an interface, whose simplicity rivals that of Runnable:

public interface Executor {
    void execute(Runnable command);
}

The ‘very simple’ Executor interface forms basis for a very powerful asynchronous task execution framework. It is based on a Producer-Consumer pattern: Producers produce tasks and Consumer threads execute these tasks.
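Because the interface says nothing about how a task is run, even trivial implementations are valid Executors. The two sketches below follow the examples given in the Executor javadoc: one runs the task in the caller’s thread, the other starts a new thread per task.

```java
import java.util.concurrent.Executor;

/** Runs each task synchronously in the caller's thread. */
class DirectExecutor implements Executor {
    @Override
    public void execute(Runnable command) {
        command.run();
    }
}

/** Starts a new thread for every task. */
class ThreadPerTaskExecutor implements Executor {
    @Override
    public void execute(Runnable command) {
        new Thread(command).start();
    }
}
```

Code that submits tasks only sees execute(…), so either implementation can be swapped for a thread pool without touching the submitting code.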

ExecutorService

There is little chance you will ever use Executor directly. It is a very powerful, yet feature-starved interface with a lone method for executing tasks. Fortunately for us, it has a famous child called ExecutorService, which provides lifecycle support such as shutdown, task tracking and the ability to retrieve results.

Tracking Task Progress via Future

ExecutorService defines a method called `submit(Runnable task)` which returns a `Future` that can be used to track the task’s progress and cancel it (if desired). Future is an interface. From its javadocs:

“A Future represents the result of an asynchronous computation. Methods are provided to check if the computation is complete, to wait for its completion, and to retrieve the result of the computation. The result can only be retrieved using method get when the computation has completed, blocking if necessary until it is ready. Cancellation is performed by the cancel method. Additional methods are provided to determine if the task completed normally or was cancelled. Once a computation has completed, the computation cannot be cancelled.”
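A short sketch of tracking a Runnable through its Future follows. FutureTrackingDemo and runAndTrack are made-up names; the 5 second timeout is an arbitrary choice so the example cannot hang.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

class FutureTrackingDemo {
    /** Submits a Runnable, waits for it, and checks what the Future reports. */
    static boolean runAndTrack() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger counter = new AtomicInteger();
        try {
            Future<?> future = executor.submit(() -> {
                counter.incrementAndGet();   // the task's only side effect
            });
            // Blocks until the task completes; a Runnable has no result, so get() yields null.
            Object result = future.get(5, TimeUnit.SECONDS);
            return future.isDone() && !future.isCancelled()
                    && result == null && counter.get() == 1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        } finally {
            executor.shutdown();
        }
    }
}
```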

RunnableFuture

Earlier on, I said that the Runnable interface doesn’t return a value. Runnable tasks can indicate completion by modifying a shared data structure. RunnableFuture extends both the Future and Runnable interfaces. It can be passed to any method which expects a Runnable, and its Future side gives access to the result.

So far we have only discussed interfaces (Executor, ExecutorService and Future). Before we look into concrete classes, let us consider one very important concept.

Thread Pool

A design pattern: http://en.wikipedia.org/wiki/Thread_pool_pattern. It has a task queue which holds incoming tasks and a pool of threads which take tasks from the queue and execute them.

[Figure: A sample thread pool (green boxes) with waiting tasks (blue) and completed tasks (yellow)]

Benefits of Thread Pools are thread re-use (creating new threads is a significant CPU overhead) and improved responsiveness (there may already be a waiting thread when a task arrives).

Now let us discuss concrete classes.

AbstractExecutorService

This is a skeletal implementation of ExecutorService, providing default implementations for some of its methods.

public abstract class AbstractExecutorService
implements ExecutorService

ThreadPoolExecutor

This is an ExecutorService that applies the Thread Pool pattern to execute tasks. From its javadocs:

“An ExecutorService that executes each submitted task using one of possibly several pooled threads, normally configured using Executors factory methods.” It provides several methods for setting pool and task queue sizes.

public class ThreadPoolExecutor extends AbstractExecutorService
implements Executor, ExecutorService

FutureTask

Provides an implementation of Future and RunnableFuture. From its javadoc:

“…provides a base implementation of Future, with methods to start and cancel a computation, query to see if the computation is complete, and retrieve the result of the computation.”

Since a FutureTask implements RunnableFuture, you can submit it directly to an ExecutorService.

Callable:

Callables were introduced in Java 5 as the next version of Runnable. Just like Thread passed the mantle to Executor for task execution, Runnable passed the mantle to Callable for representing tasks.

Callable is used to represent tasks. Unlike Runnables, Callables can return a value and even throw checked exceptions. They even support generics.
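Here is a minimal sketch of a Callable that returns a value and may throw a checked exception, using the prime-counting task mentioned earlier as the example. CallableDemo, primeCounter and countPrimes are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallableDemo {
    /** A task that returns a value: the number of primes up to the given limit. */
    static Callable<Integer> primeCounter(int limit) {
        return () -> {
            if (limit < 0) {
                throw new Exception("limit must not be negative"); // checked exception
            }
            int count = 0;
            for (int n = 2; n <= limit; n++) {
                boolean prime = true;
                for (int d = 2; d * d <= n; d++) {
                    if (n % d == 0) {
                        prime = false;
                        break;
                    }
                }
                if (prime) {
                    count++;
                }
            }
            return count;
        };
    }

    /** Runs the task on an executor; returns null if the task threw an exception. */
    static Integer countPrimes(int limit) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = executor.submit(primeCounter(limit));
            return future.get();   // the value returned by call()
        } catch (ExecutionException e) {
            return null;           // e.getCause() is the exception thrown by call()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            executor.shutdown();
        }
    }
}
```

Note that an exception thrown inside call() does not reach the submitting thread directly: it is wrapped in an ExecutionException and delivered when get() is called.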

Summary:

Executor and ExecutorService form a very powerful framework for asynchronous task execution. Future is a wrapper that provides a way to track a task’s progress and could be used to cancel it. Callable represents a task and allows the task to return a value and throw exceptions.

So you might ask why we still have Threads and Runnables if we have better choices available, in the form of Executor and Callable. As far as Callable vs Runnable is concerned, the reason is purely backwards compatibility. Threads are not languishing in Java either. ExecutorService simply provides a cleaner abstraction for executing tasks. It still relies on Threads to execute these tasks.

Java Multithreading Steeplechase: Stopping Threads

Let us cut to the chase: In Java, there is no way to quickly and reliably stop a thread.

Java language designers got drunk once and attempted to support forced thread termination by releasing the following methods: `Thread.stop()`, `Thread.suspend()` and `Thread.resume()`. However, when they became sober, they quickly realized their mistake and deprecated them. Abrupt thread termination is not so straightforward. A running thread, often described as a light-weight process, has its own stack and is the master of its own destiny (well daemons are). It may own files and sockets. It may hold locks. Termination is not always easy: unpredictable consequences may arise if the thread is in the middle of writing to a file and is killed before it can finish writing. Or what about the monitor locks held by the thread when it is shot in the head? For more information on why `Thread.stop()` was deprecated, follow this link: http://docs.oracle.com/javase/6/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html

Anyways, back to the point.

In Java, there is no way to quickly and reliably stop a thread.

To stop threads in Java, we rely on a co-operative mechanism called Interruption. The concept is very simple. To stop a thread, all we can do is deliver it a signal, aka interrupt it, requesting that the thread stop itself at the next available opportunity. That’s all. There is no telling what the receiver thread might do with the signal: it may not even bother to check for it, or even worse, it may notice the signal and ignore it.

Once you start a thread, nothing can (safely) stop it, except the thread itself. At most, the thread could be simply asked – or interrupted – to stop itself.

Hence in Java, stopping threads is a two step procedure:

  • Sending stop signal to thread – aka interrupting it
  • Designing threads to act on interruption

A thread in Java could be interrupted by calling `Thread.interrupt()` method. Threads can check for interruption by calling `Thread.isInterrupted()` method. A good thread must check for interruption at regular intervals. The following code fragment illustrates this:

public static void main(String[] args) throws Exception {

        /**
         * A Thread which is responsive to Interruption.
         */
        class ResponsiveToInterruption extends Thread {
            @Override public void run() {
                while (!Thread.currentThread().isInterrupted()) {
                    System.out.println("[Interruption Responsive Thread] Alive");
                }
                System.out.println("[Interruption Responsive Thread] bye**");

            }
        }

        /**
         * Thread that is oblivious to Interruption. It does not even check it's
         * interruption status and doesn't even know it was interrupted.
         */
        class ObliviousToInterruption extends Thread {
            @Override public void run() {
                while (true) {
                    System.out.println("[Interruption Oblivious Thread] Alive");
                }
                // The statement below will never be reached.
                //System.out.println("[Interruption Oblivious Thread] bye");
            }
        }

        Thread theGood = new ResponsiveToInterruption();
        Thread theUgly = new ObliviousToInterruption();

        theGood.start();
        theUgly.start();

        theGood.interrupt(); // The thread will stop itself
        theUgly.interrupt(); // Will do nothing
}

 

[Interruption Oblivious Thread] Alive
[Interruption Responsive Thread] Alive
[Interruption Responsive Thread] Alive
[Interruption Oblivious Thread] Alive
[Interruption Responsive Thread] bye**
[Interruption Oblivious Thread] Alive
[Interruption Oblivious Thread] Alive
[Interruption Oblivious Thread] Alive
[Interruption Oblivious Thread] Alive
....

A well designed thread checks its interrupt status at regular intervals and takes action when interrupted, usually by cleaning up and stopping itself.

Blocking Methods and Interruption:

A thread can check for interruption at regular intervals – e.g. as a loop condition – and take action when it is interrupted. Life would have been easy if it weren’t for those pesky blocking methods: these methods may “block” and take a long time to return, effectively delaying the calling thread’s ability to check for interruption in a timely manner. Methods like `Thread.sleep()`, `BlockingQueue.put()`, `ServerSocket.accept()` are some examples of blocking methods.

If the code is waiting on a blocked method, it may not check the interrupt status until the blocking method returns.

Blocking methods which support interruption usually throw an Exception when they detect interruption, transferring the control back to the caller. Blocking methods either throw InterruptedException or ClosedByInterruptException to signal interruption to the caller. Let us consider an example. The code below calls `Thread.sleep()`. When it detects interruption, `Thread.sleep()` throws InterruptedException and the caller exits the loop. All blocking methods which throw InterruptedException also clear the interrupted status. You must either act on interruption when you catch this exception or, at the very least, set the interrupted status again to allow the code higher up the stack to act on interruption.

   @Override
   public void run() {
        while(true) {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException exit) {
                break; //Break out of the loop; ending thread
            }
        }
    }

This may sound preposterous, but code that does nothing on InterruptedException is “swallowing” the interruption, denying other code the chance to act on it.
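As a minimal sketch of the alternative to swallowing, the example below restores the interrupted status after catching InterruptedException. The class and method names (`RestoreInterruptDemo`, `takeOrNull`, `flagSurvives`) are made up for illustration and are not from the original code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class RestoreInterruptDemo {

    /**
     * Hypothetical helper: waits for the next item, but cannot propagate
     * InterruptedException because its signature does not declare it.
     */
    static String takeOrNull(BlockingQueue<String> queue) {
        try {
            return queue.take(); // blocks; throws InterruptedException when interrupted
        } catch (InterruptedException e) {
            // take() cleared the interrupted status before throwing.
            // Set it again so code higher up the stack can still see it.
            Thread.currentThread().interrupt();
            return null;
        }
    }

    /** Runs takeOrNull in a worker, interrupts it, and reports whether the flag survived. */
    static boolean flagSurvives() {
        final BlockingQueue<String> empty = new LinkedBlockingQueue<String>();
        final boolean[] seen = new boolean[1];
        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                takeOrNull(empty);
                seen[0] = Thread.currentThread().isInterrupted();
            }
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // prints "interrupted status preserved: true"
        System.out.println("interrupted status preserved: " + flagSurvives());
    }
}
```

Without the call to `Thread.currentThread().interrupt()` in the catch block, the worker would observe `false` and the request to stop would be lost.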

Interruption Oblivious Blocking Methods:

In the first code example in this post, we have two threads, ResponsiveToInterruption and ObliviousToInterruption. The former checked for interruption regularly, as its loop condition, whereas the latter didn’t even bother to check. Blocking methods in the Java library fall into the same two categories. The Good ones throw exceptions when they detect interruption, whereas the Ugly ones don’t do anything. Blocking socket methods in java.net, such as ServerSocket.accept(), don’t respond to interruption. For example, the thread below cannot be stopped by interruption while it is waiting for clients. Only when a client connects does accept() return a Socket, allowing the caller to check for interruption:

        /**
         * Thread that checks for interruption, but calls a blocking method
         * that doesn't detect Interruptions.
         */
        class InterruptibleShesNot extends Thread {

            @Override
            public void run() {
                // Bind the port once; binding inside the loop would fail on
                // the second pass because the port is already in use.
                try (ServerSocket server = new ServerSocket(8080)) {
                    while (!Thread.currentThread().isInterrupted()) {
                        Socket client = server.accept(); // This method will not
                                                         // return or 'unblock'
                                                         // until a client connects
                    }
                } catch (IOException ignored) { }
            }

        }

So how do you deal with blocking methods that do not respond to Interruption? You will have to think outside the box and find ways to cancel the operation by other means. For example, Socket operations throw SocketException when the underlying socket is closed (by `Socket.close()`). The code below takes advantage of this fact and closes the underlying socket, forcing all blocking methods such as ServerSocket.accept() to throw SocketException.

package kitchensink;

import java.net.*;
import java.io.*;

/**
 * Demonstrates non-standard thread cancellation.
 *
 * @author umermansoor
 */
public class SocketCancellation {

    /**
     * ServerSocket.accept() doesn't detect or respond to interruption. The
     * class below overrides the interrupt() method to support non-standard
     * cancellation by closing the underlying ServerSocket, forcing the
     * accept() method to throw an exception, on which we act by breaking the
     * while loop.
     *
     * @author umermansoor
     */
    static class CancelleableSocketThread extends Thread {

        private final ServerSocket server;

        public CancelleableSocketThread(int port) throws IOException {
            server = new ServerSocket(port);
        }

        @Override
        public void interrupt() {
            try {
                server.close();
            } catch (IOException ignored) {
            } finally {
                super.interrupt();
            }
        }

        @Override
        public void run() {
            while (true) {
                try {
                    Socket client = server.accept();
                } catch (Exception se) {
                    break;
                }
            }
        }
    }

    /**
     * Main entry point.
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        CancelleableSocketThread cst = new CancelleableSocketThread(8080);
        cst.start();
        Thread.sleep(3000);
        cst.interrupt();
    }
}

Summary:

  • Threads cannot be stopped externally; they can only be delivered a signal to stop
  • It is up to the Thread to: i) check the interruption flag regularly, and ii) to act upon it
  • Sometimes checking interruption is not possible if the thread is blocked on a blocking method, such as `BlockingQueue.put()`. Luckily, most blocking methods detect interruption and throw InterruptedException or ClosedByInterruptException
  • To support blocking methods that do not act on interruptions, non-standard cancellation mechanisms must be used, as illustrated in the last example

Extra:

The Thread class also has a static method called `interrupted()`. This is what it does: it clears the interrupted status of the current thread and returns its previous value. Use this method only when you know what you are doing or when you want to clear the interrupt status.
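The difference between `isInterrupted()` and `interrupted()` is easy to see in a small experiment. The sketch below (the class name is made up) interrupts the current thread and then reads the flag three times.

```java
class InterruptedVsIsInterrupted {

    /** Returns the three observations made after interrupting the current thread. */
    static boolean[] observe() {
        Thread.currentThread().interrupt();                      // set the flag on ourselves

        boolean first = Thread.currentThread().isInterrupted();  // reads, does not clear
        boolean second = Thread.interrupted();                   // reads AND clears
        boolean third = Thread.interrupted();                    // flag was already cleared

        return new boolean[] { first, second, third };
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "true true false"
    }
}
```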

Java’s Iterator, ListIterator and Iterable Explained

Recall Collection: A container for Objects in Java. Example: ArrayList<E>, Vector<E>, Set<E>, etc.

An iterator is an object that enables a Collection<E> to be traversed. It allows developers to retrieve data elements in a Collection without any knowledge of the underlying data structure, whether it is an ArrayList, LinkedList, Set or some other kind.

The concept behind iterator is very simple: An Iterator is always returned by a Collection and has methods such as next() which returns the next element in the Collection, hasNext() which returns a boolean value indicating if there are more elements in the Collection or not, etc.

Iterators promote “loose coupling” between Collection classes and the classes using these Collections. For example, a class containing some kind of an algorithm (e.g. Search) is only concerned with traversing the list without knowing the exact list structure or any low level details.

Interface Iterator<E>:

All Iterator objects must implement this interface and are bound by its protocols. The interface is very simple and only has three methods:

–       next() : Returns the next element in the Collection. To get all elements in a Collection, call this method repeatedly. When the end is reached and there are no more elements present, this method throws NoSuchElementException.

–       hasNext() : returns true if the Collection has more elements. false otherwise.

–       remove() : removes the last returned element from the Collection.

A java.util.Iterator can only move in one direction: forward. Once the iterator has reached the end of the list, it cannot be reset to the starting position again. In this case, a new Iterator should be obtained.
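To see the three methods working together, here is a short sketch. The class `IteratorBasics` and its helper methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class IteratorBasics {

    /** Returns a copy of the input without the strings shorter than minLength. */
    static List<String> dropShort(List<String> input, int minLength) {
        List<String> copy = new ArrayList<String>(input);
        Iterator<String> it = copy.iterator();
        while (it.hasNext()) {
            String s = it.next();
            if (s.length() < minLength) {
                it.remove(); // removes the element just returned by next()
            }
        }
        return copy;
    }

    /** Returns true if calling next() on an exhausted iterator throws. */
    static boolean nextPastEndThrows(List<String> input) {
        Iterator<String> it = input.iterator();
        while (it.hasNext()) {
            it.next();
        }
        try {
            it.next(); // nothing left to return
            return false;
        } catch (NoSuchElementException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> animals = Arrays.asList("hippo", "ox", "chicken", "duck");
        System.out.println(dropShort(animals, 4));      // prints [hippo, chicken, duck]
        System.out.println(nextPastEndThrows(animals)); // prints true
    }
}
```

Removing through the iterator, rather than through the list itself, is what keeps the traversal valid while elements are being dropped.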

Interface ListIterator<E>:

This is a specialized Iterator for Collections implementing the List<E> interface. In other words, it is an iterator for Lists. It gives several advantages over Iterator namely:

  1. It allows traversing in both directions: forward & backward
  2. It allows for obtaining the Iterator position in the Collection, i.e. its index.
  3. It allows for adding or removing elements of the underlying List. [set(..) works as well].

You must be wondering at this point, why a new kind of Iterator for Lists? Why can’t we use the plain old Iterator? But if you really think about it, you’ll see why: let us say you have two Collections, a Set<E> and a List<E>. You get Iterators from both Collections to traverse them. However, you feel that the Iterator returned by the List can do more: it can return the current index, allow you to add an element to the list, etc. That’s where ListIterators come in. The Iterator returned by a Set<E> doesn’t have to do any of this: an element has no position (index) in a Set<E>.
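The extra abilities of ListIterator can be sketched as follows. The class and method names are invented for this example; the ListIterator calls themselves (`hasPrevious()`, `previous()`, `previousIndex()`, `set()`, `add()`) are part of java.util.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;
import java.util.Locale;

class ListIteratorBasics {

    /** Walks the list backwards, collecting "index:element" pairs. */
    static List<String> backwards(List<String> input) {
        List<String> out = new ArrayList<String>();
        ListIterator<String> it = input.listIterator(input.size()); // start at the end
        while (it.hasPrevious()) {
            int index = it.previousIndex(); // index of the element previous() will return
            out.add(index + ":" + it.previous());
        }
        return out;
    }

    /** Upper-cases each element with set() and inserts a marker after "chicken" with add(). */
    static List<String> edit(List<String> input) {
        List<String> copy = new ArrayList<String>(input);
        ListIterator<String> it = copy.listIterator();
        while (it.hasNext()) {
            String s = it.next();
            it.set(s.toUpperCase(Locale.ROOT)); // replaces the element just returned
            if (s.equals("chicken")) {
                it.add("<after chicken>");      // inserted right after it; next() skips it
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> animals = Arrays.asList("hippo", "chicken", "duck");
        System.out.println(backwards(animals)); // prints [2:duck, 1:chicken, 0:hippo]
        System.out.println(edit(animals));      // prints [HIPPO, CHICKEN, <after chicken>, DUCK]
    }
}
```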

Interface Iterable<E>:

Before I wrap this up, I want to discuss the Iterable<E> interface. It is a very simple interface which defines only one method, iterator(), which returns an Iterator<E>.

The sole purpose of this interface is to allow Objects implementing it to be used in for-each loop. A for-each loop in Java looks like the following:

for (String element : collectionImplementingIterable) {
    // do something with element
}

For example: ArrayList implements Iterable. This allows you to pass an ArrayList object to a for-each loop and traverse through it.

e.g.

ArrayList<String> as = new ArrayList<String>();
as.add("hippo");
as.add("chicken");
as.add("duck");

// traverse the List using for-each
for (String element : as)
	System.out.println(element);
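
Iterable is not limited to the built-in collections. As a sketch, the hypothetical `Range` class below implements Iterable<Integer> itself, which is all it takes to make its objects usable in a for-each loop.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical example class: the integers from start (inclusive) to end (exclusive). */
class Range implements Iterable<Integer> {
    private final int start;
    private final int end;

    Range(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Iterator<Integer> iterator() {
        // Every call hands out a fresh Iterator positioned at the start.
        return new Iterator<Integer>() {
            private int cursor = start;

            @Override
            public boolean hasNext() {
                return cursor < end;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return cursor++;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException("Range is read-only");
            }
        };
    }

    /** Works with anything Iterable, not just Range. */
    static int sum(Iterable<Integer> numbers) {
        int total = 0;
        for (int n : numbers) { // for-each calls iterator(), hasNext() and next() for us
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new Range(1, 5))); // prints 10
    }
}
```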

Summary:

An iterator allows traversing a Collection. An Iterator could be obtained for virtually any kind of Collection, for example:

ArrayList as = …;
HashSet hs = ….;

Now let us get Iterators for the two Collections defined above:

Iterator asIterator = as.iterator();
Iterator hsIterator = hs.iterator();

To iterate over the two collections:

while(asIterator.hasNext()) //Iterate over ArrayList()
{
    System.out.println(asIterator.next());
}

while(hsIterator.hasNext()) //Iterate over HashSet()
{
    System.out.println(hsIterator.next());
}

Notes:

  • Java iterators are very much like Relational Database Cursors.
  • Prior to Iterators, which were introduced in JDK 1.2, programmers used Enumeration to traverse Collections.

Skeletal Implementations in Java Explained

I use interfaces generously in my programs. I prefer them over using abstract classes for several reasons, some of which I will mention below:

  1. Inheritance does not promote good encapsulation. All sub classes depend on the implementation details of the super class. This may result in broken sub classes when the super class is changed. (Imagine testing all sub classes every time you change the super class!)
  2. Unlike inheritance, where a sub class can only extend from one super class, classes are free to implement as many interfaces as they like to support.
  3. It is very easy to support a new interface in an existing class. Suppose you would like several of your classes to support a new type, say, Serializable. You can simply implement the interface in your classes and define the interface methods. For example, any class in Java can implement the Comparable interface and can be applied everywhere a Comparable type is expected.
    Note: This is called defining a mixin[1]. Comparable is a mixin type. These types are intended to be used by other classes to provide additional functionality.

The above three arguments go directly against the philosophy of abstract classes. A sub class can only have one parent class and abstract classes defeat the purpose of mixins (Imagine Comparable being an Abstract class).
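
To make the mixin point concrete, here is a small sketch. `OptOutUser` is a hypothetical class invented for this example; it has its own job and simply mixes in Comparable so that it can be sorted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical class that mixes in Comparable on top of whatever else it does. */
class OptOutUser implements Comparable<OptOutUser> {
    final String userId;

    OptOutUser(String userId) {
        this.userId = userId;
    }

    @Override
    public int compareTo(OptOutUser other) {
        return userId.compareTo(other.userId); // order users by id
    }

    /** Sorts users with Collections.sort, which only requires Comparable elements. */
    static List<String> sortedIds(String... ids) {
        List<OptOutUser> users = new ArrayList<OptOutUser>();
        for (String id : ids) {
            users.add(new OptOutUser(id));
        }
        Collections.sort(users);

        List<String> out = new ArrayList<String>();
        for (OptOutUser u : users) {
            out.add(u.userId);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedIds("u42", "u07", "u19")); // prints [u07, u19, u42]
    }
}
```

Had Comparable been an abstract class, `OptOutUser` would have had to give up its one superclass slot just to become sortable.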

Now that I have tried my best to convince you Inheritance is bad, let me say this:

“Inheritance has its own place in programming. It is helpful in many cases, and decreases programming effort”

This is best explained with an example. Suppose you are writing a program, which uses Redis to represent its data. You would like to create specialized classes that deal with certain types of data. For instance, a class could be created to open a connection to Redis Database #0 to store running counters and perform all related actions. Another class would connect to Redis Database #1 and store all users in a set who have requested to opt out from the service.

Let us define an Interface representing the main Redis Database:

interface RedisConnection {

    int connect();

    boolean isConnected();

    int disconnect();

    int getDatabaseNumber();
}

Let’s write a Counters class which implements this interface:

class RedisCounters implements RedisConnection {

    @Override
    public int connect() {
        //... lots of code to connect to Redis
    }

    @Override
    public boolean isConnected() {
        //... code to check Redis connection
    }

    @Override
    public int disconnect() {
        //... lots of code to disconnect & perform cleanup
    }

    @Override
    public int getDatabaseNumber() {
        return 0; // running counters live in Redis Database #0
    }
}

Finish by writing a class which deals with users who have opted out of the service.

class RedisOptOut implements RedisConnection {

    @Override
    public int connect() {
        //... lots of code to connect to Redis
    }

    @Override
    public boolean isConnected() {
        //... code to check Redis connection
    }

    @Override
    public int disconnect() {
        //... lots of code to disconnect & perform cleanup
    }

    @Override
    public int getDatabaseNumber() {
        return 1; // opted-out users live in Redis Database #1
    }

    /**
     * Other code specific to handling users who have opted out
     */

    // method specific to this class
    public boolean isOptedOut(String userid) {…}
}

If you look closely at the two classes above, you’ll notice something is not right: both classes repeat the connect(), isConnected() and disconnect() functions verbatim. This type of code repetition is not good for several obvious reasons: imagine if you have 10 classes instead of just two, and you would like to change the way connect() function works. You’ll have to make edits in all 10 classes and test them.

Abstract Classes To the Rescue

The program in the last section presents a classic case where abstract classes excel. We can define an abstract super class which implements the common functionality and make its methods final to prevent sub classes from overriding them. You’ll end up with something like the following:

abstract class RedisConnection {
	public final int connect() {
		// ... lots of code to connect to Redis
	}

	public final boolean isConnected() {
		//... code to check Redis connection
	}

	public final int disconnect() {
		// ... lots of code to disconnect from Redis and perform cleanup
	}
}

/**
 *  sub class which extends from RedisConnection
 *
 */
class RedisCounts extends RedisConnection {

	/**
	 * There is no need to define connect(), isConnected() and disconnect() as
	 * these functions are defined by the super class.
	 */

	/**
	 * Other code specific to storing and retrieving counters
	 */
}

/**
 * another sub class extending from RedisConnection
 *
 */
class RedisOptOut extends RedisConnection {
	/**
	 * There is no need to define connect(), isConnected() and disconnect() as
	 * these functions are defined by the super class.
	 */

	/**
	 * Other code specific to handling users who have opted out
	 */
}

No doubt, this is a better solution. But at the beginning of this post, I explained why interfaces are preferred over inheritance. Let us take this one step further and combine interfaces and abstract classes, to maximize the benefits.

Abstract Classes + Interfaces = Abstract Interfaces

We can combine Abstract Classes and Interfaces by providing an abstract class, which defines the basic functionality, with every interface where necessary. The interface defines the type, whereas the abstract class does all the work implementing it.

By convention, these classes are named: AbstractInterface [Interface is the name of the interface the abstract class is implementing]. This convention comes from Java. In the Collections API, the abstract class, which goes with the List interface, is called AbstractList, etc.

The key to these abstract classes, or AbstractInterfaces, is to design them properly and document them well for programmers. For example, the class comment of java.util.AbstractList defines the methods programmers need to override in their implementations:

“To implement an unmodifiable list, the programmer needs only to extend this class and provide implementations for the get(int) and size() methods.
To implement a modifiable list, the programmer must additionally override the set(int, E) method (which otherwise throws an UnsupportedOperationException). If the list is variable-size the programmer must additionally override the add(int, E) and remove(int) methods.”[2]
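
Following that class comment, an unmodifiable list needs nothing more than get(int) and size(). The `Squares` class below is a hypothetical example written for this post; everything else it can do (printing, contains(), indexOf(), iteration) is inherited from AbstractList.

```java
import java.util.AbstractList;
import java.util.List;

/** Hypothetical read-only list of the first n square numbers: 0, 1, 4, 9, ... */
class Squares extends AbstractList<Integer> {
    private final int size;

    Squares(int size) {
        this.size = size;
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return index * index; // computed on demand, nothing is stored
    }

    @Override
    public int size() {
        return size;
    }

    public static void main(String[] args) {
        List<Integer> squares = new Squares(5);
        System.out.println(squares);             // prints [0, 1, 4, 9, 16]
        System.out.println(squares.contains(9)); // prints true
        System.out.println(squares.indexOf(16)); // prints 4
        try {
            squares.add(25); // add() was not overridden, so the list stays unmodifiable
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only");     // prints read-only
        }
    }
}
```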

Abstract Interfaces (Interfaces + Abstract Classes) give programmers the freedom to choose whether they would like to implement the interface directly or extend the abstract class. In our example, we will have:

/**
 * The Interface
 *
 */
interface RedisConnection
{
    int connect();
    boolean isConnected();
    int disconnect();
    int getDatabaseNumber();
}

/**
 * Abstract class which implements the interface.
 * This is called Abstract Interface
 *
 */
abstract class AbstractRedisConnection implements RedisConnection
{
    @Override
    public final int connect()
    {
        //... lots of code to connect to Redis
    }

    @Override
    public final boolean isConnected()
    {
        //... code to check Redis connection
    }

    @Override
    public final int disconnect()
    {
        //... lots of code to disconnect from Redis and perform cleanup
    }
 }

/**
 * A subclass which extends from the Abstract Interface
 *
 */
class RedisOptOut extends AbstractRedisConnection {…}

In cases where a class cannot extend the AbstractInterface directly, it can still implement the Interface, use an inner class which extends the AbstractInterface, and forward all interface method invocations to that inner class. For example:

/**
 * A class showing the forwarding technique. This class implements
 * an interface, but forwards all interface method invocations
 * to an abstract class, the Abstract Interface.
 */
class RedisCounters implements RedisConnection {

	// inner class extending Abstract Interface
	private class RedisConnectionForwarder extends AbstractRedisConnection {
		@Override
		public int getDatabaseNumber() {
			return 0; // running counters live in Redis Database #0
		}
	}
	private final RedisConnectionForwarder r = new RedisConnectionForwarder();

	@Override
	public int connect() {
		// Simply forward the request to the Forwarding class.
		return r.connect();
	}

	@Override
	public boolean isConnected() {
		// Simply forward the request to the Forwarding class.
		return r.isConnected();
	}

	@Override
	public int disconnect() {
		// Simply forward the request to the Forwarding class.
		return r.disconnect();
	}

	@Override
	public int getDatabaseNumber() {
		// Simply forward the request to the Forwarding class.
		return r.getDatabaseNumber();
	}

	/**
	 * Other code specific to storing and retrieving counters
	 */
}

As a final technique, you can also use static factory methods that return concrete instances implemented as anonymous inner classes. For example:

/**
 * A static factory method
 */
public static RedisConnection getRedisCountersImpl(…)
{
	return new AbstractRedisConnection() {
		//...
		/**
		 * Other code specific to storing and retrieving counters
		 */

	};
}

Summary

Using Interfaces as a general contract has many benefits over Inheritance. Inheritance, however, has its own place in programming, and is often a necessary evil. In this post, we explored Abstract Interfaces, which combine the power of Interfaces with Inheritance. Abstract Interface is a term for an Abstract Class which implements the functionality of an Interface. Abstract Interfaces always go with the Interfaces they support.

Abstract Interfaces give programmers the freedom to use either the interface or the abstract class, instead of tying them down to inheritance as plain abstract classes do. We explored two techniques for using abstract classes when extending the Abstract Interface is not possible. The Java API uses Abstract Interfaces generously. The Collections API is filled with these: AbstractList, AbstractSet, AbstractMap, AbstractCollection.

References

[1] http://en.wikipedia.org/wiki/Mixin

[2] http://docs.oracle.com/javase/6/docs/api/java/util/AbstractList.html

Auto generating version & build information in Java

Until recently, I was relying on final Strings and longs to store version information in my programs. However, I soon faced the limitations of this approach, such as forgetting to update version (or revision) information and conflicts between maven and internal version information. So I switched to using java properties for handling version information and updating the properties at compile time using maven’s antrun plugin. This had its own shortcomings and resulted in complex pom.xml files.

I have to admit, I’m not a big fan of maven and its XML based structure: I don’t like it because it’s a gigantic beast. Every time I have to do something in maven, I find that I’m spending time researching online for the right plugin and looking up the documentation for the plugin of interest. As a developer, build management using maven should be the least of my concerns {Not having to remember which maven plugin does what}. On the other hand, to be fair to maven, it has some cool features like dependency management, life cycle and a convention based directory structure, to name a few. But maven tries to do a lot of things, resulting in a complex product.

{At this point, I’m considering switching to Gradle. As a developer, I want to be spending time solving problems in the problem domain, not trying to tame my build management system. But using Gradle requires a grasp of Groovy (Yet Another Scripting Language – YASL). If only there was a build management system written in Python!!!!!! }

The solution which I’m going to discuss in this post uses Java annotations whose versioning information is generated at build time by a python script and read by the program at run-time. Maven is used in a very limited way with this approach.

Steps

  1. Create Version annotation and a class which reads these annotations in your Java program
  2. Write a python script to write the Java annotation at build time
  3. Structure your pom.xml file to include generated-sources folder and running our python script

1. Create Version annotation and a class which reads these annotations in your Java program

The first step is to define the annotation in your program; its values are filled in at build time. Example here.

Then create a class which is going to read the annotation information. Example here.
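
As a rough idea of what these two pieces can look like, here is a minimal sketch. It is not the code from the linked examples: the names `Version`, `VersionHolder` and `VersionReader` and the annotation elements are assumptions, and to keep the example in a single file the annotation sits on a class instead of in package-info.java.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical reader: looks up the annotation via reflection at run-time. */
class VersionReader {

    static String describe(Class<?> holder) {
        Version v = holder.getAnnotation(Version.class);
        if (v == null) {
            return "version information not available";
        }
        return v.version() + " (built " + v.buildDate() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(VersionHolder.class)); // prints 1.0-SNAPSHOT (built unknown)
    }
}

/** Hypothetical annotation; the element names are assumptions made for this sketch. */
@Retention(RetentionPolicy.RUNTIME) // must be retained, or reflection will not see it
@Target({ElementType.PACKAGE, ElementType.TYPE})
@interface Version {
    String version();
    String buildDate();
}

/** Stand-in for the generated file: in the real setup these values are written by the script. */
@Version(version = "1.0-SNAPSHOT", buildDate = "unknown")
class VersionHolder { }
```

When the annotation lives in package-info.java instead, the lookup goes through the package, along the lines of `SomeClass.class.getPackage().getAnnotation(Version.class)`.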

2. Write a python script to write the Java annotation at build time

The next step is to create a python script that generates a package-info.java containing the build time & date, version string, hostname, etc. This works because package-info.java is the file used to provide overall information about a package’s contents, and it can carry package-level annotations. We will fill in our annotation there. An example python script is here. Feel free to use it in your projects.

3. Structure your pom.xml file to include generated-sources folder and running our python script

You then need to tell your maven file to pick up the package-info.java which is auto-generated by the python script in the last step. The python script places ‘package-info.java’ in the “target/generated-sources/java” folder. I used the build-helper-maven-plugin to add it as a source folder. Example here.

The last step is to tell maven to run the python script in the generate-sources phase. I used the exec-maven-plugin for this. Example here.

Check out the complete project

I have uploaded a complete project on Github: https://github.com/umermansoor/Versionaire

To use the project, do the following:

$ git clone git@github.com:umermansoor/Versionaire.git

$ cd Versionaire

$ mvn package

$ java -cp ./target/versonaire-1.0-SNAPSHOT.jar com._10kloc.versionaire.App