Beneficial CountDownLatch and tricky java deadlock

Have you ever used java.util.concurrent.CountDownLatch ??

It’s a very convenience class to achieve synchronization between two or more threads, where allows one or more threads to wait until a set of operations being performed in other threads completes (check javadoc and this post). CountDownLatch can save your time in suitable cases and you have to be aware of this class.

As always synchronization of threads can raise deadlocks if code is not good. And as concurrency use cases can be very complex, developers must be very careful. I will not describe here a complex concurrency problem, but a simple problem that you may face it if you use CountDownLatch careless.

Assume you have 2 threads (Thread-1 and Thread-2) that share a single java.util.concurrent.ArrayBlockingQueue and you want to synchronize them using a CountDownLatch. Check this simple example:

package gr.qiozas.simple.threads.countdownlatch;

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;

public class DeadlockCaseCDL {

    public static void main(String[] args) throws InterruptedException {
		CountDownLatch c = new CountDownLatch(1);
		ArrayBlockingQueue b = new ArrayBlockingQueue(1);

		new Thread(new T1(c, b)).start();
		new Thread(new T2(c, b)).start();
    }

    private static class T1 implements Runnable {
        private CountDownLatch c;
        private ArrayBlockingQueue b;
        private T1(CountDownLatch c, ArrayBlockingQueue b) {
            this.c = c; this.b = b;
        }
        public void run() {
		  try {
		    b.put(234);
		    b.put(654);
		    doWork(T1.class);
		    c.countDown();
		    doWork(T1.class);
		  } catch (InterruptedException ex) {}
	   }
    }

    private static class T2 implements Runnable {
        private CountDownLatch c;
        private ArrayBlockingQueue b;
        private T2(CountDownLatch c, ArrayBlockingQueue b) {
            this.c = c; this.b = b;
        }
        public void run() {
		  try {
		    doWork(T2.class);
		    c.await();
		    doWork(T2.class);
		    System.out.println("I just dequeue "+b.take());
		    System.out.println("I just dequeue "+b.take());
		  } catch (InterruptedException ex) {}
	   }
    }

    private static void doWork(Class classz) {
        System.out.println(classz.getName()+" do the work");
    }
}

What you see above is that main thread creates a CountDownLatch with count “1” and an ArrayBlockingQueue with capacity “1” and afterwards spawns the “2 threads”. The ArrayBlockingQueue will be used for the real “work” (enqueue and dequeue) and the CountDownLatch will be used to synchronize the threads (enqueue must be done before dequeue).

Thread-1 enqueues 2 messages and Thread-2 wants to dequeue them, but only after Thread-1 has enqueued both messages. This synchronization is guaranteed by CountDownLatch. Do you believe that this code is OK??
No, it is not as causes a deadlock!!!

If you see carefully line 10, you will see that I initialize ArrayBlockingQueue with capacity equal to “1”. But Thread-1 wants to enqueue 2 messages and then release the lock (of CountDownLatch), in order to be consumed afterwards by Thread-2. The capacity “1” of queue blocks Thread-1 until another thread dequeue one message from queue, and then tries again to enqueue the 2nd message. Unfortunately, only Thread-2 dequeues messages from queue, but because Thread-1 hold the lock of CountDownLatch, the Thread-2 cannot dequeue any message and so it blocks. So, we really have a deadlock as both threads are blocked (waiting to acquire different locks). Thread-1 waits for ArrayBlockingQueue lock and Thread-2 for CountDownLatch lock (you can see it also in the related Thread Dump below).

If we increase the capacity of the queue then this code will run without problems, but this is not the point of this article. What you have to understand is that CountDownLatch must be used with care, in order to avoid such dangerous cases. You have to know the functional cases of your class, elaborate to other developers of team for this functionality, write useful javadoc and always write code that is robust in extreme cases, not only for happy paths.

Another point that you may help you is that this deadlock is not detected by modern JVMs. Try it.

As you may know, modern JVMs (both Hotspot and JRockit) are able to detect simple deadlocks and report them on Thread Dump. See a simple deadlock example that detected from Hotspot JVM:

Found one Java-level deadlock:
=============================
"Thread-6":
waiting to lock monitor 0x00a891ec (object 0x06c616e0, a java.lang.String),
which is held by "Thread-9"
"Thread-9":
waiting to lock monitor 0x00a8950c (object 0x06c61708, a java.lang.String),
which is held by "Thread-6"

You can try DeadlockCaseCDL and get a Thread Dump (on GNU/Linux run just “kill -3jvm_pid›”). You will see that thread dump looks normal and no deadlock is detected by JVM, but you are on a deadlock!!! So, be aware that this kind of deadlock is not detected by JVM.

Check this Thread Dump example from my local execution:

Full thread dump Java HotSpot(TM) Server VM (17.1-b03 mixed mode):

"DestroyJavaVM" prio=10 tid=0x0946e800 nid=0x5382 waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Thread-1" prio=10 tid=0x094b1400 nid=0x5393 waiting on condition [0x7c79a000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for   (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:207)
	at gr.qiozas.simple.threads.countdownlatch.DeadlockCaseCDL$T2.run(DeadlockCaseCDL.java:50)
	at java.lang.Thread.run(Thread.java:662)

"Thread-0" prio=10 tid=0x094afc00 nid=0x5392 waiting on condition [0x7c7eb000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for   (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
	at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:252)
	at gr.qiozas.simple.threads.countdownlatch.DeadlockCaseCDL$T1.run(DeadlockCaseCDL.java:29)
	at java.lang.Thread.run(Thread.java:662)

"Low Memory Detector" daemon prio=10 tid=0x0947f800 nid=0x5390 runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x7cfa9000 nid=0x538f waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x7cfa7000 nid=0x538e waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x7cfa5800 nid=0x538d waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x7cf96000 nid=0x538c in Object.wait() [0x7cd15000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on  (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
	- locked  (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x7cf94800 nid=0x538b in Object.wait() [0x7cd66000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on  (a java.lang.ref.Reference$Lock)
	at java.lang.Object.wait(Object.java:485)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
	- locked  (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x7cf92000 nid=0x538a runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x09475c00 nid=0x5383 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x09477000 nid=0x5384 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x09478800 nid=0x5385 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0947a000 nid=0x5387 runnable

"VM Periodic Task Thread" prio=10 tid=0x09489800 nid=0x5391 waiting on condition

JNI global references: 854

Heap
 PSYoungGen      total 14976K, used 1029K [0xa2dd0000, 0xa3e80000, 0xb39d0000)
  eden space 12864K, 8% used [0xa2dd0000,0xa2ed1530,0xa3a60000)
  from space 2112K, 0% used [0xa3c70000,0xa3c70000,0xa3e80000)
  to   space 2112K, 0% used [0xa3a60000,0xa3a60000,0xa3c70000)
 PSOldGen        total 34304K, used 0K [0x815d0000, 0x83750000, 0xa2dd0000)
  object space 34304K, 0% used [0x815d0000,0x815d0000,0x83750000)
 PSPermGen       total 16384K, used 1739K [0x7d5d0000, 0x7e5d0000, 0x815d0000)
  object space 16384K, 10% used [0x7d5d0000,0x7d782e90,0x7e5d0000)

Regards,
Adrianos Dadis.

Democracy requires Free Software

About these ads

About Adrianos Dadis

Adrianos is working as senior software engineer in telcos business domain. Particularly interested in distributed systems, enterprise integration and middleware services. He mainly works with JBoss, Weblogic, Apache Storm, Apache HBase, Tomcat, Java EE, Spring, Drools, Oracle SOA Suite and various ESBs.
This entry was posted in Java, Software Development and tagged , , , . Bookmark the permalink.

9 Responses to Beneficial CountDownLatch and tricky java deadlock

  1. deridex says:

    I think, perhaps, that you are confused about CDL and locking. Please feel free to correct me if I have misunderstood your writing. What your example describes is a logic bug rather than a concurrency bug.

    First, it’s useful to define terms so that everyone approaches the question from the same perspective:

    http://en.wikipedia.org/wiki/Deadlock#Necessary_conditions

    “[O]nly Thread-2 dequeues messages from queue, but because Thread-1 hold the lock of CountDownLatch, the Thread-2 cannot dequeue any message and so it blocks. So, we really have a deadlock as both threads are blocked (waiting to acquire different locks).”

    There are two common resources between T1 and T2: 1) ArrayBlockingQueue and 2) CountDownLatch. But T1 and T2 never acquire a lock on either of these. So, there cannot be a deadlock. The JVM does not detect a deadlock because none exists.

    DeadlockCaseCDL stalls because of a fundamental flaw. The logic of the program is:

    * Set max size of queue = 1.
    * Begin draining queue when size = 2.

    There is no way this program can ever complete.

    PS

    The line numbers in the thread dump do not match the line numbers in the DeadlockCaseCDL.java.

    • Adrianos Dadis says:

      Hi deridex,

      thanks for your comment and your useful approach, as you make it more clear.

      We all know what CountDownLatch is and why you may want to use it. So, the main reason which drove me to write this article is to warn people, by providing an example, about a potential deadlock that may do not have in mind. Even when start writing a program using CountDownLatch or even (more dangerous) when change their code to use CountDownLatch. I hope this article will help someone in planet, but maybe not, that’s life :)

      I know that this program suffers from a fundamental flow and that’s why I have also mentioned the solution and the purpose of the article.
      Please check: “If we increase the capacity of the queue then this code will run without problems, but this is not the point of this article.”

      > First, it’s useful to define terms so that everyone approaches the question from the same perspective.
      I definitely agree on your statement.

      The deadlock case of this article has exactly the same “issues” as the essential “dining philosophers problem” (http://en.wikipedia.org/wiki/Dining_philosophers_problem), which is mainly used to illustrate synchronization issues and also as an example of a deadlock in many computer science books and articles. So, I believe “my” deadlock case agrees with the common definition/term of a deadlock. If you do not agree please elaborate.

      > The JVM does not detect a deadlock because none exists.
      Deadlock exists and JVM does not detect it, as JVM engineers have decided to not detect such deadlock cases. I mention that JVM does not detect this kind of deadlock just as a hint and not as JVM problem. This statement is just extra information, nothing else.

      I will check the mismatch in lines of thread dump, thanks again.

      Regards,
      Adrianos Dadis.

      • deridex says:

        “The deadlock case of this article has exactly the same ‘issues’ as the essential ‘dining philosophers problem'”

        In the dining philosophers problem, each philosopher is attempting to acquire exclusive locks on two forks. Exclusive locks on shared resources are the defining characteristic of a deadlock, but in the article, neither T1 nor T2 acquire locks on anything. T1 and T2, by definition, cannot deadlock.

        The crux of the article boils down to this statement:

        “because Thread-1 hold the lock of CountDownLatch, the Thread-2 cannot dequeue any message and so it blocks”

        This is categorically false because:

        * T1 never aquires a lock on CDL. T1 never holds a lock on anything, actually.
        * CDL does not require locking to work correctly. It is designed to be shared across threads _without_ locking. The api designers knew that writing correct locking algorithms is hard, so they gave us tools like CDL to write lock-free algorithms. Acquiring a lock on CDL is actually wrong because doing so could potentially cause a deadlock between the decrementing thread and the awaiting thread.

        T2 could be doing anything (waiting on a semaphore or even polling a volatile boolean flag) and the program would still fail because the underlying problem has _nothing_ to do with CountDownLatch, which you state yourself:

        “If we increase the capacity of the queue then this code will run without problems”

        Rather than a demonstration of the risks of CDL, this article seems to be a cautionary tale about sizing buffers correctly. If you require N units of something to do useful work, then make sure your buffer is at least N units big. This story is very different from the one suggested by the article title.

        So this:

        “because Thread-1 hold the lock of CountDownLatch, the Thread-2 cannot dequeue any message and so it blocks”

        Should be changed to this:

        “because Thread-1 is filling an undersized buffer, it never counts down, so Thread-2 cannot dequeue any message and blocks”

      • Adrianos Dadis says:

        Hi deridex,

        first of all, I admire your passion for correct definitions and detailed explanations :)
        This is the correct way to make the difference.

        Lets dive in our subject. Is this a deadlock case or something else??

        Definitions:
        1) I will start with the first link you pointed me: http://en.wikipedia.org/wiki/Deadlock#Necessary_conditions
        1.a) The first definition of a deadlock is this: “A deadlock is a situation where in two or more competing actions are each waiting for the other to finish, and thus neither ever does.”. Which in my opinion is correct.
        1.b) Please focus on necessary conditions 2 and 4:
        – “Hold and Wait: processes already holding resources may request new resources held by other processes.”
        – “Circular Wait: two or more processes form a circular chain where each process waits for a resource that the next process in the chain holds.”

        2) Resource starvation case: http://en.wikipedia.org/wiki/Resource_starvation
        You can see that states: “Two or more processes become deadlocked when each of them is doing nothing while waiting for a resource occupied by another program in the same set. On the other hand, a process is in starvation when it is waiting for a resource that simply keeps getting given to other processes.”

        Primitive analysis:
        A) The ArrayBlockingQueue is implemented using a ReentrantLock (‘lock’) and two Condition (‘notEmpty’ & ‘notFull’) objects. Inside T1 class, the second call of ‘put’ (b.put(654);) does two things. First, acquire the lock of ‘lock’ object and afterwards checks if the queue is full and if yes (which is our case), then blocks using the ‘notFull’ Condition, until another thread signal this Condition. So, now T1-Thread waits for a resource to be available!!! The resource/condition now is an empty slot in the queue. T1-Thread cannot insert its object into the queue until the resource/ocndition is available/true, which means queue has at least an empty slot.

        B) The CountDownLatch is implemented using a AbstractQueuedSynchronizer, which consist the lock mechanism of CountDownLatch. Inside T2 class, the first crucial action is ‘c.await();’, which causes T2-Thread to wait (using “LockSupport.park(Object blocker)” method) until the latch has counted down to zero. So, T2-Thread waits for a resource/condition to be available/true!!! The resource now is within the AbstractQueuedSynchronizer that is parked and waits for a “signal” in order to unblock and continue its execution.

        Conclusion:
        In (A) we need mutual exclusion on an empty queue slot, in order to write. Only one can insert an object on this empty slot, and all the other tries (by the same or different threads) have to wait, until the same or another slot is empty. Do not forget that our queue has only one slot. Given (A) and (B) I can say that this case is a resource starvation case, but a deadlock case too. It is a deadlock, because of deadlock definition (1.a) and (2), which is commonly used in most areas. It is also a resource starvation case, as T1-Thread cannot continue (enqueue one more message) until another thread dequeue a message from the queue.

        In the dining philosophers problem we have deadlock AND resource starvation, but you are right that is not “exactly the same” problem as I stated.

        Best regards,
        Adrianos Dadis.

  2. deridex says:

    “The first definition of a deadlock is this: ‘A deadlock is a situation where in two or more competing actions are each waiting for the other to finish, and thus neither ever does.’. Which in my opinion is correct.”

    That part of the definition does, indeed, apply to the article’s example, but that part also applies to general resource starvation. As this discussion shows, deadlocks are too complicated to be explained in a single line. Two or three sentences later, the wiki elaborates by introducing resource locking. Resource locking is the “lock” part of the word “deadlock”. Of the four items in the “Necessary Conditions” section, you pointed out #2 and #4, so let’s talk about those:

    “2. Hold and Wait: processes already holding resources may request new resources held by other processes.”

    When the application stalls, who is holding an exclusive lock (item #1) on which resource? Thread-1 once held a lock via ArrayBlockingQueue but it no longer does when it stalls. See http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html#await%28%29 . For the impatient, here is the relevant excerpt (though you really should read the full javadoc):

    “The lock associated with this Condition is atomically released”

    So, thread-1 is not exclusively holding any resources when the application freezes. Thread-2 is also empty handed. Thread-1 and Thread-2 have references to shared resources, but neither is holding any exclusive locks anywhere in the call stack when the application stalls. Something else (not locking) is causing the application to stall.

    “4. Circular Wait: two or more processes form a circular chain where each process waits for a resource that the next process in the chain holds.”

    Given that no one is exclusively holding anything, item #4 does not apply.

    We definitely agree the example deliberately induces a form of resource starvation, but it can’t be a deadlock if no one is holding any locks when the application freezes. I guess we have to agree to disagree on this point.

    I’ll end with this thought. I think it’s telling that you can fix the application _without_ changing the lock algorithm. To fix a deadlock, you usually have to fix the bug in your locking algo (e.g. lock things in a specific order rather than random order), but we didn’t have to do that to fix this application.

    Actually one more thought. There has been a lot of discussion around “Thread-1″ vs “Thread-2″ and “instance T1″ vs “instance T2″. Whenever I saw “Thread-1″, my mind thought “instance T1″. Instance T1 is running in Thread-1, and T1 calls many methods on many other classes. Likewise for T2. I apologize to anyone that was confused by my previous comments. I tried to be clearer in this one.

  3. James says:

    As Deridex says this has nothing to do with CountDownLatch or deadlocking.

    Thread 1 (T1) gets stuck at line “b.put(654);” making the rest of the code in the example superfluous.

    The example simply demonstrates how an ArrayBlockingQueue blocks when it is full. That’s it.

  4. Adrianos Dadis says:

    @Deridex:
    If we want to give a strict definition of deadlock, then my title about deadlock is a bit confusing, but this was not my point. What I mean as deadlock, is a condition where a process (or thread) blocks another process to run. This is a more general definition about deadlock and I agree that if we want to be accurate then this situation is better defined as resource starvation.

    @James:
    First of all thanks that you remind me to clear my position about “resource starvation” and “deadlock” with Deridex.
    In my example I use ArrayBlockingQueue just as a helper object to demonstrate how a CountDownLatch can bring your code in a problematic situation. Please think that if we clear code from CountDownLatch, then the program will finish without any problem. The problem arises when we misuse CountDownLatch along with another “resource” (in my example the ArrayBlockingQueue), where if code is not well written then you may have a problem (as in my example), which in this case it can lead to a resource starvation state.

  5. I enjoyed reading both the article and the comments. Here is my take on this:

    The demo program guarantees a deadlock. It does not necessarily demo the risk of using CDLs. So i could in theory swap the CDL with a bunch of other concurrency objects and achieve the same guaranteed deadlock.

    What would be really interesting to demo is a case where CDL causes deadlocks that are specific to CDL’s case and also where it is not a guaranteed deadlock, but one that is at risk of happening under certain circumstances. Why I say that is because its very easy to simulate a guaranteed deadlock and it is very easy to avoid the guaranteed deadlocks. It becomes much harder to avoid race condition deadlocks.

    • Adrianos Dadis says:

      Hi Karim, thanks for you clear comment.
      I fully agree that demo does not really exposure a problem of CDL, but a trivial problem that could also happen using other concurrency objects.
      What triggered me to wrote this article were the features of CDL (that did not exist in older JDK versions). Someone could be suprised with these concurrency feautures of CDL and using it everywhere. I was just to ring the bells that there are many cases that even using CDL your code could produce a deadlock.

      Regards,
      Adrianos

Post your thought

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s