Concurrency II

Concurrency aspects of Java

Ways to create (“spawn” or “fork”) threads
locks, global and per-object, relationship to atomicity

Ways to create ("spawn" or "fork") threads

There are two ways to create threads in Java: subclassing the Thread class or using the Runnable interface:

The Thread Class

The Thread is an encapsulation of the thread construction and gives a way of communicating with the thread as an entity. The Thread class is used to spawn the thread and can be used to start or pause the thread.

The Thread class has three main methods:

 1. start() : This method is called by the user of this class to spawn the new thread. 
    start() executes the run() method as the new thread.
 2. run() : This is the code that is to be run in the separate thread. This method can
    do whatever any normal method can do, such as call methods of other objects, 
    instantiate objects, etc.
 3. sleep([milliseconds]) : Causes the thread to pause for the specified number of 
    milliseconds.

To spawn a new thread using the Thread class:

 1. Subclass the Thread class.
 2. Override the run() method.
 3. Instantiate the Thread object.
 4. Call its start() method at the point where you want to spawn the new thread.

The Runnable Interface

The Runnable interface specifies only one method, which takes no parameters:

  public void run();

This method effectively replaces the run() method of the Thread class and frees the programmer from the constraints of having to subclass Thread.

A Thread object must still be made, as its start() method is still needed. To spawn a thread using the Runnable interface:

 1. Have the desired class implement the Runnable interface.
 2. Implement the run() method.
 3. Where needed, instantiate a Thread object using the Thread(Runnable target) or 
    one of the other constructors that takes a Runnable target.
 4. Call the Thread object's start() method at the point where you want to spawn 
    the new thread.

Synchronizing threads

Synchronized statement: When this statement is used in front of a method declaration, it prevents multiple threads from simultaneously executing the method. If one thread is executing the method, then all other threads are “locked” out and wait for the first thread to finish before they get their turn.

A thread may read the value of a variable or change its value. If two or more concurrent threads act on a shared variable, there is a possibility that the actions on the variable will produce timing-dependent results. This dependence on timing is inherent in concurrent programming, producing one of the few places in the language where the result of executing a program is not determined solely by this specification.

Each thread has a working memory, in which it may keep copies of the values of variables from the main memory that is shared between all threads. To access a shared variable, a thread usually first obtains a lock and flushes its working memory. This guarantees that shared values will thereafter be loaded from the shared main memory to the threads working memory. When a thread unlocks a lock it guarantees the values it holds in its working memory will be written back to the main memory.

Reference

Java thread class documentation

Examples

Series of examples with concurrency

Chord

Chord is a static race detection tool for Java to find serious and previously unknown concurrency bugs in programs.

Before: Increasing clock speeds ⇒ even sequential programs ran faster

Now: Increasing #CPU cores ⇒ only concurrent programs will run faster

Race Conditions

Data race occurs when same location is accessed simultaneously without common lock held by more than one thread where at least one thread is writing.

Triggered non-deterministically
No fail-stop behavior even in safe languages like Java
Lies at heart of many concurrency problems: atomicity checking, deadlock detection, …

Race Detection Approach in Chord

aliasing pairs → same location accessed
shared pairs → same location accessed by different threads
parallel pairs → same location accessed by different threads simultaneously
unlocked pairs → same location accessed by different threads simultaneously without common lock held

Example

Field f is race-free if:

e1 and e2 never refer to the same value → May Alias Analysis
l1 and l2 always refer to the same value → Must Alias Analysis
Whenever l1 and l2 refer to different values, e1 and e2 also refer to different values → Conditional Must Not Alias Analysis

Architecture of Chord

Analysis Steps

Original pairs: Access pairs with at least one write to same field
Reachable pairs: Both are reachable from some thread
Aliasing pairs: Both can access the same memory location (k-object sensitive alias analysis + bddbddb)
Escaping Pairs: The memory location is thread-shared (thread escape analysis)
Unlocked Pairs: Pairs where the memory location is not guarded by a common lock in both accesses (approximation of may-alias analysis, which is unsound)

Annotations: Permit to rule out false alarms

Unsoundness Summary

Lacking must-alias analysis, the last stage of the algorithm uses may-alias analysis to check whether a pair of accesses is guarded by a common lock, which is unsound.
Open programs are not analyzed soundly: missing callee methods are treated as no-ops while missing caller methods are modeled by an automatically synthesized harness that simulates many but not all usage scenarios.
Not checking races on accesses in class initializers, constructors, and finalizers which typically lack synchronization and seldom contain races but cause many false alarms without a method-escape analysis.
Ignoring the effects of reflection and dynamic class loading.

Extentions presented in popl07 paper

Conditional must not aliasing, a new aliasing property and analysis. → Distinct blocks should be protected by distinct locks.
Disjoint reachability, a new lightweight shape property and analysis that can be used to compute conditional must not aliasing facts.
A new race detection algorithm based on conditional must not aliasing that is sound(no annotations) and effective in practice.

Chord notes

Tools used:

Goldilocks

Introduction

Goldilocks is a lockset-based algorithm for precisely computing the happens-before relation and detecting data-races at runtime(dynamic race detection).

Dynamic race detection

  1. Vector clocks
     * precisely compute the happens-before relation
     * have significantly more overhead
  2. Locksets
     * imprecise -> may generate false race warnings
     * efficient

Goldilocks:

precise
efficient (purely lockset based)

Happens-before relation: Data race occurs between two accesses to a shared variable if they are not ordered by the happens-before relation.

A concurrent execution δ is represented by a finite sequence

t_i is a thread
s_i is a program state
α_i is one of the following actions: acq(), rel (), read(), write(), fork (), join(), and alloc()

The happens-before relation hb for δ is the smallest transitively-closed relation on the set {1,2,…,n} such that for any k and l, we have k hb l if 1 < = k < = l < = n and one of the following holds:

t_k = t_l
α_k = rel(o) and α_l = acq(o)
α_k = write(o) and α_l = read(o)
α_k = fork(t_l)
α_l = join(t_k)

Goldilocks uses the happens-before relation to define data-race free executions.
The execution δ is race-free on data variable (o; d) if for all k; l such that α_k, α_l are elements of {read(o; d); write(o; d)}, we have k hb l or l hb k.

LARA