Concurrency II
(See also SAV07 Lecture 18.)
Concurrency aspects of Java
- Ways to create (“spawn” or “fork”) threads
- locks, global and per-object, relationship to atomicity
Ways to create ("spawn" or "fork") threads
There are two ways to create threads in Java: subclassing the Thread class or using the Runnable interface:
- The Thread Class
The Thread is an encapsulation of the thread construction and gives a way of communicating with the thread as an entity. The Thread class is used to spawn the thread and can be used to start or pause the thread.
The Thread class has three main methods:
1. start() : This method is called by the user of this class to spawn the new thread. start() executes the run() method as the new thread. 2. run() : This is the code that is to be run in the separate thread. This method can do whatever any normal method can do, such as call methods of other objects, instantiate objects, etc. 3. sleep([milliseconds]) : Causes the thread to pause for the specified number of milliseconds.
To spawn a new thread using the Thread class:
1. Subclass the Thread class. 2. Override the run() method. 3. Instantiate the Thread object. 4. Call its start() method at the point where you want to spawn the new thread.
- The Runnable Interface
The Runnable interface specifies only one method, which takes no parameters:
public void run();
This method effectively replaces the run() method of the Thread class and frees the programmer from the constraints of having to subclass Thread.
A Thread object must still be made, as its start() method is still needed. To spawn a thread using the Runnable interface:
1. Have the desired class implement the Runnable interface. 2. Implement the run() method. 3. Where needed, instantiate a Thread object using the Thread(Runnable target) or one of the other constructors that takes a Runnable target. 4. Call the Thread object's start() method at the point where you want to spawn the new thread.
Synchronizing threads
Synchronized statement: When this statement is used in front of a method declaration, it prevents multiple threads from simultaneously executing the method. If one thread is executing the method, then all other threads are “locked” out and wait for the first thread to finish before they get their turn.
A thread may read the value of a variable or change its value. If two or more concurrent threads act on a shared variable, there is a possibility that the actions on the variable will produce timing-dependent results. This dependence on timing is inherent in concurrent programming, producing one of the few places in the language where the result of executing a program is not determined solely by this specification.
Each thread has a working memory, in which it may keep copies of the values of variables from the main memory that is shared between all threads. To access a shared variable, a thread usually first obtains a lock and flushes its working memory. This guarantees that shared values will thereafter be loaded from the shared main memory to the threads working memory. When a thread unlocks a lock it guarantees the values it holds in its working memory will be written back to the main memory.
Reference
Java thread class documentation
Examples
Chord
Chord is a static race detection tool for Java to find serious and previously unknown concurrency bugs in programs.
Before: Increasing clock speeds ⇒ even sequential programs ran faster
Now: Increasing #CPU cores ⇒ only concurrent programs will run faster
Race Conditions
Data race occurs when same location is accessed simultaneously without common lock held by more than one thread where at least one thread is writing.
- Triggered non-deterministically
- No fail-stop behavior even in safe languages like Java
- Lies at heart of many concurrency problems: atomicity checking, deadlock detection, …
Race Detection Approach in Chord
- aliasing pairs → same location accessed
- shared pairs → same location accessed by different threads
- parallel pairs → same location accessed by different threads simultaneously
- unlocked pairs → same location accessed by different threads simultaneously without common lock held
Example
Field f is race-free if:
- e1 and e2 never refer to the same value → May Alias Analysis
- l1 and l2 always refer to the same value → Must Alias Analysis
- Whenever l1 and l2 refer to different values, e1 and e2 also refer to different values → Conditional Must Not Alias Analysis
Architecture of Chord
Analysis Steps
- Original pairs: Access pairs with at least one write to same field
- Reachable pairs: Both are reachable from some thread
- Aliasing pairs: Both can access the same memory location (k-object sensitive alias analysis + bddbddb)
- Escaping Pairs: The memory location is thread-shared (thread escape analysis)
- Unlocked Pairs: Pairs where the memory location is not guarded by a common lock in both accesses (approximation of may-alias analysis, which is unsound)
- Annotations: Permit to rule out false alarms
Unsoundness Summary
- Lacking must-alias analysis, the last stage of the algorithm uses may-alias analysis to check whether a pair of accesses is guarded by a common lock, which is unsound.
- Open programs are not analyzed soundly: missing callee methods are treated as no-ops while missing caller methods are modeled by an automatically synthesized harness that simulates many but not all usage scenarios.
- Not checking races on accesses in class initializers, constructors, and finalizers which typically lack synchronization and seldom contain races but cause many false alarms without a method-escape analysis.
- Ignoring the effects of reflection and dynamic class loading.
Extentions presented in popl07 paper
- Conditional must not aliasing, a new aliasing property and analysis. → Distinct blocks should be protected by distinct locks.
- Disjoint reachability, a new lightweight shape property and analysis that can be used to compute conditional must not aliasing facts.
- A new race detection algorithm based on conditional must not aliasing that is sound(no annotations) and effective in practice.
Tools used:
Goldilocks
Introduction
Goldilocks is a lockset-based algorithm for precisely computing the happens-before relation and detecting data-races at runtime(dynamic race detection).
Dynamic race detection
1. Vector clocks * precisely compute the happens-before relation * have significantly more overhead 2. Locksets * imprecise -> may generate false race warnings * efficient
- Goldilocks:
- precise
- efficient (purely lockset based)
- Happens-before relation: Data race occurs between two accesses to a shared variable if they are not ordered by the happens-before relation.
- A concurrent execution δ is represented by a finite sequence
- ti is a thread
- si is a program state
- αi is one of the following actions: acq(), rel (), read(), write(), fork (), join(), and alloc()
The happens-before relation hb for δ is the smallest transitively-closed relation on the set {1,2,…,n} such that for any k and l, we have k hb l if 1 < = k < = l < = n and one of the following holds:
- tk = tl
- αk = rel(o) and αl = acq(o)
- αk = write(o) and αl = read(o)
- αk = fork(tl)
- αl = join(tk)
- Goldilocks uses the happens-before relation to define data-race free executions.
- The execution δ is race-free on data variable (o; d) if for all k; l such that αk, αl are elements of {read(o; d); write(o; d)}, we have k hb l or l hb k.
Extended version of Goldilocks
- Described in Goldilocks: a race and transaction-aware java runtime
- Does Static Analysis before Dynamic Analysis → much faster
- Throws a DataRaceException when a data race is about to occur
- Applicable to both locks and transactions