• The authors mention CSP as a kind of idiom that makes their lives harder. In fact, programs built on CSP-style primitives should be easier to analyze, since they rely less on shared mutable state. I wonder how much easier it would be to produce a race detector for Erlang or other languages that use this style.
  • Near the end of the paper, there are laundry lists of programming idioms that lead to false positives. There's no unifying principle behind these, and the authors' approach seems to be to add heuristics until they get acceptable results. It sure would be nice to do this in a more principled way, either through programmer annotations or some kind of static analysis.
  • One unfortunate consequence of the evaluation section is the lack of an optimal baseline to compare against. It would be nice to know how good the tool's coverage is. Also, because the approach is dynamic, it has no way of exercising atypical execution paths, and race conditions are typically a problem precisely in atypical situations.
  • Also, the results section is not that promising in terms of what was gained by the more complex analysis. Except in the case of resin, the detailed analysis reduces the number of reported races by only a handful (fewer than 10) compared with the simple analysis.
  • The second method, which is incredibly expensive, computes Lamport's happens-before relation for the program and looks for potential races. I get the feeling that, in database terms, this is similar to confirming that the set of accesses the program makes forms a serializable schedule, although the treatment of data structures (which are analogous to indexes) seems pretty ad hoc. Lock acquisitions and releases are used in lieu of transaction boundaries.
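    As a rough illustration of that machinery (my own sketch, not the authors' implementation): a happens-before detector can keep one vector clock per thread and per lock, with each release/acquire pair creating the ordering edge mentioned above. Two threads, one tracked variable, and a write-write-only check are simplifications for brevity.

        import java.util.Arrays;

        class VectorClockHB {
            static final int THREADS = 2;

            int[][] threadClock = new int[THREADS][THREADS];  // one vector clock per thread
            int[][] lockClock;                                // one vector clock per lock
            int[] lastWriteClock = null;                      // clock of the last write to one shared variable
            int lastWriter = -1;

            VectorClockHB(int numLocks) { lockClock = new int[numLocks][THREADS]; }

            // Release: the lock's clock absorbs the releasing thread's clock.
            void release(int t, int lock) {
                merge(lockClock[lock], threadClock[t]);
                threadClock[t][t]++;
            }

            // Acquire: the thread's clock absorbs the lock's clock, so everything
            // before the matching release now happens-before this thread's accesses.
            void acquire(int t, int lock) { merge(threadClock[t], lockClock[lock]); }

            // A write races with the previous write unless that write happens-before it.
            boolean write(int t) {
                boolean race = lastWriteClock != null && lastWriter != t
                        && !leq(lastWriteClock, threadClock[t]);
                lastWriteClock = Arrays.copyOf(threadClock[t], THREADS);
                lastWriteClock[t]++;        // timestamp the write event itself
                threadClock[t][t]++;
                lastWriter = t;
                return race;
            }

            static void merge(int[] dst, int[] src) {
                for (int i = 0; i < dst.length; i++) dst[i] = Math.max(dst[i], src[i]);
            }

            static boolean leq(int[] a, int[] b) {
                for (int i = 0; i < a.length; i++) if (a[i] > b[i]) return false;
                return true;
            }

            public static void main(String[] args) {
                VectorClockHB hb = new VectorClockHB(1);
                hb.acquire(0, 0); hb.write(0); hb.release(0, 0);  // thread 0 writes under the lock
                hb.acquire(1, 0);
                System.out.println("race? " + hb.write(1));       // false: ordered via the lock
                hb.release(1, 0);
                System.out.println("race? " + hb.write(0));       // true: no edge back to thread 0
            }
        }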
  • The interesting part of this paper, to me, is their approach that relies on the invariant that the races reported by the happens-before analysis are a subset of those reported by the lockset analysis. Therefore, they can avoid the expensive happens-before analysis in many cases where the cheaper lockset algorithm can show there is no data race; a sketch of such a lockset filter follows below.
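    To make the filtering idea concrete, here is a minimal Eraser-style lockset check of my own (illustrative, not the paper's implementation): as long as some lock has been held on every access to a variable, nothing more needs to be done; once the intersection becomes empty, the expensive happens-before analysis would take over.

        import java.util.HashSet;
        import java.util.Set;

        class LocksetFilter {
            private Set<String> candidateLocks = null;  // locks that have guarded every access so far

            // Called on each access to one shared variable; locksHeld is the set of
            // locks held by the accessing thread. Returns true while the variable is
            // still consistently protected by at least one common lock.
            boolean access(Set<String> locksHeld) {
                if (candidateLocks == null) {
                    candidateLocks = new HashSet<>(locksHeld);
                } else {
                    candidateLocks.retainAll(locksHeld);  // intersect with currently held locks
                }
                return !candidateLocks.isEmpty();         // empty set: escalate to happens-before
            }

            public static void main(String[] args) {
                LocksetFilter x = new LocksetFilter();
                System.out.println(x.access(Set.of("m")));       // true: guarded by m
                System.out.println(x.access(Set.of("m", "n")));  // true: m is still common
                System.out.println(x.access(Set.of("n")));       // false: no common lock
            }
        }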

 

  • The authors suggest that static analysis could be used to optimize away some of the instrumentation. Static methods could also be used to improve the coverage of the detection system, by "exercising" paths that would not normally be executed.

 

  • The biggest weakness I see in this paper is that it relies on heuristics to verify locking models. This is a consequence of the dynamic approach: without knowledge of the program's source code, the algorithm must make do with incomplete information. However, the dynamic approach is also the key to the algorithm's scalability, and it seems likely that the algorithm would have enough information to catch most bugs if used in conjunction with an appropriate testing environment.
  • I found the evaluation inconclusive. While the staged (lockset-then-hybrid) approach appears to have tractable overheads, I couldn't figure out how many fewer false positives it had than a straightforward lockset approach! (That is basically the whole point of this paper.) For instance, the "Simple" column in Table 3 should be annotated with classifications for comparison, but it is not.
  • Unfortunately, the major flaw in the paper was that they assumed that unsynchronized memory writes become visible in causal order. This is most emphatically NOT true on almost all high-performance SMP machines with more than 4 CPUs, and in fact this exact problem drove the rearchitecting of the Java Memory Model [1].
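    For reference, the standard litmus test behind this objection (my example, not the paper's) is below: if the unsynchronized stores and loads are reordered by the hardware or the JIT, both r1 and r2 can end up 0, an outcome that is impossible under any interleaving that preserves program order. The loop may or may not expose the reordering on a given machine.

        class Reordering {
            static int x, y, r1, r2;  // shared, deliberately not volatile and not locked

            public static void main(String[] args) throws InterruptedException {
                for (int i = 0; i < 100_000; i++) {
                    x = 0; y = 0;
                    Thread t1 = new Thread(() -> { x = 1; r1 = y; });
                    Thread t2 = new Thread(() -> { y = 1; r2 = x; });
                    t1.start(); t2.start();
                    t1.join(); t2.join();
                    if (r1 == 0 && r2 == 0) {  // impossible if writes became visible in program order
                        System.out.println("reordering observed at iteration " + i);
                        return;
                    }
                }
                System.out.println("no reordering observed (it is allowed, not guaranteed)");
            }
        }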
  • Extending the instrumentation to the VM level, as opposed to the current bytecode level, could allow detection of races in native code or in accesses made through reflection, and could also improve performance.
  • It seems to me that (in section 2.1) the number of executions to be analyzed in the lockset-based approach could be exponentially large, because the algorithm is basically trying all possible interleavings of the existing threads. As such, I don't think a "fully dynamic" analysis is practical. Perhaps we should combine static analysis (as much as we can) with dynamic analysis? There are many situations where race conditions can be statically detected by analyzing the structure of the program, as in the small example below. It would be more efficient to do a static analysis and run the dynamic test only if no error is found in the static phase.
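    For instance (my own toy example, not from the paper), the following race is apparent from the program's structure alone: a shared field is updated on two threads with no lock held on any path, so a static checker could flag it without exploring interleavings.

        class ObviousRace {
            static int counter = 0;  // shared and never guarded by any lock

            public static void main(String[] args) throws InterruptedException {
                Runnable bump = () -> { for (int i = 0; i < 1_000_000; i++) counter++; };
                Thread a = new Thread(bump), b = new Thread(bump);
                a.start(); b.start();
                a.join(); b.join();
                System.out.println(counter);  // usually well below 2,000,000 due to lost updates
            }
        }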
  • It is also not clear whether a more draconian, but sound, flow- and context-sensitive static approach would produce more false positives than the approach described here. It could simply check which locks are held at each line of the source code. Such an approach would probably handle fewer exotic synchronization schemes unless it were extended to handle primitives that do not fit the lock/unlock pattern well.