bugs in DRDT tutorial

Section 6.2.1 of the tutorial says:

18     if (!pflag[i])
19         continue;
20     if (v % i == 0) {
21         pflag[v] = 0;
22         return 0;
23     }

The Data-Race Detection Tool reports that there is a data-race between the Write to pflag[] on line 21 and the Read of pflag[] on line 18. However, this data-race is benign as it does not affect the correctness of the final result. At line 18, a thread checks whether pflag[i], for a given value of i is equal to 0. If pflag[i] is equal to 0, then the thread continues on to the next value of i. If pflag[] is not equal to 0 and v is divisible by i, then the thread writes the value 0 to pflag[i]. It does not matter if, from a correctness point of view, multiple threads check the same pflag[i] and write to it concurrently, since the only value that is written to pflag[i] is 0.

Looking closely at the text, the reference to pflag[] should be pflag[i], and the last three references to pflag[i] should really be to pflag[v]. No, wait, that's not right, either. pflag[i] is not both checked and written. pflag[i] is checked, and pflag[v] is written. The paragraph itself needs to be re-written (perhaps it was edited concurrently by multiple authors? :-) ).
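
To make the read/write distinction concrete, here is a minimal sketch of the access pattern. It is a reconstruction, not the tutorial's actual source: the names MAX_N, init_flags() and is_prime(), the array size, and the loop bound are assumptions for illustration. Only the six statements in the loop body mirror the quoted lines 18-23.

```c
#define MAX_N 100            /* hypothetical bound, not from the tutorial */

/* pflag[n] != 0 means "n has not (yet) been found to be composite". */
static int pflag[MAX_N];

/* Hypothetical helper: mark every number as a prime candidate. */
static void init_flags(void)
{
    for (int n = 0; n < MAX_N; n++)
        pflag[n] = 1;
}

/* Hypothetical wrapper around the quoted lines 18-23. */
static int is_prime(int v)
{
    for (int i = 2; i * i <= v; i++) {
        if (!pflag[i])       /* READ of pflag[i]: skip known composites */
            continue;
        if (v % i == 0) {
            pflag[v] = 0;    /* WRITE of pflag[v], where v > i */
            return 0;
        }
    }
    return v > 1;
}
```

Under these assumptions, the race is between one thread writing pflag[v] and another thread reading pflag[i] with its own i equal to the first thread's v. No single thread both checks and writes the same element. A stale nonzero read only costs a redundant divisibility test, which is presumably what the tutorial paragraph meant to say.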

Also, Section 6.2 says there are two examples below, when in fact there are three.

Also, Section 6.2.2 says:

 20 volatile int is_bad = 0;

106 int i;
107 for (i=my_start(thread_id); i<my_start(thread_id); i++) {
108     if (is_bad)
109         return;
110     else {
111         if (is_bad_element(data_array[i])) {
112             is_bad = 1;
113             return;
114         }
115     }
116 }

There is a data-race between the Read of is_bad on line 108 and the Write of is_bad on line 112. However, the data-race does not affect the correctness of the final result.

But no, that's not really why there's no bad data race. The real reason is that the loop condition never lets the body of the loop execute, assuming (as one would expect) that my_start() always returns the same result given the same parameter. Also, once that is fixed, shouldn't you mention that the apparently benign character of this race depends on the two values (0 and 1) differing in only one bit, so that one need not worry about the atomicity of the write? That is, if you had a 32-bit integer, and the initial value had the low bit set and the final value had the high bit set (with the value test adjusted correspondingly), and if the machine architecture allowed a 32-bit integer to be written at the hardware level in two 16-bit chunks, you'd still have to worry about a race condition making invalid values appear in the shared is_bad variable.
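
A minimal sketch of what a corrected loop might look like, assuming the intended upper bound is a my_end(thread_id) counterpart to my_start(). Everything other than the loop structure is hypothetical: the array contents, the chunking in my_start()/my_end(), the is_bad_element() test, and the function name check_chunk(). The sketch also swaps the volatile int for a C11 atomic_int, which is not what the tutorial does, so that the load and store are atomic regardless of which values are written.

```c
#include <stdatomic.h>

#define N_ELEMS   8          /* hypothetical sizes, not from the tutorial */
#define N_THREADS 2

static int data_array[N_ELEMS] = { 1, 2, 3, 4, 5, -1, 7, 8 };

/* C11 atomic in place of the tutorial's "volatile int is_bad". */
static atomic_int is_bad = 0;

/* Hypothetical chunking: each thread owns a contiguous slice. */
static int my_start(int thread_id) { return thread_id * (N_ELEMS / N_THREADS); }
static int my_end(int thread_id)   { return (thread_id + 1) * (N_ELEMS / N_THREADS); }

/* Hypothetical predicate: negative values count as bad. */
static int is_bad_element(int x)   { return x < 0; }

static void check_chunk(int thread_id)
{
    /* Upper bound is my_end(), so the body can actually run. */
    for (int i = my_start(thread_id); i < my_end(thread_id); i++) {
        if (atomic_load(&is_bad))         /* another thread already found one */
            return;
        if (is_bad_element(data_array[i])) {
            atomic_store(&is_bad, 1);     /* atomic write, no torn values */
            return;
        }
    }
}
```

With the atomic type, the early-exit check is still only an optimization, but the correctness argument no longer depends on how many bits differ between the two values.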

Also, I find it rather astounding that the third example is of double-checked locking, especially without any mention of the history of and problems with this idea (http://en.wikipedia.org/wiki/Double-checked_locking and http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf).
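
For comparison, here is a sketch of one form of double-checked locking that is generally considered safe, using C11 acquire/release atomics around the shared pointer. It is not the tutorial's code: config_t, instance, init_lock and get_instance() are hypothetical names, and the value 42 is a placeholder. The classic broken version uses a plain pointer for the unlocked first check, so another thread may observe the pointer before the object's fields are visible.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct { int value; } config_t;      /* hypothetical payload */

static _Atomic(config_t *) instance = NULL;
static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;

static config_t *get_instance(void)
{
    /* First check, without the lock. Acquire pairs with the release
       store below, so a non-NULL pointer implies initialized fields. */
    config_t *p = atomic_load_explicit(&instance, memory_order_acquire);
    if (p == NULL) {
        pthread_mutex_lock(&init_lock);
        /* Second check, under the lock: another thread may have won. */
        p = atomic_load_explicit(&instance, memory_order_relaxed);
        if (p == NULL) {
            p = malloc(sizeof *p);
            if (p != NULL) {
                p->value = 42;               /* initialize before publishing */
                atomic_store_explicit(&instance, p, memory_order_release);
            }
        }
        pthread_mutex_unlock(&init_lock);
    }
    return p;
}
```

In C, pthread_once() is usually the simpler choice for one-time initialization. The sketch is only meant to show how much care the pattern needs, which is the kind of caveat the tutorial could mention.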

Finally, in the usage flow diagram in Section 5.4, under "L1: Perform a data-race detection experiment:" you should mention the use of processor sets: http://developers.sun.com/solaris/articles/solaris_processor.html

[3265 byte] By [herteg] at [2007-11-26 10:12:03]
# 1
Herteg,

Thank you very much for your detailed review and good suggestions. The upper bound of the loop in Section 6.2.2 should be my_end(thread_id). We will update the document.

Thanks!
-- Yuan
yuan at 2007-7-7 1:59:40