Is "=" an atomic operation?
I have a multithreaded app, where an object is shared between two threads.It has a single data member, state, which has a simple getter and a simple setter:
publicclass Foo{
private FooState myState;
synchronizedpublic FooState getState(){
return myState;
}
synchronizedpublicvoid setState(FooState aState){
myState = aState;
}
}
My question is, do I really *need* to make getState() and setState() synchronized in this case? I don't see it adding any value here, unless somehow the "=" operation in the setter could be interrupted midstream, leaving myState in some kind of bizzare state which could fowl up an interim call to getState() by another thread.
So to summarize, do you know if "=" can be considered atomic? That is, can I assume that at any given time myState will either equals it's old value, or it's new value, but nothing "in between"?
I know I know... better safe than sorry. I'm asking more out of curiosity than concern for performance.
I'm not a big thread guru, but it looks to me like those methods wouldn't require synchronization; they are indeed atomic.
I don't think you have to synchronize a read operation, so you don't need the keyword on the get.
As far as the set goes, I believe assignment of a 32-bit reference is atomic. The following is from Allan Holub http://www.javaworld.com/javaworld/jw-09-1998/jw-09-threads_p.html
The first OS-level concept that's important to understand is atomicity. An atomic operation cannot be interrupted by another thread. Java does define at least a few atomic operations. In particular, assignment to variables of any type except long or double is atomic. You don't have to worry about a thread preempting a method in the middle of the assignment. In practice, this means that you never have to synchronize a method that does nothing but return the value of (or assign a value to) a boolean or int instance variable. Similarly, a method that did a lot of computation using only local variables and arguments, and which assigned the results of that computation to an instance variable as the last thing it did, would not have to be synchronized.
MOD
Assignment is atomic for all types except long and double. However without proper synchronization you may run into two problems.
One is that thread A calling getState() might never see the result of thread B calling setState(). This can be mediated to an extent by declaring myState volatile.
The other is that thread A might get a reference to a FooState set by thread B before the object is fully initialised. For all the scary scenarios see this:
http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html
Synchronization is generally cheap. It is also generally necessary.
Assignment is atomic for values <= 32 bits, in other words, for everything other than float and double. You don't need to synchronize those methods.
However, if you have a method that does both a read and a write, then you must synchronize. public void set(int val) { // sync me if add the inc method
val_ = val;
}
public int get() {
return val_;
}
// If I add the following method, then this method and set BOTH must be synced
// get is still safe as unsynced.
public void inc() {
val_++:
}
If val_ were a long or double, then get and set would need to be synced, even if inc were not present.
In theory, if there were just get and set of a long or double (no inc) then declaring that double/long as volatile would alleviate the need for syncing. However, due to some apparent looseness in the JMM and/or improper implementation of volatile by some VMs, you're apparently not supposed to rely on volatile to do all that it's supposed to. Look at the Bill Pugh volatile stuff in the middle of [url=http://www.cs.umd.edu/~pugh/java/memoryModel/]this page[/url]
jverd at 2007-7-1 12:49:35 >

The myState variable is atomic. It's only 64 bits variables like double and long that can be read in the middle of a write (you solve it by declaring them volatile). To make absolutely certain you could declare myState volitile but it shouldn't be necessary.
> this means that you never
> have to synchronize a method that does nothing but
> return the value of (or assign a value to) a boolean
> or int instance variable.
Not true. A set could interrupt, say, an increment method, s.t. inc reads 5, set writes 11, inc writes 6. What you wanted was either it was set to 11, then got inc'ed to 12, or it was 5 got inc'ed to 6, then got set to 11.
legosa: good point about threads potentially never seeing each others' writes if you don't sync.
jverd at 2007-7-1 12:49:35 >

Well, I didn't know about those problems with volatile, but it shouldn't matter because a reference is atomic and a process cannot read a "scrambled" value.
> Well, I didn't know about those problems with
> volatile, but it shouldn't matter because a reference
> is atomic and a process cannot read a "scrambled"
> value.
For the OP's question, you don't need volatile for atomicity, true. However, another use of volatile is to force threads changes' to be immediately visible to each other. I haven't read enough of that JMM stuff yet to know if that aspect of volatile is broken or if it's the atomicity part of it, or both.
jverd at 2007-7-1 12:49:35 >

This code may never fail, but it is also not safe. The = is atomic in that it should never retrieve a pointer to an object that it was never assigned, it will always be a 'valid' pointer, but at the same time you shouldn't write code like that.
But even just syncing during the get and set may fail. Because it's apparent that you're not syncing on the shared object in this code you've given us.
The object that is shared in this case is not Foo, but rather FooState. So you should write your code like this.
//in a class where you want to retrieve and use FooState
Foo f ;
synchronized ( FooInstance ) {
f = FooInstance.getState();
}
if ( f != null ) {
synchronized ( f ) {
// use f
}
} else {
//foostate was null, handle errors
}
}
This assumes that FooState is mutable, if FooState is immutable, then you don't need to sync on it.
What you have to watch out for is that all state objects are deep clones. Two state objects mustn't share a common reference to some third (mutable) object. This is to ensure that the state really is a "snapshot" of the situation at one specific point in time.
> This code may never fail, but it is also not safe. The = is atomic in that it should never retrieve a
> pointer to an object that it was never assigned, it will always be a 'valid' pointer
As I mentioned above, you could retrieve a valid pointer to an invalid (uninitialized) object. This is due to possible reordering of memory writes.
> Because it's apparent that you're not syncing on the shared object in this code you've given us. The
> object that is shared in this case is not Foo, but rather FooState.
Surely if Foo weren't shared between two threads they couldn't both call methods on it? :-)
> > This code may never fail, but it is also not safe.
> The = is atomic in that it should never retrieve a
> > pointer to an object that it was never assigned, it
> will always be a 'valid' pointer
>
> As I mentioned above, you could retrieve a valid
> pointer to an invalid (uninitialized) object. This is
> due to possible reordering of memory writes.
>
True, the pointer is valid (meaning that the address the pointer resolves to will contain the object), the object however is 'not yet' :) and fully capable of crashing the VM. Aggressive inlining and memory speculation are yet another reason that we need to sync all the time.
> This code may never fail, but it is also not safe.
> The = is atomic in that it should never retrieve a
> pointer to an object that it was never assigned, it
> will always be a 'valid' pointer, but at the same
> time you shouldn't write code like that.
>
I remember arguing in a forum for a couple 100 posts about this. These two believed weak memory model architectures coudl allow a invalid pointer to be returned. I gave up.
indeed = is atomic wirh respect to references, and references are guaranteed to be valid by the JLS. Still one should not do this because without the synch statement a thread has no requirement to reload the object and may hold onto his cached value.
Also steveftoth, you are incorrect wrt/ the object you are synching on. It makes no difference which lock you use, as long as both access points use the same lock. So either object is fine. However, best practice would create an object exclusively for the lock. Your code is a step above his code. I guess what matters is that you have a system and stick with it. Using lock objects is the only way to preserve encapsulation.
For this situation however, volatile is the best answer. That is if volatile is working these days.
public class Foo {
private volatile FooState myState;
public FooState getState() {
return myState;
}
public void setState(FooState aState) {
myState = aState;
}
}
legosa, I think you post summed it up and cought some points i missed. So I recind my post.
steveftoth, I think I understand how you are addressing this issue, and I think you too are correct.
Wow, lots of great comments, thanks to all!
So, what I've gotton from this is that, while in principle I'm correct that the assignment in setState() is atomic, and cannot be interrupted by another thread's call to getState(), the fact that I did not declare myState as volatile means that a reader thread A making a call to getState() may never see the results of a prior call to setState() by thread B. It also sounds like volatile isn't always worthy of our trust, while synchronized is.
Just to clarify - instances of the FooState are in fact immutable.