Casting int to a byte -- retrieving the lowest 8 bits
Yo, thought I'd try to open up some debate on this topic. I wrote a data feed that needed to grab the lower 8 bits out of an int, so I wrote what I believed to be the standard way of doing this, which was effectively
public static byte intToByte(int c) {
    return (byte) (c & 0xff);
}
A colleague looking through my code asked whether the & 0xff was necessary, and whether I believed there was any value of c for which it made a difference, as he believed that this was equivalent to just
return (byte)c;
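My immediate worry was that byte in Java is signed and the data might get screwed up, so I ran some tests, and for every value I tried (byte)c is indeed equivalent to (byte)(c & 0xff). That fits with how the cast works: narrowing an int to a byte simply discards everything except the low 8 bits, so the mask has nothing left to remove.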
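For anyone who wants to repeat that test rather than take it on trust, here is a minimal sketch of the comparison. The class and method names (LowByteCheck, masked, unmasked, countMismatches) are made up for illustration and are not from the original code. It counts the values in a range where the two forms disagree.

```java
// Sketch only: compares the masked and unmasked casts over a range of ints.
final class LowByteCheck {

    // The form from the original post.
    static byte masked(int c) {
        return (byte) (c & 0xff);
    }

    // The form the colleague suggested.
    static byte unmasked(int c) {
        return (byte) c;
    }

    // Returns how many values in [from, to] give different results.
    static long countMismatches(int from, int to) {
        long mismatches = 0;
        // A long counter is used so the loop still terminates when to == Integer.MAX_VALUE.
        for (long i = from; i <= to; i++) {
            int c = (int) i;
            if (masked(c) != unmasked(c)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Covers every 32-bit value, so expect this to take a while.
        System.out.println(countMismatches(Integer.MIN_VALUE, Integer.MAX_VALUE));
    }
}
```

Running main walks the whole int range; a narrower range such as countMismatches(-1000000, 1000000) is enough for a quick sanity check.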
My argument was going to be that (byte)(c & 0xff) is great code to read as a maintainer, because it's immediately obvious that you are strictly interested in the lowest 8 bits of the int and nothing else is of importance, whereas a simple (byte)c can look naive and make every developer seeing the code for the first time think it's incorrect.
However, I knew his comeback would be that the data feed has an overriding need for speed, so I ran some tests comparing the repeated operation of (byte)c to (byte)(c & 0xff) over a range of 100,000 numbers (test repeated several times to factor out startup times). It turned out that doing the & 0xff added about 30% to execution time on my machine (Java 1.5 on WinXP). That's quite a severe penalty for a very common operation! I think I'm going to change the code to cast straight to a byte and leave a big comment beforehand explaining how it's equivalent to (byte)(c & 0xff).
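A rough sketch of that kind of timing test is below, for anyone who wants to try it on their own JVM. It is not the harness used for the 30% figure; the class name CastTiming, the method names and the round count are assumptions picked for illustration. Hand-rolled loops like this are very sensitive to JIT behaviour, so treat the numbers as a hint at best. A proper benchmark harness such as JMH should give more trustworthy results.

```java
// Sketch only: a naive timing comparison of the two casts.
final class CastTiming {

    // Sums the low bytes of 0..n-1 using the masked cast.
    // The sum is returned so the JIT cannot discard the loop as dead code.
    static long sumMasked(int n) {
        long sum = 0;
        for (int c = 0; c < n; c++) {
            sum += (byte) (c & 0xff);
        }
        return sum;
    }

    // Same loop with the plain cast.
    static long sumUnmasked(int n) {
        long sum = 0;
        for (int c = 0; c < n; c++) {
            sum += (byte) c;
        }
        return sum;
    }

    public static void main(String[] args) {
        final int n = 100000;     // range mentioned in the post
        final int rounds = 2000;  // arbitrary repeat count (assumption)
        long sink = 0;

        // Warm-up pass so both methods are compiled before timing starts.
        for (int r = 0; r < rounds; r++) {
            sink += sumMasked(n) + sumUnmasked(n);
        }

        long t0 = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            sink += sumMasked(n);
        }
        long t1 = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            sink += sumUnmasked(n);
        }
        long t2 = System.nanoTime();

        System.out.println("masked   ms: " + (t1 - t0) / 1000000);
        System.out.println("unmasked ms: " + (t2 - t1) / 1000000);
        System.out.println("checksum: " + sink); // keeps the results live
    }
}
```

Swapping the order of the two timed loops and rerunning is a cheap way to check whether any difference is real or just an artefact of the measurement.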
This got me wondering how it was implemented in the core Java libraries though, since OutputStream has a method to write a byte that actually takes an int parameter. How does this work? Most of the lowest-level OutputStream implementations seem to end up going native to do this (understandably), so I dug out ByteArrayOutputStream. This class does optimise away the & 0xFF and is roughly
public synchronized void write(int b) {
    buf[count] = (byte) b;
}
No problems with that, so writing to these babies will be fast. But then I started wondering about the methods of DataOutputStream, which is heavily used by us for serialising (a great deal of) internal data flow. Unfortunately, in this class there are a lot of redundant & 0xFFs:
public final void writeShort(int v) throws IOException {
    out.write((v >>> 8) & 0xFF);
    out.write((v >>> 0) & 0xFF);
    incCount(2);
}
public final void writeInt(int v) throws IOException {
    out.write((v >>> 24) & 0xFF);
    out.write((v >>> 16) & 0xFF);
    out.write((v >>> 8) & 0xFF);
    out.write((v >>> 0) & 0xFF);
    incCount(4);
}
[The v >>> 0 seems to be optimised out at runtime: I get no execution time difference between (v >>> 0) & 0xff and (v & 0xff), so I've got no problem with that.]
These again seem OK on inspection, because the code looks tidy, clean and easy to understand, but I need to hit these methods very heavily, so I would rather they were 30% faster than easy to read. Interestingly, they've taken an entirely different approach for writing out a long value:
public final void writeLong(long v) throws IOException {
    writeBuffer[0] = (byte) (v >>> 56);
    writeBuffer[1] = (byte) (v >>> 48);
    writeBuffer[2] = (byte) (v >>> 40);
    writeBuffer[3] = (byte) (v >>> 32);
    writeBuffer[4] = (byte) (v >>> 24);
    writeBuffer[5] = (byte) (v >>> 16);
    writeBuffer[6] = (byte) (v >>> 8);
    writeBuffer[7] = (byte) (v >>> 0);
    out.write(writeBuffer, 0, 8);
    incCount(8);
}
This one both uses a private buffer field for the writing before squirting it all out in a single call, and doesn't bother to mask the lower 8 bits. It seems strange that writeLong appears optimised, but writeInt and writeShort are not.
What does everyone else think? Are there any other heavy users of DataOutputStream out there who would rather have things written faster? I guess I'm going to be writing my own version of DataOutputStream in the meantime, because we're writing so much data over these and I'm in an industry where milliseconds matter.
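As a starting point for that home-grown version, here is a minimal sketch that applies the writeLong approach (private buffer, plain casts, one bulk write) to writeShort and writeInt. Everything about it is an assumption for illustration: the class name UnmaskedDataWriter is hypothetical, it keeps no written-byte count, and it is not thread-safe because the buffer is shared between calls. Whether it is actually faster would need measuring. The matchesJdk helper just checks that the bytes come out identical to what DataOutputStream produces.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical class, not part of the JDK. Not thread-safe.
final class UnmaskedDataWriter {
    private final OutputStream out;
    private final byte[] buf = new byte[4]; // scratch buffer reused by every call

    UnmaskedDataWriter(OutputStream out) {
        this.out = out;
    }

    // Big-endian short: the casts keep only the low 8 bits, so no mask is needed.
    void writeShort(int v) throws IOException {
        buf[0] = (byte) (v >>> 8);
        buf[1] = (byte) v;
        out.write(buf, 0, 2);
    }

    // Big-endian int, written with a single bulk write.
    void writeInt(int v) throws IOException {
        buf[0] = (byte) (v >>> 24);
        buf[1] = (byte) (v >>> 16);
        buf[2] = (byte) (v >>> 8);
        buf[3] = (byte) v;
        out.write(buf, 0, 4);
    }

    // Encodes v with this class and with DataOutputStream, then compares the bytes.
    static boolean matchesJdk(int v) {
        try {
            ByteArrayOutputStream mine = new ByteArrayOutputStream();
            UnmaskedDataWriter w = new UnmaskedDataWriter(mine);
            w.writeShort(v);
            w.writeInt(v);

            ByteArrayOutputStream ref = new ByteArrayOutputStream();
            DataOutputStream d = new DataOutputStream(ref);
            d.writeShort(v);
            d.writeInt(v);
            d.flush();

            return Arrays.equals(mine.toByteArray(), ref.toByteArray());
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked if they do.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matchesJdk(0x12345678));
    }
}
```

Checking against DataOutputStream like this keeps the wire format compatible, so a DataInputStream on the other end should read the values back unchanged.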

