advanced bit masking across bytes
Hi,
Does an algorithm such as this exist in Java?
I have a byte array that has been populated after reading in a binary file. I would like to at a different specific points in the loaded byte array grab sets of bits.
The algorithm would be something like this
Starting at element x in the byte array , skip 3 bits and grab the next 10 bits. Return those 10 bits to the caller.
Doing so will require me to grab bits from byte[x+1] so I would think the steps would be something like this:
byte tempByte1;
byte tempByte2;
int result;
//
//Since a byte has 8 bits, and i want to skip the first 3 then:
//
int bitsRemaining = 10;
tempByte1 =byte[x] & (Math.pow(2,(8-3))-1);// Keep last 5 bits
bitsRemaining -= 5;
//
// at this point i have 5 bits leftover (left justified) to grab from the next element
//
tempByte2 =byte[x+1] >> (8-bitsRemaining );
//
// 8 bits / byte minus # bits remaining (special case if more than 8 bits remaining)
//
tempByte2 &= (Math.pow(2,(bitsRemaining))-1);
result = ( tempByte1 << 5) | (tempByte2);
// result is now a 10 bit number
Does Java have such an algorithm already built in?
if not, I'd like to generalize this so I can just pass the start offset in the byte array, the sub bit starting offset, and the number of bits I want.
I do know I will never be grabbing more than 32 bits at a time.
[2076 byte] By [
plumonea] at [2007-9-29 8:04:55]

public static void getBits(Byte[] input,int startBit, int numBits)
{
int startByte = startBit>>3;
int startByteOffset = startBit-(startByte<<3);
int endByte = (startBit+numBits)>>3;
int endByteOffset = (startBit+numBits)-(endByte<<3);
boolean[] output = new boolean[numBits];
// handle first byte
byteToBoolArray(output,0,input[startByte],startBit,7)
// iterate over remaining whole bytes
int outOffset = 7-startBit;
for(int i=startByte+1;i<endByte;i++)
{
byteToBoolArray(output,outOffset,input,0,7);
outOffset = OutOffset<<3;
}
if(endByte!=startByte)
{
byteToBoolArray(output,outOffset,input[endByte],0,endByteOffset);
}
return output;
}
private static final boolean[] byteToBoolArray(boolean[] output,int offset, byte val,int startBit,int endBit)
{
if((offset>output.length)||(offset<0)||(startBit<0)||(startBit>7)||(endbit<0)||(endBit>7)) return null;
if(startBit>endBit)return null;
int tmp = (val<<startBit);
for(int i=startBit;i<=endBit;i++)
{
output = (val&0xF0);
val=val<<1;
}
return output;
}
[/code]
I've not tested it.
matfud>
hummm. that didn't go so well
public static void getBits(Byte[] input,int startBit, int numBits)
{
int startByte = startBit>>3;
int startByteOffset = startBit-(startByte<<3);
int endByte = (startBit+numBits)>>3;
int endByteOffset = (startBit+numBits)-(endByte<<3);
boolean[] output = new boolean[numBits];
// handle first byte
byteToBoolArray(output,0,input[startByte],startBit,7)
// iterate over remaining whole bytes
int outOffset = 7-startBit;
for(int i=startByte+1;i<endByte;i++)
{
byteToBoolArray(output,outOffset,input[i],0,7);
outOffset = OutOffset<<3;
}
if(endByte!=startByte)
{
byteToBoolArray(output,outOffset,input[endByte],0,endByteOffset);
}
return output;
}
private static final boolean[] byteToBoolArray(boolean[] output,int offset, byte val,int startBit,int endBit)
{
if((offset>output.length)||(offset<0)||(startBit<0)||(startBit>7)||(endbit<0)||(endBit>7)) return null;
if(startBit>endBit)return null;
int tmp = (val<<startBit);
for(int i=startBit;i<=endBit;i++)
{
output[i] = (val&0xF0);
val=val<<1;
}
return output;
}
I've not tested it.
matfud>
that should have beenoutput[i] = (val&0x80);
> that should have been> > output[i] = (val&0x80);> does that mean you tested it?Isn't the definition of boolean jvm dependent?I'll take a look at the code. and see how it works.David
No I haven't tested it I was trying to mask out everything but
the leftmost bit of the byte (and 0xFF most assuridely does not do that)
Theres at least one typo in there as well.
Not sure. I thought that anything other then 0 was classed as true.
It probably wont work without a cast anyway.
so try
output[i] = (((val&0x80)&0xFF)==0)?false:true;
matfud
A fixed and tested version. I started noticing more and more mistakes
in the first few postings. here you go.
import java.util.*;
/**
* @copyright 2003 public domani
* @author Matthew Fudge (matfud)
* @version 1.0 beta
*/
public class BinaryTester
{
/** Creates a new instance of BinaryTester */
public BinaryTester()
{
}
/**
* returns a boolean[] containing <code>numBits</code> from the byte[] <code>input</code>
* starting at bit <code>startBit</code>. If <code>numBits</code> cannot be obtained it returns as many
* bits as possible. returns null on failure.
*/
public static boolean[] getBits(byte[] input,int startBit, int numBits)
{
// general tests
if((startBit<0)||(startBit>=(input.length<<3))||(numBits<=0)) return null;
if((startBit+numBits)>(input.length<<3))numBits = (input.length<<3)-startBit;
int startByte = startBit>>3;
int startByteOffset = startBit-(startByte<<3);
int endByte = ((startBit+numBits-1)>>3);
int endByteOffset = (startBit+numBits)-((endByte)<<3)-1;
boolean[] output = new boolean[numBits];
// handle first byte
if(byteToBoolArray(output,0,input[startByte],startByteOffset,7)==null) return null;
// iterate over remaining whole bytes
int outOffset = 8-startByteOffset; //where to write in the output array
for(int i=startByte+1;i<endByte;i++)
{
if(byteToBoolArray(output,outOffset,input[i],0,7)==null) return null;
outOffset = outOffset+8;
} if(endByte!=startByte)
{
if(byteToBoolArray(output,outOffset,input[endByte],0,endByteOffset)==null) return null;
}
return output;
}
/**
* inserts values from a given byte ><code>val</code> into a boolean array starting at index <code>offset</code>
* starting at bit <code>startBit</code> of the byte and ending at bit <code>endBit</code>. If endBit and start
* bit are the same one bit is written. If the output would have overwritten the end of the array it is truncated.
* if anything else goes wrong it return null without modifying the array.
*/
public static final boolean[] byteToBoolArray(boolean[] output,int offset, byte val,int startBit,int endBit)
{
if((offset>=output.length)||(offset<0)||(startBit<0)||(startBit>7)||(endBit<0)||(endBit>7)) return null;
if(startBit>endBit)return null;
int tmp = (val<<startBit);
int length = offset+endBit-startBit;
length = (length>=output.length-1)?output.length-1:length;
for(int i=offset;i<=length;i++)
{
output[i] = ((tmp&0x80)==0)?false:true;
tmp=tmp<<1;
}
return output;
}
/**
* @param args the command line arguments
*/
public static void main(String[] args)
{
byte[] test = new byte[3];
test[0] = (byte)0xFF; // try replacing these with 0xAA or 0x55 for pretty patterns
test[1] = (byte)0xFF;
test[2] = (byte)0xFF;
boolean[] output = new boolean[10];
// System.out.println(boolArrayToString(byteToBoolArray(output,9,test[0], 0, 0)));
//test failures
System.out.println(boolArrayToString(getBits(test, 0, 0))); // return null
System.out.println(boolArrayToString(getBits(test, 0, -1))); // return null
System.out.println(boolArrayToString(getBits(test, 24, 1))); // return null
System.out.println(boolArrayToString(getBits(test, -1, 1))); // return null
//test other stuff
System.out.println();
System.out.println(boolArrayToString(getBits(test, 8, 8))); // return 8 set bits
System.out.println(boolArrayToString(getBits(test, 0, 16))); // return 16 set bits
System.out.println(boolArrayToString(getBits(test, 1, 16))); // return 16 set bits
System.out.println(boolArrayToString(getBits(test, 8, 16))); // return 16 set bits
System.out.println(boolArrayToString(getBits(test, 8, 17))); // return 16 set bits (exceeded end of array)
}
public static String boolArrayToString(boolean[] input)
{
StringBuffer sb = new StringBuffer();
if(input==null) return "null";
for(int i=0;i<input.length;i++)
{
sb.append(((input[i])?1:0)+" ");
}
return sb.toString();
}
}
matfud>
> output = (((val&0x80)&0xFF)==0)?false:true;Why not ((val&0x80)!=0) ?Pete
You could try this... (I didn't have time to test, but it's pretty simple.)
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
public class GrabBits
{
public static long getBits(byte[] arrData, int iBitOffs, int iBits)
{
int iByteOffs = iBitOffs / 8;
int iShiftBits = iBitOffs % 8;
ByteBuffer buff = ByteBuffer.wrap(arrData, iByteOffs, 1+(iBits/8));
buff.order(ByteOrder.BIG_ENDIAN);
long lBits64 = buff.getLong();
lBits64 <<= iShiftBits;
return lBits64;
}
}
why not? didn't think thats why.matfud
>You could try thisI don't think that quite works. It leaves the resulting number in theleftmost bits of the long. I think it then needs to be shiftedright (without sign extension) which shouldn't be to hard to do.matfud
> You could try this... (I didn't have time to test, but
> it's pretty simple.)
>
> > import java.nio.ByteBuffer;
> import java.nio.ByteOrder;
...
I tried it out and added a last mask to keep only the bits I'm interested in. However it never makes it that far. It compiles and when runs (java GrabBits 13 6) it gets an underflow exception (see below)
[code]
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
public class GrabBits
{
public static int getBits(byte[] array, int iBitOffs, int iBits)
{
int iByteOffs = iBitOffs / 8;
int iShiftBits = iBitOffs % 8;
int distance = 1 + (iBits/8);
System.out.println(" BitOffset[" + iBitOffs + "], #Bits[" + iBits + "]");
System.out.println(" ByteOffset[" + iByteOffs + "], ShiftBits[" + iShiftBits + "]");
System.out.println(" distance[" + distance + "] array.length[" + array.length + "] (Note: ByteOffset must be < array.length)");
ByteBuffer buff = ByteBuffer.wrap(array, iByteOffs, distance);
System.out.println(" did the wrap");
buff.order(ByteOrder.BIG_ENDIAN);
System.out.println("Set to big endian");
int lBits32 = buff.getInt();
System.out.println("lbits32 prior to shift [0x" + Integer.toHexString( lBits32) + "]");
lBits32 <<= iShiftBits;
System.out.println("lbits32 after shift [0x" + Integer.toHexString( lBits32) + "]");
lBits32 = lBits32 & (int)(Math.pow(2,iBits)-1);
System.out.println("lbits32 after mask [0x" + Integer.toHexString( lBits32) + "]");
return lBits32;
}
public static void main(String[] args)
{
byte[] dog = new byte[100];
int i;
for (i = 0; i < 100; i++)
{
dog[i] = (byte) 'A';
dog[i] += (byte) i;
}
System.out.println("GrabBits test skip[" + args[0] + "] bits, then keep next " + args[1] + " bits ");
for (i = 0; i < 9; i++)
{
System.out.println("dog[" + i + "]= 0x" + Integer.toHexString((int)dog[i]));
}
int result = GrabBits.getBits(dog, Integer.parseInt(args[0]),Integer.parseInt(args[1]));
System.out.println("Result of offset " + args[0] + " then next " + args[1] + " is " + Integer.toHexString(result));
}
}
here is what I see on output
# java GrabBits 13 6
GrabBits test skip[13] bits, then keep next 6 bits
dog[0]= 0x41
dog[1]= 0x42
dog[2]= 0x43
dog[3]= 0x44
dog[4]= 0x45
dog[5]= 0x46
dog[6]= 0x47
dog[7]= 0x48
dog[8]= 0x49
BitOffset[13], #Bits[6]
ByteOffset[1], ShiftBits[5]
distance[1] array.length[100] (Note: ByteOffset must be < array.length)
did the wrap
Set to big endian
Exception in thread "main" java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:404)
at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:330)
at GrabBits.getBits(GrabBits.java:24)
at GrabBits.main(GrabBits.java:55)
It's blowing out on the .getInt() call.
My array length is 100. I don't see how I'm getting underflow as the problem. I changed the code you posted so it would compile and to fit my problemset (ie from 64 bits to 32 bits).
right now i'm leaning towards rkconner's solution.
However I'm still getting the underflow problem when
I try to make the getInt() call.
If this can be solved, I'll give out the rest of the duke dollars.
I awarded partially so far based on the responses.
Thanks
The reason I made it a long was so that it would handle all 32 bits in the shift operation. You've then got the bits in the leftmost position of the long (the highest 32 bits).
I'd try correcting it back to long, and then if you want to
only return 32 bits in an int, you can mask the long...
int iBits32 = (((lBits32 & 0xFFFF0000L) >> 16) & 0xFFFF;
..and then return(iBits32) *should* do it.
D'oh! And to fix the BufferUnderflowException, makedistance = 8+(iBits/8);8 being the size of a long, so the buffer needs to have at leastthat many bytes in it.
Yes that appears to have fixed it.I'm just double checking my shifts to make it do what i now need I think its in good shape for the most part.ThanksDavid