ASCII85 algorithm description on Wikipedia is wrong?
Here is the description from Wikipedia of what to do when encoding the last bytes:
> At the end of the data, the last group can have fewer than four bytes. Virtual
> zero bytes are appended to the data, and after encoding, if there was one byte,
> only two characters are output; if there were two bytes, three characters are
> output; and if there were three bytes, four characters are output. The "z" case
> does not apply here. This way, the length of the original data can be
> determined by a reader of the encoded data.
Suppose the input is:
byte[] data = new byte[] { 23, 24, 25, 26, 27 };
Then, the first 4 bytes to encode are: { 23, 24, 25, 26 } which becomes: (Dn#.
Then, the last group is padded to 4 bytes: { 27, 0, 0, 0 } which encodes to 5 characters, of which only the first 2 are output: )Z
So the total encoding is: (Dn#.)Z
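The encoding steps above can be reproduced with a minimal sketch. The class and method names here are hypothetical, the <~ ~> delimiters are left out, and this is only one reading of the quoted rule, not the code of any particular library:

```java
public class Ascii85EncodeSketch {
    // Encodes data 4 bytes at a time; a short final group is padded with
    // virtual zero bytes and its output truncated to (bytes + 1) characters.
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int pos = 0; pos < data.length; pos += 4) {
            int len = Math.min(4, data.length - pos);
            long value = 0;
            for (int i = 0; i < 4; i++) {
                int b = (i < len) ? (data[pos + i] & 0xFF) : 0; // virtual zero byte
                value = (value << 8) | b;
            }
            if (len == 4 && value == 0) {
                out.append('z'); // the "z" case applies to full groups only
                continue;
            }
            char[] group = new char[5];
            for (int i = 4; i >= 0; i--) {
                group[i] = (char) (value % 85 + 33); // base-85 digit, offset by '!'
                value /= 85;
            }
            out.append(group, 0, len + 1); // n bytes -> n + 1 characters
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[] { 23, 24, 25, 26, 27 })); // prints (Dn#.)Z
    }
}
```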
Now, suppose we want to convert the string back to bytes. We take 5 characters at
a time.
The first 5 characters are: (Dn#. which becomes: { 23, 24, 25, 26 }
There are only 2 characters left: )Z which becomes: { 26 }
I calculated this by:
")" = 41 - 33 = 8
"Z" = 90 - 33 = 57
long x = 8 * 52200625L + 57 * 614125L; // 8 * 85^4 + 57 * 85^3; x = 452610125 = 00011010111110100100100001001101
As you can see, the first 8 bits are: 00011010 which equals 26, NOT 27
So, I think the description on Wikipedia is incorrect.
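The arithmetic above can be checked with a small sketch (class and method names are hypothetical). It treats the three missing characters as digit 0, as in the calculation above, and compares the result with the value the encoder started from:

```java
public class ZeroPadCheck {
    // Rebuilds the 32-bit value from the two characters that were output,
    // taking the three dropped base-85 digits as 0.
    static long zeroPaddedValue(char c0, char c1) {
        return (c0 - 33) * 52200625L + (c1 - 33) * 614125L; // 85^4 and 85^3
    }

    public static void main(String[] args) {
        long x = zeroPaddedValue(')', 'Z');
        long original = 27L << 24; // { 27, 0, 0, 0 } as a 32-bit value
        System.out.println(x);              // prints 452610125
        System.out.println(x >>> 24);       // prints 26
        System.out.println(original - x);   // prints 374707, the worth of the dropped digits
    }
}
```

The dropped digits are worth less than 85^3 = 614125, so the zero-padded value always falls short of the original by less than that amount, which is enough here to push the top byte from 27 down to 26.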
[1421 bytes] By [rkippena] at [2007-11-27 1:41:31]

# 1
This is interesting. I followed the link in http://en.wikipedia.org/wiki/ASCII85 to the Java source at http://java.freehep.org/jcvslet/JCVSlet/list/freehep/freehep/org/freehep/util/io and downloaded the package. Using the library with the code below, I got the same result as you did: with your test data, the last byte comes back corrupted!
static byte[] encode85(byte[] data) throws Exception
{
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ASCII85OutputStream os = new ASCII85OutputStream(baos);
    os.write(data);
    os.close();
    return baos.toByteArray();
}

static byte[] decode85(byte[] data) throws Exception
{
    ByteArrayInputStream bais = new ByteArrayInputStream(data);
    ASCII85InputStream is = new ASCII85InputStream(bais);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    for (int count = 0; (count = is.read(buffer)) != -1;)
    {
        baos.write(buffer, 0, count);
    }
    baos.close();
    return baos.toByteArray();
}
P.S. Please don't look too closely at my code - I just threw it together.
# 2
As far as I can tell, the encoding is correct, but there is a trick in the decoding that is not mentioned.
The standard case is 5 characters are read to produce 4 bytes.
Suppose those characters get read into an array:
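String s = ...; // turn this string into byte array
int n = s.length();
int i = 0; // index of the first character of the current group
int k = 0; // number of characters actually read for this group
int[] arr = new int[5]; // slots that are never read stay 0
while (k < 5 && i + k < n) {
    arr[k] = s.charAt(i + k) - 33;
    k++;
}
if (k < 5) arr[k] = 85;
// adding the value of 85 in the first missing slot accounts for the truncation problem
// long literals keep arr[0] * 85^4 from overflowing int
long x = arr[0] * 52200625L + arr[1] * 614125L + arr[2] * 7225L + arr[3] * 85L + arr[4];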
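The trick above can be turned into a complete decoder roughly as follows. This is a minimal sketch with hypothetical names. It assumes the input has no <~ ~> delimiters and no whitespace, and does no validation. Filling every missing slot with 84 (the character 'u') instead is a commonly described variant and yields the same bytes.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class Ascii85DecodeSketch {
    static byte[] decode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = s.length();
        int i = 0;
        while (i < n) {
            if (s.charAt(i) == 'z') { // shorthand for four zero bytes
                for (int k = 0; k < 4; k++) out.write(0);
                i++;
                continue;
            }
            int len = Math.min(5, n - i); // characters available for this group
            long[] digit = new long[5];   // unread slots stay 0
            for (int k = 0; k < len; k++) digit[k] = s.charAt(i + k) - 33;
            if (len < 5) digit[len] = 85; // compensates for the truncated digits
            long x = (((digit[0] * 85 + digit[1]) * 85 + digit[2]) * 85 + digit[3]) * 85 + digit[4];
            // len characters carry len - 1 bytes, taken from the top of x
            for (int k = 0; k < len - 1; k++) out.write((int) (x >>> (24 - 8 * k)) & 0xFF);
            i += len;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decode("(Dn#.)Z"))); // prints [23, 24, 25, 26, 27]
    }
}
```

For the final group )Z this gives x = 452610125 + 614125 = 453224250, whose top byte is 27 as expected.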
# 4
That's a good idea. It was released in 2001, yikes!
Here is the driver to test the code:
import java.io.*;
import org.freehep.util.io.*; // assumed: the FreeHEP ASCII85 streams, package name as in the URL from reply # 1

public class Tester {

    // Returns true when both arrays have the same length and contents.
    private static boolean equal(byte[] d1, byte[] d2) {
        if (d1.length != d2.length) return false;
        for (int i = 0; i < d1.length; i++) {
            if (d1[i] != d2[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws Throwable {
        // Round-trip every possible 1-byte input.
        byte[] b = new byte[1];
        for (int i = 0; i < 256; i++) {
            b[0] = (byte) i;
            String s = encode(b);
            byte[] _b = decode(s);
            if (!equal(b, _b))
                throw new RuntimeException("failed: " + i + " " + _b[0]);
        }

        // Round-trip every possible 2-byte input.
        b = new byte[2];
        for (int i = 0; i < 256; i++) {
            b[0] = (byte) i;
            for (int j = 0; j < 256; j++) {
                b[1] = (byte) j;
                String s = encode(b);
                byte[] _b = decode(s);
                if (!equal(b, _b))
                    throw new RuntimeException("failed: " + i + " " + j);
            }
        }

        // Round-trip every possible 3-byte input.
        b = new byte[3];
        for (int i = 0; i < 256; i++) {
            b[0] = (byte) i;
            for (int j = 0; j < 256; j++) {
                b[1] = (byte) j;
                for (int k = 0; k < 256; k++) {
                    b[2] = (byte) k;
                    String s = encode(b);
                    byte[] _b = decode(s);
                    if (!equal(b, _b))
                        throw new RuntimeException("failed: " + i + " " + j + " " + k);
                }
            }
        }
        System.out.println("all passed");
    }

    static String encode(byte[] data) throws Exception
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ASCII85OutputStream os = new ASCII85OutputStream(baos);
        os.write(data);
        os.close();
        return new String(baos.toByteArray(), "UTF-8");
    }

    static byte[] decode(String s) throws Exception
    {
        ByteArrayInputStream bais = new ByteArrayInputStream(s.getBytes("UTF-8"));
        ASCII85InputStream is = new ASCII85InputStream(bais);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        // Drain the decoding stream until end of input.
        for (int count = 0; (count = is.read(buffer)) != -1;)
        {
            baos.write(buffer, 0, count);
        }
        baos.close();
        return baos.toByteArray();
    }
}