ASCII85 algorithm description on Wikipedia is wrong?

Here is the description from Wikipedia of what to do when encoding the last bytes:

> At the end of the data, the last group can have fewer than four bytes. Virtual
> zero bytes are appended to the data, and after encoding, if there was one byte,
> only two characters are output; if there were two bytes, three characters are
> output; and if there were three bytes, four characters are output. The "z" case
> does not apply here. This way, the length of the original data can be
> determined by a reader of the encoded data.

Suppose the input is:

byte[] data = new byte[] { 23, 24, 25, 26, 27 };

Then, the first 4 bytes to encode are: { 23, 24, 25, 26 } which becomes: (Dn#.

Then, the next 4 bytes to encode are: { 27, 0, 0, 0 } which becomes: )Z

So the total encoding is: (Dn#.)Z
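The encoding rule quoted from Wikipedia can be sketched in Java as follows. This is only a minimal illustration: the class and method names are hypothetical, the "z" shortcut and the `<~ ~>` delimiters are deliberately left out, and it is not the freehep implementation.

```java
// Hypothetical helper class; the name and structure are illustrative only.
final class Ascii85EncodeSketch {

    /** Encodes data as Ascii85 without the "z" shortcut or any delimiters. */
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int start = 0; start < data.length; start += 4) {
            int count = Math.min(4, data.length - start); // real bytes in this group
            long value = 0;
            for (int j = 0; j < 4; j++) {
                // Missing bytes in the final group are treated as virtual zeros.
                int b = (j < count) ? (data[start + j] & 0xFF) : 0;
                value = (value << 8) | b;
            }
            // Extract the five base-85 digits, most significant first.
            char[] digits = new char[5];
            for (int j = 4; j >= 0; j--) {
                digits[j] = (char) (value % 85 + 33);
                value /= 85;
            }
            // A group of n real bytes emits n + 1 characters (5 for a full group).
            out.append(digits, 0, count + 1);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[] { 23, 24, 25, 26, 27 })); // (Dn#.)Z
    }
}
```

For the five-byte input above, the second group { 27, 0, 0, 0 } has the base-85 digits 8, 57, 51, ... and only the first two are kept, which is where )Z comes from.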

Now, suppose we want to convert the string back to bytes. We take 5 characters at a time.

The first 5 characters are: (Dn#. which becomes: { 23, 24, 25, 26 }

There are only 2 characters left: )Z which becomes: { 26 }

I calculated this by:

")" = 41 - 33 = 8

"Z" = 90 - 33 = 57

long x = 8L * 52200625 + 57L * 614125; // 85^4 = 52200625, 85^3 = 614125; x = 452610125 = 00011010111110100100100001001101

As you can see, the first 8 bits are: 00011010 which equals 26, NOT 27

So, I think the description on Wikipedia is incorrect.
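The calculation above can be reproduced with a few lines of Java. This is just a check of the arithmetic, with a hypothetical class and method name; it treats the three missing characters as digit value 0, exactly as in the hand calculation.

```java
// Hypothetical class; reproduces the hand calculation for the group ")Z".
final class TruncatedGroupCheck {

    /** Value of a two-character group with the three missing digits taken as 0. */
    static long valueWithZeroDigits(char c0, char c1) {
        long p4 = 85L * 85 * 85 * 85; // 85 to the 4th power = 52200625
        long p3 = 85L * 85 * 85;      // 85 cubed = 614125
        return (c0 - 33) * p4 + (c1 - 33) * p3;
    }

    public static void main(String[] args) {
        long x = valueWithZeroDigits(')', 'Z');
        long original = 27L << 24; // the group { 27, 0, 0, 0 } as a 32-bit value

        System.out.println(x);            // 452610125
        System.out.println(x >>> 24);     // 26
        System.out.println(original - x); // 374707
    }
}
```

The difference of 374707 is the part of the value carried by the three characters that were dropped during encoding. It is smaller than 85 cubed, but large enough here to pull the top byte down from 27 to 26.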

[1421 byte] By [rkippena] at [2007-11-27 1:41:31]
# 1

This is interesting. I followed the link in http://en.wikipedia.org/wiki/ASCII85 to the Java source at http://java.freehep.org/jcvslet/JCVSlet/list/freehep/freehep/org/freehep/util/io and downloaded the package. Using the library and the following I obtained the same result as you did! With your test data the last byte is corrupt!

static byte[] encode85(byte[] data) throws Exception
{
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ASCII85OutputStream os = new ASCII85OutputStream(baos);
    os.write(data);
    os.close();
    return baos.toByteArray();
}

static byte[] decode85(byte[] data) throws Exception
{
    ByteArrayInputStream bais = new ByteArrayInputStream(data);
    ASCII85InputStream is = new ASCII85InputStream(bais);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    for (int count = 0; (count = is.read(buffer)) != -1;)
    {
        baos.write(buffer, 0, count);
    }
    baos.close();
    return baos.toByteArray();
}

P.S. Please don't look too closely at my code - I just threw it together.

sabre150a at 2007-7-12 0:57:01 > top of Java-index,Other Topics,Algorithms...
# 2

As far as I can tell, the encoding is correct, but there is a trick in the decoding that is not mentioned.

The standard case is 5 characters are read to produce 4 bytes.

Suppose those characters get read into an array:

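String s = ...; // turn this string into byte array
int n = s.length();
int i = 0; // index of the first character of the current group
int k = 0; // number of characters actually read into arr
int[] arr = new int[5]; // unused slots stay 0

// read up to 5 characters; k stops short on a truncated final group
while (k < 5 && i + k < n) {
    arr[k] = s.charAt(i + k) - 33;
    k++;
}

if (k < 5) arr[k] = 85;
// adding the value of 85 accounts for the truncation problem

// long constants avoid int overflow: 84 * 52200625 does not fit in an int
long x = arr[0] * 52200625L + arr[1] * 614125L + arr[2] * 7225L + arr[3] * 85L + arr[4];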
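For comparison, here is a sketch of decoding a single group by padding a short group with the character 'u' (digit value 84), which is the usual way Ascii85 decoders handle the final group. Note that this is a slightly different technique from placing 85 in the first unused slot as above, and the class and method names are hypothetical. For the group )Z the 'u' padding gives 453224249 and the 85 trick gives 453224250; both have 27 as the top byte.

```java
// Hypothetical class; decodes one Ascii85 group, padding short groups with 'u'.
final class Ascii85FinalGroupSketch {

    // Powers of 85 from 85^4 down to 85^0.
    private static final long[] POW85 = { 52200625L, 614125L, 7225L, 85L, 1L };

    /** Decodes one group of 2 to 5 characters into (length - 1) bytes. */
    static byte[] decodeGroup(String group) {
        int k = group.length();
        if (k < 2 || k > 5) {
            throw new IllegalArgumentException("group must have 2 to 5 characters");
        }
        long x = 0;
        for (int j = 0; j < 5; j++) {
            // Missing characters are replaced by 'u', the largest digit (84).
            int digit = (j < k) ? group.charAt(j) - 33 : 84;
            x += digit * POW85[j];
        }
        // Keep only the bytes that were really present: k characters -> k - 1 bytes.
        byte[] out = new byte[k - 1];
        for (int j = 0; j < out.length; j++) {
            out[j] = (byte) (x >>> (24 - 8 * j));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(decodeGroup(")Z")[0]);                          // 27
        System.out.println(java.util.Arrays.toString(decodeGroup("(Dn#."))); // [23, 24, 25, 26]
    }
}
```

The sketch does no validation of the characters themselves and ignores the "z" shortcut, so it only illustrates the rounding issue discussed in this thread.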

rkippena at 2007-7-12 0:57:01
# 3
> As far as I can tell, the encoding is correct, but
> there is a trick in the decoding that is not
> mentioned.

This still means that the Java source I downloaded has a bug! If you have no objection I will email the author.

sabre150a at 2007-7-12 0:57:01
# 4

That's a good idea. It was released in 2001, yikes!

Here is the driver to test the code:

import java.io.*;

public class Tester {

    private static boolean equal(byte[] d1, byte[] d2) {
        if (d1.length != d2.length) return false;
        for (int i = 0; i < d1.length; i++) {
            if (d1[i] != d2[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws Throwable {
        byte[] b = new byte[1];
        for (int i = 0; i < 256; i++) {
            b[0] = (byte) i;
            String s = encode(b);
            byte[] _b = decode(s);
            if (!equal(b, _b))
                throw new RuntimeException("failed: " + i + " " + _b[0]);
        }

        b = new byte[2];
        for (int i = 0; i < 256; i++) {
            b[0] = (byte) i;
            for (int j = 0; j < 256; j++) {
                b[1] = (byte) j;
                String s = encode(b);
                byte[] _b = decode(s);
                if (!equal(b, _b))
                    throw new RuntimeException("failed: " + i + " " + j);
            }
        }

        b = new byte[3];
        for (int i = 0; i < 256; i++) {
            b[0] = (byte) i;
            for (int j = 0; j < 256; j++) {
                b[1] = (byte) j;
                for (int k = 0; k < 256; k++) {
                    b[2] = (byte) k;
                    String s = encode(b);
                    byte[] _b = decode(s);
                    if (!equal(b, _b))
                        throw new RuntimeException("failed: " + i + " " + j + " " + k);
                }
            }
        }

        System.out.println("all passed");
    }

    static String encode(byte[] data) throws Exception
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ASCII85OutputStream os = new ASCII85OutputStream(baos);
        os.write(data);
        os.close();
        return new String(baos.toByteArray(), "UTF-8");
    }

    static byte[] decode(String s) throws Exception
    {
        ByteArrayInputStream bais = new ByteArrayInputStream(s.getBytes("UTF-8"));
        ASCII85InputStream is = new ASCII85InputStream(bais);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        for (int count = 0; (count = is.read(buffer)) != -1;)
        {
            baos.write(buffer, 0, count);
        }
        baos.close();
        return baos.toByteArray();
    }
}

rkippena at 2007-7-12 0:57:01
# 5
http://java.sun.com/j2se/1.5.0/docs/api/java/util/Arrays.html#equals(byte[],%20byte[])

since 1.2.

Pete

pm_kirkhama at 2007-7-12 0:57:01
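As a small illustration of the suggestion above, the hand-written comparison helper in the test driver could be replaced by a call to `java.util.Arrays.equals`. The wrapper class below is hypothetical and exists only to make the snippet runnable on its own.

```java
import java.util.Arrays;

// Hypothetical class showing Arrays.equals in place of a hand-written loop.
final class ArraysEqualsSketch {

    /** Same contract as the driver's equal(): same length and same contents. */
    static boolean equal(byte[] d1, byte[] d2) {
        return Arrays.equals(d1, d2);
    }

    public static void main(String[] args) {
        byte[] a = { 23, 24, 25, 26, 27 };
        byte[] b = { 23, 24, 25, 26, 27 };
        byte[] c = { 23, 24, 25, 26, 26 };

        System.out.println(equal(a, b)); // true
        System.out.println(equal(a, c)); // false
        // Note: a.equals(b) compares references, not contents.
        System.out.println(a.equals(b)); // false
    }
}
```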