Unicode Characters & RC4-HMAC

I just stumbled over an issue with Unicode characters in passwords.

I extracted my machine's account password with the Windows LsaRetrievePrivateData API, using the Win32 Python Extensions. The result was a Unicode string with one catch: it contained the character '\ude09', a low surrogate with no high surrogate in front of it. I don't know whether this is a Python issue, an issue with the auto-generated password, or something else. Either way, the password is not a well-formed UTF-16 string.

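Using this string with JGSS fails pre-authentication because the UTF-16LE encoder in sun.security.krb5.internal.crypto.dk.DkCrypto#charToUtf16 doesn't accept the unpaired surrogate and substitutes the "error" bytes FD FF (the replacement character U+FFFD in little-endian order), so the key is derived from different bytes than the ones Windows used.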
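
To make the substitution visible, here is a minimal sketch. It assumes the encoding goes through the JDK's standard UTF-16LE charset, which may or may not be exactly what DkCrypto does internally; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class LoneSurrogateDemo {

    // Hypothetical helper: encode through the JDK's UTF-16LE charset.
    // String.getBytes(Charset) replaces malformed input instead of passing it through.
    static byte[] encodeWithCharset(char[] chars) {
        return new String(chars).getBytes(StandardCharsets.UTF_16LE);
    }

    // Render bytes as upper-case hex so the substitution is easy to spot.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'A' followed by the unpaired low surrogate from my password.
        char[] password = { 'A', '\ude09' };
        System.out.println(toHex(encodeWithCharset(password)));
    }
}
```

With the lone surrogate in the input, the second code unit should come out as FD FF rather than 09 DE, which matches what I saw.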

If, however, I use the following encoding instead, authentication against our PDC works fine:

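    // Needs java.nio.ByteBuffer and java.nio.ByteOrder.
    static byte[] charToUtf16(char[] chars) {
        // Two octets per char, little-endian, with no surrogate validation.
        ByteBuffer buffer = ByteBuffer.allocate(2 * chars.length).order(ByteOrder.LITTLE_ENDIAN);
        buffer.asCharBuffer().put(chars);
        return buffer.array();
    }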

This is agnostic of surrogates and maybe closer to what the RFC describes:

"Each Windows UNICODE character is encoded in little-endian format of 2 octets each."

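As a sanity check on the "agnostic of surrogates" claim, here is a rough sketch that writes two octets per char by hand and compares the result with the JDK's UTF-16LE charset. The class name and the sample string are assumptions for illustration only, not anything from the JDK's Kerberos code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RawUtf16Le {

    // Write every char as two octets, low octet first, without looking at surrogates.
    static byte[] encodeRaw(char[] chars) {
        byte[] out = new byte[2 * chars.length];
        for (int i = 0; i < chars.length; i++) {
            out[2 * i] = (byte) chars[i];            // low octet
            out[2 * i + 1] = (byte) (chars[i] >> 8); // high octet
        }
        return out;
    }

    public static void main(String[] args) {
        // Well-formed input: ASCII, one Latin-1 letter, and a valid surrogate pair (U+1F600).
        String wellFormed = "p\u00e4ss\ud83d\ude00";
        boolean same = Arrays.equals(
                encodeRaw(wellFormed.toCharArray()),
                wellFormed.getBytes(StandardCharsets.UTF_16LE));
        System.out.println("well-formed input matches charset encoder: " + same);

        // Unpaired low surrogate: kept as its own two octets instead of being replaced.
        System.out.println(Arrays.toString(encodeRaw(new char[] { '\ude09' })));
    }
}
```

For well-formed passwords the two approaches should agree byte for byte, so a change along these lines would only affect strings that contain unpaired surrogates.
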
Maybe someone who's in this a little deeper than me can judge whether DkCrypto should be changed.

Thanks

Matthias

[1504 byte] By [matthias.ernsta] at [2007-11-27 9:46:58]
# 1

Kerberos V is not internationalized. Applications are expected to provide ASCII characters to Kerberos, which is in compliance with the specification. Non-ASCII usernames and passwords are not supported under the present definition.

There is an ongoing effort at the IETF to address this.

Seema

Seema-1a at 2007-7-12 23:58:20