Is DES (128-bit key) on only 16 bytes in 200 milliseconds too slow?
We are developing an applet and testing it in a JCOP2.0 card (P8WE5033, has a coprocessor for DES).
We made a simple applet that takes 16 bytes, uses an internal constant as the DES key (128 bits) and then answers the APDU. We measured almost the same time when we use the 16 bytes as a constant inside the card: 200 milliseconds to 300 milliseconds, depending on which DES variant we try.
We are communicating at 115200 bps. We can get CPLC data ( 80 CA 9F 7F 2D ) in 10 milliseconds. We can select the applet in less than 10 milliseconds. It seems to me that the problem is not on the communications side.
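As a rough cross-check of the claim that the link is not the bottleneck, the wire time of one exchange can be estimated. This is only a sketch: the 12 bit times per character and the T=1 framing overhead (3-byte prologue, 1-byte LRC) are assumptions, not values taken from the card's ATR, and the class and method names are made up for illustration.

```java
public class WireTimeEstimate {
    // Assumption: each character costs about 12 bit times on the wire
    // (1 start + 8 data + 1 parity + about 2 guard). The real guard time
    // depends on the card's ATR.
    static final int BITS_PER_CHARACTER = 12;

    /** Time in milliseconds needed to move `bytes` characters at `bitsPerSecond`. */
    static double transferMillis(int bytes, int bitsPerSecond) {
        return bytes * BITS_PER_CHARACTER * 1000.0 / bitsPerSecond;
    }

    public static void main(String[] args) {
        // Hypothetical T=1 framing: 3-byte prologue and 1-byte LRC per block.
        int command = 3 + 5 + 16 + 1;   // prologue + APDU header + 16 data bytes + LRC
        int response = 3 + 16 + 2 + 1;  // prologue + 16 result bytes + SW1 SW2 + LRC
        int total = command + response;
        System.out.printf("%d bytes on the wire: about %.1f ms%n",
                total, transferMillis(total, 115200));
    }
}
```

Under these assumptions the whole exchange spends roughly 5 ms on the wire, which is consistent with the sub-10 ms figures for GET DATA and SELECT and leaves most of the 200 ms unexplained by communication.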
The P8WE5033 datasheet shows 100 microseconds to 200 microseconds using the DES coprocessor, around 1000 times faster!
Any ideas what we are doing wrong?
By samarma at 2007-10-3 11:46:30

1. How did you measure the time?
2. Could you show the applet code?
We measured the time interval between :
a) just before sending the first byte of the T = 1 block that requests the DES operation and returns the result.
b) just after receiving the last byte of the T = 1 answer containing the result.
We are using a proprietary board as the host, and it has a free-running clock with sub-millisecond resolution.
I can send the applet later...
Thank you
Some news:
We tested the code posted in this forum:
http://forum.java.sun.com/thread.jspa?threadID=782072&tstart=60
It took 50 seconds to run the "test2" of that post: 4096 DES operations.
Thus, we calculated about 12 milliseconds for each DES operation, still roughly 100 times slower than the 100 to 200 microseconds listed in the datasheet (P8WE5033) for the DES coprocessor.
Any Ideas?
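The arithmetic in the post above can be reproduced with a few lines. This is only a sketch using the figures quoted in the thread (50 s for 4096 operations, 100 to 200 microseconds in the datasheet); the class and method names are made up for illustration.

```java
public class PerOperationTime {
    /** Average time per operation in milliseconds. */
    static double perOperationMillis(double totalSeconds, int operations) {
        return totalSeconds * 1000.0 / operations;
    }

    /** How many times slower the measurement is than the datasheet figure. */
    static double slowdown(double measuredMillis, double datasheetMicros) {
        return measuredMillis * 1000.0 / datasheetMicros;
    }

    public static void main(String[] args) {
        double perOp = perOperationMillis(50.0, 4096);
        System.out.printf("per operation: %.2f ms%n", perOp);
        System.out.printf("vs 200 us: %.0fx slower%n", slowdown(perOp, 200.0));
        System.out.printf("vs 100 us: %.0fx slower%n", slowdown(perOp, 100.0));
    }
}
```

With these inputs the per-operation time is about 12.2 ms, which is roughly 60 to 120 times the datasheet range, in line with the "100 times slower" estimate.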
> We measured the time interval between :
> a) just before sending the first byte of the T = 1
> block that requests the DES operation and returns
> the result.
> b) just after receiving the last byte of the T = 1
> answer containing the result.
>
The value specified in the datasheet covers only the interval between:
START: the register in the processor is set to start the DES operation
END: the result value is written into the register
So if you start at sending the APDU, you have to consider the additional times:
- Reader --> smart card UART
- smart card UART --> smart card OS
- smart card OS --> DES crypto co-processor *
- DES crypto co-processor --> smart card OS *
- smart card OS --> smart card UART
- smart card UART --> Reader
Only the steps marked with an asterisk take the time specified in the datasheet.
But we can get CPLC data ( 80 CA 9F 7F 2D ) in 10 milliseconds and can select the applet in less than 10 milliseconds. We measured these intervals the same way we measured the 200 milliseconds. So, I suppose that the operations:
- Reader --> smart card UART
- smart card UART --> smart card OS
and
- smart card OS --> smart card UART
- smart card UART --> Reader
are done in less than 10 milliseconds. Remember, get CPLC data returns 47+4 bytes.
Anyway , in my last post I reported we made a loop of 4096 DES operations. This way we can minimize the influence of the times spent on operations other than DES.
You must not forget that the OS may have to do internal state house-keeping depending on where the data you're operating on is located, where the keys are stored, etc. Also, the times in the processor data sheet are raw operation times, and do not include counter-measures implemented by the card OS.
If you post the complete applet source, we can give you more valuable input. The code referenced in the other post is not necessarily conclusive. For example, calls to Cipher.init() may write EEPROM, as the Java Card Specification requires this object to have a persistent state!
OK, for the two functions below, we measured 130 ms for the first function and 30 ms for the second (T = 1, 115200 bps). If we modify the second function by commenting out the mDES.doFinal(...) call and outputting some constant instead, the time drops from 30 ms to below 10 ms.
Thank you for any comments !
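// As posted, gDESKey and mDES are declared as locals here;
// getCS() below relies on a field named mDES whose declaration is not shown.
public void setEncryptKey2()
{
    // Build a 16-byte (two-key triple DES) key object
    DESKey gDESKey = (DESKey) KeyBuilder.buildKey(KeyBuilder.TYPE_DES, KeyBuilder.LENGTH_DES3_2KEY, false);
    gDESKey.setKey(b, (short)0); // b = 16 bytes.
    Cipher mDES = Cipher.getInstance(Cipher.ALG_DES_ECB_NOPAD, true);
    mDES.init(gDESKey, Cipher.MODE_DECRYPT);
}

// mBuffKey = JCSystem.makeTransientByteArray((short)16, JCSystem.CLEAR_ON_DESELECT);
public void getCS(APDU apdu)
{
    byte[] buf = apdu.getBuffer();
    // Copy the first 4 command data bytes into the transient work buffer
    Util.arrayCopyNonAtomic(buf, ISO7816.OFFSET_CDATA, mBuffKey, (short)0, (short)4);
    mBuffKey[5] = (byte)0x0F;
    // Process all 16 bytes of mBuffKey and write the result to the start of the APDU buffer
    mDES.doFinal(mBuffKey, (short)0, (short)16, buf, (short)0);
    apdu.setOutgoingAndSend((short)0, (short)16);
}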
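To put the 30 ms versus "below 10 ms" figures in perspective, the doFinal() call can be compared with the raw coprocessor time for the same data. This is only a sketch: it assumes that the datasheet figure of 100 to 200 microseconds is per single-DES operation on one 8-byte block, which the thread does not confirm, and it uses the fact that triple DES performs three DES operations per block. The class and method names are made up for illustration.

```java
public class DoFinalCostEstimate {
    static final int DES_BLOCK_BYTES = 8;
    static final int DES_OPS_PER_3DES_BLOCK = 3; // encrypt-decrypt-encrypt

    /**
     * Raw coprocessor time in milliseconds for `bytes` of triple DES,
     * assuming `singleDesMicros` per single-DES block operation.
     */
    static double rawTripleDesMillis(int bytes, double singleDesMicros) {
        int blocks = bytes / DES_BLOCK_BYTES;
        return blocks * DES_OPS_PER_3DES_BLOCK * singleDesMicros / 1000.0;
    }

    public static void main(String[] args) {
        // From the post: 30 ms with doFinal(), below 10 ms without it,
        // so the call itself accounts for at least about 20 ms.
        double doFinalMillis = 30.0 - 10.0;
        double rawLow = rawTripleDesMillis(16, 100.0);
        double rawHigh = rawTripleDesMillis(16, 200.0);
        System.out.printf("doFinal(): at least about %.0f ms%n", doFinalMillis);
        System.out.printf("raw coprocessor estimate: %.1f to %.1f ms%n", rawLow, rawHigh);
    }
}
```

Under that assumption the raw work for 16 bytes is on the order of 1 ms, so most of the roughly 20 ms would be spent outside the coprocessor itself.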
OK, so you already saw that Cipher.init() is expensive. 30ms doesn't seem too bad then in comparison with 10ms (factor 3). You could possibly speed it up a little by doing the crypto in-place to prevent the card OS from having to copy data around.
> You could possibly
> speed it up a little by doing the crypto in-place to
> prevent the card OS from having to copy data around.
I can't see how to do this, could you show it to me please?
Could this longer DES execution time be due to storing some data in EEPROM rather than in RAM, so that the extra time (100 times more...) comes from writing to the EEPROM?
About the 30 ms being good enough: well, I had planned to execute DES many times because I have more than 16 bytes to encrypt. 16 bytes is just a test.
In-place means that your input and output are in the same buffer and at the same offset.
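The idea can be illustrated off-card with the desktop javax.crypto API, which is not the Java Card API used in the applet. This is only a sketch: the class and method names are hypothetical, and the 16-byte two-key key is expanded to 24 bytes (K1|K2|K1) because javax.crypto expects that form. On the card, the analogous change would presumably be to pass the APDU buffer as both input and output of doFinal, for example mDES.doFinal(buf, ISO7816.OFFSET_CDATA, (short)16, buf, ISO7816.OFFSET_CDATA), and then send from that same offset.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class InPlaceTripleDes {
    /** Expands a 16-byte two-key 3DES key (K1|K2) to the 24-byte form K1|K2|K1. */
    static byte[] expandTwoKey(byte[] key16) {
        byte[] key24 = new byte[24];
        System.arraycopy(key16, 0, key24, 0, 16);
        System.arraycopy(key16, 0, key24, 16, 8);
        return key24;
    }

    static Cipher newCipher(byte[] key16) throws GeneralSecurityException {
        Cipher des = Cipher.getInstance("DESede/ECB/NoPadding");
        des.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(expandTwoKey(key16), "DESede"));
        return des;
    }

    /** In-place: input and output are the same buffer at the same offset. */
    static void encryptInPlace(byte[] key16, byte[] buf, int offset, int length) {
        try {
            newCipher(key16).doFinal(buf, offset, length, buf, offset);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reference version: the result goes to a freshly allocated array. */
    static byte[] encryptToNewBuffer(byte[] key16, byte[] data) {
        try {
            return newCipher(key16).doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
        byte[] data = new byte[16];
        data[5] = 0x0F;
        byte[] expected = encryptToNewBuffer(key, data);
        encryptInPlace(key, data, 0, 16);
        System.out.println("same result: " + java.util.Arrays.equals(expected, data));
    }
}
```

Both variants produce the same ciphertext; the in-place one simply avoids a separate output buffer. Whether this saves measurable time on a given card depends on the card OS.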
I don't think you will see a linear time increase with larger block numbers. Give it a try.
Again, the OS has counter-measures implemented which slow the performance down. You must keep this in mind.