Problem on Linux with terminated client and Selector.select()
RE: Fedora Core 4
java version "1.5.0_06"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)
Java HotSpot(TM) Server VM (build 1.5.0_06-b05, mixed mode)
All,
I'm running into a problem on Linux that I cannot recreate on Windows or Mac OS X. We have a server that accepts client connections (primarily from desktop apps) using NIO. The connections are made once and left open as long as the client continues to run (i.e., this is not a web-server-like scenario where connections are made and torn down on each request). The problem occurs only on Linux, and it happens when the client app is killed (terminated from Task Manager or killed from the command line with kill -KILL). I'm sure the first question from most will be why this would be happening regularly. My answer is that it is not, but it can happen, and when we tested for it, we found problems. Specifically, once a client has connected, we register a selection key with read interest on the channel. We then have a thread that sits in a while loop and does the following:
int readyChannelsCount = this.selector().select();
if (readyChannelsCount > 0) {
    Iterator selectedKeys = this.selector().selectedKeys().iterator();
    // ... process the channels that are ready
}
The issue on Linux is that, if we kill a client that is not currently in the middle of a read or a write, no IOException is thrown on the server, and select() keeps returning with zero ready channels. On Windows and Mac OS X, select() returns a ready channel for the killed client, and the selection key says that it is ready for a read. When the server tries to read from the channel, an IOException is thrown, and we shut down the connection properly. This is exactly what I want. On Linux, however, readyChannelsCount is always zero and select() returns every time we call it, so we waste a ton of CPU cycles processing "phantom" selects.
What I've noticed on Linux is that all of the selection keys that correspond to the killed clients are valid (selectionKey.isValid() == true), but they all have their interest ops and their ready ops set to zero (I have no clue how they get into this state).
I was thinking of implementing a workaround that involves iterating through all of the selector's keys (selector.keys(), not just selectedKeys()) to see if any of them are in this state where they have no interest or ready ops. I would do this only when select() returns zero. If anyone has any thoughts on this problem or my proposed workaround, I would really appreciate hearing them.
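For reference, here is a minimal sketch of the workaround I have in mind. It has not been tried against the Fedora Core 4 / 1.5.0_06 setup above. The class name StaleKeySweeper and the method sweep are made up for illustration. It assumes that a valid key with interest ops of zero always means a dead client, which is only safe if the server never clears interest ops on purpose, and it also assumes that closing the channel is the right way to shut the connection down.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original server code.
public class StaleKeySweeper {

    // Intended to be called from the selector thread only when select() returned 0.
    // Returns the number of keys that were cancelled.
    static int sweep(Selector selector) throws IOException {
        // Collect first, then cancel, so the key set is not touched while iterating.
        List<SelectionKey> stale = new ArrayList<SelectionKey>();
        for (SelectionKey key : selector.keys()) {
            // Assumption: valid key + no interest ops == killed client.
            if (key.isValid() && key.interestOps() == 0) {
                stale.add(key);
            }
        }
        for (SelectionKey key : stale) {
            key.cancel();          // key is deregistered on the next select
            key.channel().close(); // assumption: closing is the desired cleanup
        }
        return stale.size();
    }
}
```

In the loop above this would be called as StaleKeySweeper.sweep(this.selector()) in an else branch of the readyChannelsCount check. It only treats the symptom and does not explain why the keys end up with zero interest ops in the first place.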

