RMI Connection Refused, but the socket is bound and listening?
Setting: I have two servers (primary and secondary) and 10-15 clients that connect to them. The two servers communicate with each other so that each knows if and when the other goes down and can take over the "primary" responsibility if necessary.
While the primary is trying to connect to the secondary, I get a connection exception indicating "Connection refused". The odd thing is that I can open a TCP connection directly to the RMI server (or at least its registry) from the other server.
My current hypothesis is that it's connecting to the registry but is unable to establish the second TCP connection for its object communications (which runs on a different port, arbitrarily assigned by the OS).
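One way to test that hypothesis is to take the arbitrary port out of the picture. The following is only a minimal sketch, assuming the remote object is exported through UnicastRemoteObject; the Role interface, RoleImpl class, and the way the port is chosen are hypothetical stand-ins, not the actual dfs classes.

```java
import java.net.ServerSocket;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class FixedPortExport {

    /** Hypothetical remote interface standing in for the real one. */
    public interface Role extends Remote {
        String getRole() throws RemoteException;
    }

    /** Hypothetical implementation; not exported until exportOn is called. */
    public static class RoleImpl implements Role {
        public String getRole() {
            return "secondary";
        }
    }

    /** Exports obj so that it listens on the given TCP port, and returns its stub. */
    public static Role exportOn(Role obj, int port) throws RemoteException {
        return (Role) UnicastRemoteObject.exportObject(obj, port);
    }

    /** Asks the OS for a currently free port (used only when no port is given). */
    public static int freePort() throws Exception {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        }
    }

    public static void main(String[] args) throws Exception {
        // Pass a port on the command line to make the object port predictable.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : freePort();
        RoleImpl impl = new RoleImpl();
        Role stub = exportOn(impl, port);
        System.out.println("object exported on port " + port + ": " + stub);
        // Unexport so the demo JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
    }
}
```

With the object port pinned to a known value, the same telnet test used against the registry port can be repeated against the object port.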
Interestingly, if you look at the 'lsof' output from the second server, it already has a number of connections to the primary server, indicating that it is at some level able to establish communications.
I did try setting -Djava.rmi.server.hostname=<server ip> for each host, but that didn't seem to do anything.
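To see whether that property is taking effect, one option is to print a stub from the exporting JVM and read the host and port embedded in it. This is a rough sketch with assumptions: the Ping interface and PingImpl class are hypothetical, the property is set programmatically here only for demonstration (on the real servers it would come from the -D flag), and the string form of a stub is an implementation detail of the Sun/OpenJDK runtime rather than a documented format.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class StubEndpointCheck {

    /** Hypothetical throwaway remote interface. */
    public interface Ping extends Remote {
        void ping() throws RemoteException;
    }

    public static class PingImpl implements Ping {
        public void ping() {
            // Nothing to do; the object exists only so a stub can be inspected.
        }
    }

    /**
     * Exports a throwaway object on an anonymous port and returns the stub's
     * string form. If advertisedHost is non-null it is set as
     * java.rmi.server.hostname before the export.
     */
    public static String describeStub(String advertisedHost) throws RemoteException {
        if (advertisedHost != null) {
            System.setProperty("java.rmi.server.hostname", advertisedHost);
        }
        PingImpl impl = new PingImpl();
        Remote stub = UnicastRemoteObject.exportObject(impl, 0);
        try {
            // On Sun/OpenJDK runtimes this text includes an endpoint of the form host:port.
            return stub.toString();
        } finally {
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describeStub(args.length > 0 ? args[0] : null));
    }
}
```

The stack trace reports "Connection refused to host: 192.168.111.150", which is the address telnet resolved for the second server, so the host side looks right. The port in the stub is the piece that is still unknown.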
I've tried to provide as much information as I can below. Let me know if there's anything else I can provide.
Thoughts?
Thanks....
Info about server1
TCP Connection Initiation Test (to the registry)
dba@oocs01-ctl:/companyx/dfs/master1 > telnet oocs02-ctl.companyx.com 10112
Trying 192.168.111.150...
Connected to oocs02-ctl.companyx.com (192.168.111.150).
Escape character is '^]'.
^]
telnet> quit
Connection closed.
dba@oocs01-ctl:/companyx/dfs/master1 >
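The telnet check above covers only the registry port. A small probe like the following could run the same check against several ports from within Java. It is a sketch only: the PortProbe class and the 2000 ms timeout are my own choices, and it says nothing about what is listening on a port, only whether a TCP connection is accepted.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolvable hosts alike.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java PortProbe <host> <port> [<port> ...]");
            return;
        }
        String host = args[0];
        for (int i = 1; i < args.length; i++) {
            int port = Integer.parseInt(args[i]);
            String result = canConnect(host, port, 2000) ? "open" : "refused or unreachable";
            System.out.println(host + ":" + port + " -> " + result);
        }
    }
}
```

For example, `java PortProbe oocs02-ctl.companyx.com 10112` repeats the registry test; the object port can be appended to the argument list once it is known.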
The Stack Trace:
2007-01-24 11:00:52:949:SEVERE:24:Async operation threw ([rmi://oocs02-ctl.companyx.com:10112/if1]) - Could not get connection - EXCEPTION: dfs.exceptions.DFSException:
dfs.master.MasterProxy.runOp(MasterProxy.java:174)
dfs.master.MasterProxy.getRole(MasterProxy.java:191)
dfs.master.MasterImp.checkMasterRoles(MasterImp.java:805)
dfs.master.MasterImp.run(MasterImp.java:1195)
java.lang.Thread.run(Thread.java:595)
Connection refused to host: 192.168.111.150; nested exception is:
java.net.ConnectException: Connection refused - CAUSED BY - EXCEPTION: java.rmi.ConnectException:
sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:574)
sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:185)
sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:171)
sun.rmi.server.UnicastRef.invoke(UnicastRef.java:94)
java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:179)
java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:132)
$Proxy0.getRole(Unknown Source)
dfs.master.commands.AsyncGetMasterRole.call(AsyncGetMasterRole.java:15)
dfs.master.commands.AsyncGetMasterRole.call(AsyncGetMasterRole.java:8)
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
java.util.concurrent.FutureTask.run(FutureTask.java:123)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
java.lang.Thread.run(Thread.java:595)
Connection refused - CAUSED BY - EXCEPTION: java.net.ConnectException:
java.net.PlainSocketImpl.socketConnect(Native Method)
java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
java.net.Socket.connect(Socket.java:519)
java.net.Socket.connect(Socket.java:469)
java.net.Socket.<init>(Socket.java:366)
java.net.Socket.<init>(Socket.java:179)
sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128)
sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:569)
sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:185)
sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:171)
sun.rmi.server.UnicastRef.invoke(UnicastRef.java:94)
java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:179)
java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:132)
$Proxy0.getRole(Unknown Source)
dfs.master.commands.AsyncGetMasterRole.call(AsyncGetMasterRole.java:15)
dfs.master.commands.AsyncGetMasterRole.call(AsyncGetMasterRole.java:8)
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
java.util.concurrent.FutureTask.run(FutureTask.java:123)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
java.lang.Thread.run(Thread.java:595)
On the second server
dba@oocs02-ctl:/companyx/dfs/master2 > netstat -an | grep 10112
tcp        0      0 0.0.0.0:10112           0.0.0.0:*               LISTEN
tcp        0      0 192.168.111.150:10112   192.168.111.149:53208   ESTABLISHED
dba@oocs02-ctl:/companyx/dfs/master2 > /usr/sbin/lsof -p 6585 | grep oocs01
java 6585 dba  7u IPv4 1633986187 TCP oocs02-ctl.companyx.com:60637->oocs01-ctl.companyx.com:25322 (ESTABLISHED)
java 6585 dba 31u IPv4 1634340305 TCP oocs02-ctl.companyx.com:60634->oocs01-ctl.companyx.com:59265 (ESTABLISHED)
java 6585 dba 35u IPv4 1633986509 TCP oocs02-ctl.companyx.com:10112->oocs01-ctl.companyx.com:53208 (ESTABLISHED)
java 6585 dba 39u IPv4 1633986862 TCP oocs02-ctl.companyx.com:60679->oocs01-ctl.companyx.com:44487 (ESTABLISHED)
dba@oocs02-ctl:/companyx/dfs/master2 >
Log output on the second server; it seems to be binding:
2007-01-24 10:30:48:957:INFO:10:Binding to:'rmi://oocs02-ctl.companyx.com:10112/if1'

