mountd dies - need help w/ clues to problem

mountd dies and can be restarted, only to die again an hour or so later.

I cannot seem to find a cause; no changes have been made to the system since this started to occur.

Am looking at rpcinfo and nfsstat for clues (see below). No information in /var/adm/messages that relates to this issue. Have not changed any hardware or software nor added users.

Note the system is not loaded heavily at all and has a hardware RAID (100GB, 40% full), attached for several years. No hardware issues with the RAID as we have checked. Network activity to server is OK, via ping -s tests.

Have also checked switches and no errors reported.

Configuration is an UltraSPARC server running Solaris 8, patches fairly up to date. I am running the following tests on a regular basis.

echo 'NFS TEST'
rpcinfo -T udp quasar nfs
echo 'MOUNTD TEST'
rpcinfo -T udp quasar mountd
echo 'LOCKMGR TEST'
rpcinfo -T udp quasar nlockmgr
echo 'RPCINFO -P TEST'
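Since mountd dies roughly an hour after each restart, a timestamped version of the same rpcinfo check may help pin down when it disappears. This is only a minimal sketch, assuming the host name quasar used in the tests above; the function name check_service and the RPCINFO override variable are hypothetical and not part of the original script.

```shell
#!/bin/sh
# Sketch only: print a timestamped OK/FAILED line for one RPC service.
# RPCINFO is a hypothetical override for testing; it defaults to rpcinfo.
RPCINFO=${RPCINFO:-rpcinfo}

# check_service HOST SERVICE
check_service() {
    stamp=`date '+%Y-%m-%d %H:%M:%S'`
    # Same UDP probe as the tests above; only the exit status is used.
    if $RPCINFO -T udp "$1" "$2" >/dev/null 2>&1; then
        echo "$stamp $2 OK"
    else
        echo "$stamp $2 FAILED"
    fi
}
```

Called from cron every few minutes, for example as check_service quasar mountd with the output appended to a log file of your choosing, the first FAILED line gives an approximate time of death to compare against other activity on the server.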

quasar# ./rpctest
NFS TEST
program 100003 version 2 ready and waiting
program 100003 version 3 ready and waiting
MOUNTD TEST
program 100005 version 1 ready and waiting
program 100005 version 2 ready and waiting
program 100005 version 3 ready and waiting
LOCKMGR TEST
program 100021 version 1 ready and waiting
program 100021 version 2 ready and waiting
program 100021 version 3 ready and waiting
program 100021 version 4 ready and waiting
LLOCKMGR TEST
rpcinfo: RPC: Program not registered
RPCINFO -P TEST

   program vers proto   port  service
    100000    4   tcp    111  rpcbind
    100000    3   tcp    111  rpcbind
    100000    2   tcp    111  rpcbind
    100000    4   udp    111  rpcbind
    100000    3   udp    111  rpcbind
    100000    2   udp    111  rpcbind
    100004    2   udp   1023  ypserv
    100004    1   udp   1023  ypserv
    100004    1   tcp   1023  ypserv
    100004    2   tcp  32771  ypserv
    100007    3   udp  32775  ypbind
    100007    2   udp  32775  ypbind
    100007    1   udp  32775  ypbind
    100007    3   tcp  32772  ypbind
    100007    2   tcp  32772  ypbind
    100069    1   udp  32776
    100007    1   tcp  32772  ypbind
    100069    1   tcp  32773
    100009    1   udp   1022  yppasswdd
    100028    1   tcp  32774  ypupdated
    100028    1   udp  32777  ypupdated
    100024    1   udp  32784  status
    100024    1   tcp  32778  status
    100133    1   udp  32784
    100133    1   tcp  32778
    100232   10   udp  32785  sadmind
    100011    1   udp  32787  rquotad
    100002    2   udp  32790  rusersd
    100002    3   udp  32790  rusersd
    100002    2   tcp  32794  rusersd
    100002    3   tcp  32794  rusersd
    100021    1   udp   4045  nlockmgr
    100021    2   udp   4045  nlockmgr
    100021    3   udp   4045  nlockmgr
    100021    4   udp   4045  nlockmgr
    100012    1   udp  32797  sprayd
    100008    1   udp  32800  walld
    100001    2   udp  32803  rstatd
    100001    3   udp  32803  rstatd
    100001    4   udp  32803  rstatd
    100083    1   tcp  32822
    100221    1   tcp  32826
    100235    1   tcp  32830
    100021    1   tcp   4045  nlockmgr
    100021    2   tcp   4045  nlockmgr
    100021    3   tcp   4045  nlockmgr
    100021    4   tcp   4045  nlockmgr
    100068    2   udp  32814
    100068    3   udp  32814
    100068    4   udp  32814
    100068    5   udp  32814
    100229    1   tcp  32852  metad
    100230    1   tcp  32856  metamhd
    100153    1   udp  32818
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100227    2   udp   2049  nfs_acl
    100227    3   udp   2049  nfs_acl
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100227    2   tcp   2049  nfs_acl
    100227    3   tcp   2049  nfs_acl
    150001    1   udp    844  pcnfsd
    150001    2   udp    844  pcnfsd
    150001    1   tcp    845  pcnfsd
    150001    2   tcp    845  pcnfsd
    300598    1   udp  32901
    300598    1   tcp  32901
 805306368    1   udp  32901
 805306368    1   tcp  32901
    100249    1   udp  32902
    100249    1   tcp  32907
1289637086    5   tcp  32977
1289637086    1   tcp  32977
    100005    1   udp  33701  mountd
    100005    2   udp  33701  mountd
    100005    3   udp  33701  mountd
    100005    1   tcp  35188  mountd
    100005    2   tcp  35188  mountd
    100005    3   tcp  35188  mountd
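To watch only the mountd registrations in a long rpcinfo -p listing, the output can be filtered on program number 100005. A small sketch; the function name mountd_entries is hypothetical, and it expects rpcinfo -p output on standard input.

```shell
# Sketch only: from `rpcinfo -p` output on stdin, print version, protocol
# and port for every row registered under program 100005 (mountd).
mountd_entries() {
    awk '$1 == 100005 { print $2, $3, $4 }'
}
```

Usage would be along the lines of rpcinfo -p quasar | mountd_entries. If nothing is printed, mountd is no longer registered with rpcbind; if the ports change between runs, mountd has been restarted in the meantime.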

quasar# nfsstat

Server rpc:
Connection oriented:
calls       badcalls    nullrecv    badlen      xdrcall     dupchecks   dupreqs
282213      0           0           0           0           39196       0
Connectionless:
calls       badcalls    nullrecv    badlen      xdrcall     dupchecks   dupreqs
831         0           0           0           0           1           0

Server nfs:
calls       badcalls
282554      0
Version 2: (13 calls)
null        getattr     setattr     root        lookup      readlink
13 100%     0 0%        0 0%        0 0%        0 0%        0 0%
read        wrcache     write       create      remove      rename
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
link        symlink     mkdir       rmdir       readdir     statfs
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
Version 3: (282553 calls)
null        getattr     setattr     lookup      access      readlink
315 0%      116851 41%  8138 2%     44222 15%   32410 11%   0 0%
read        write       create      mkdir       symlink     mknod
37444 13%   17666 6%    2105 0%     0 0%        0 0%        0 0%
remove      rmdir       rename      link        readdir     readdirplus
1645 0%     0 0%        11 0%       70 0%       5555 1%     9562 3%
fsstat      fsinfo      pathconf    commit
65 0%       224 0%      593 0%      5677 2%

Server nfs_acl:
Version 2: (0 calls)
null        getacl      setacl      getattr     access
0 0%        0 0%        0 0%        0 0%        0 0%
Version 3: (0 calls)
null        getacl      setacl
0 0%        0 0%        0 0%

Client rpc:
Connection oriented:
calls       badcalls    badxids     timeouts    newcreds    badverfs
1361        0           0           0           0           0
timers      cantconn    nomem       interrupts
0           0           0           0
Connectionless:
calls       badcalls    retrans     badxids     timeouts    newcreds
3           1           0           0           0           0
badverfs    timers      nomem       cantsend
0           0           0           0

Client nfs:
calls       badcalls    clgets      cltoomany
1284        1           1284        0
Version 2: (2 calls)
null        getattr     setattr     root        lookup      readlink
0 0%        1 50%       0 0%        0 0%        0 0%        0 0%
read        wrcache     write       create      remove      rename
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
link        symlink     mkdir       rmdir       readdir     statfs
0 0%        0 0%        0 0%        0 0%        0 0%        1 50%
Version 3: (1272 calls)
null        getattr     setattr     lookup      access      readlink
0 0%        506 39%     9 0%        216 16%     313 24%     0 0%
read        write       create      mkdir       symlink     mknod
71 5%       42 3%       14 1%       2 0%        0 0%        0 0%
remove      rmdir       rename      link        readdir     readdirplus
20 1%       1 0%        3 0%        6 0%        7 0%        10 0%
fsstat      fsinfo      pathconf    commit
5 0%        4 0%        3 0%        40 3%

Client nfs_acl:
Version 2: (1 calls)
null        getacl      setacl      getattr     access
0 0%        0 0%        0 0%        1 100%      0 0%
Version 3: (9 calls)
null        getacl      setacl
0 0%        9 100%      0 0%

[6922 byte] By [walt] at [2007-11-25 22:41:18]
# 1
Hello, I have a problem with the nlockmgr service too. Does anyone have a clue? Thanks.
saminator at 2007-7-5 14:17:52 > top of Java-index,General,Sun Networking Services and Protocols...
# 2
Is patch 111197-05 installed for BugID 4817833? "4817833 mountd randomly dumps core"
n0b0dy at 2007-7-5 14:17:52
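One way to answer the patch question is to look for patch 111197 in the showrev -p listing. A minimal sketch, assuming the usual "Patch: <id>-<rev> ..." line format of showrev -p; the function name patch_revs is hypothetical.

```shell
# Sketch only: read `showrev -p` output on stdin and print the installed
# revision(s) of the patch ID given as the first argument.
patch_revs() {
    sed -n "s/^Patch: $1-\([0-9][0-9]*\).*/\1/p"
}
```

Usage would be along the lines of showrev -p | patch_revs 111197. A result of 05 or higher means the revision named above is present; no output means the patch is not installed at any revision.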