SC 3.2 HA NFS error messages
Hello,
My HA NFS setup on SC 3.2 produces zillions of error messages:
Feb 12 23:54:38 myhost SC[SUNW.nfs:3.2,myrg,myrg-nfs,nfs_daemons_probe]:
[ID 176151 daemon.error] Unable to lookup nfs:nfs_server from kstat:
No such file or directory
Please help me to fix the problem.
TIA,
-- leon
By napobo3a at 2007-11-26 18:13:17

# 1
Can you verify that
/usr/bin/kstat -m nfs -i 0 -n nfs_server -s calls
shows useful output?
If not, you would need to investigate why, thus opening a service request.
Greets
Thorsten
# 2
Hi Thorsten,
the command /usr/bin/kstat -m nfs -i 0 -n nfs_server -s calls produced no output.
The host is running snv_56, please take this into account.
Thank you,
-- leon
# 3
Erm, there we have the problem. SC 3.2 is not supported on Nevada yet.
Note that Nevada (or Open Solaris) is a development version, which is by definition a moving target. Once it is released, SC will support it, but this might then be 3.next, not necessarily 3.2GA.
You should use S10 11/06 (aka Update 3) with SC 3.2.
Greets
Thorsten
# 4
> Erm, there we have the problem. SC 3.2 is not
> supported on Nevada yet.
I am well aware of this.
>
> Note that Nevada (or Open Solaris) is a development
> version, which is by definition a moving target.
> Once it is released, SC will support it, but this
> might then be 3.next, not necessarily 3.2GA.
>
> You should use S10 11/06 (aka Update 3) with SC 3.2.
I know I should... but I need to use ZFS in my HA NFS cluster. ZFS runs faster in snv_56.
I am not asking for official support. If anybody can help me understand the meaning of this error message, I'd appreciate it. If not, I'll ignore it.
It doesn't disturb the running SC 3.2, but I'd prefer to know what's going on.
>
> Greets
> Thorsten
# 5
Actually I am surprised that it works so far for you :-) I would have expected more problems.
As you know, Sun Cluster has quite a few components living in kernel space. Within one Solaris version (like S10) certain APIs are guaranteed to stay stable, but going to the next release some of them can change, so a new set of binaries compiled for that version is needed.
It is a bit like running the S9 binaries on an S10 system. The only luck you seem to have is that not enough has changed within Nevada yet to break it at a more disturbing point (like a panic while trying to boot).
The data service for NFS also consists of several compiled binaries, and the code also checks some kstat values. Obviously this interface changed, since it produces the error message you see.
So I understand that ZFS works faster/better for you, but you trade it against stability. This might be OK for development, but I would not choose it for production.
Of course feel free to do it anyway - as long as you know what you are doing.
Greets
Thorsten
# 6
> Actually I am surprised that it works so far for you
> :-) I would have expected more problems.
I am glad I prepared you a good surprise :)
> As you know Sun Cluster has quite some components
> living in kernel space, and while within a Solaris
> version (like S10) certain apis are guaranteed to
> stay stable, with going to the next release some can
> change - thus need a new set of binaries compiled for
> that version.
>
> It is a bit like running the S9 binaries on a S10
> system. The only luck you seem to have is that not
> enough changed within nevada to break it at a more
> disturbing point yet (like a panic while trying to
> boot).
Again, I am well aware of all these aspects.
>
> The data service for nfs consists also of several
> compiled binaries, and the code does also check some
> kstat values. Obviously this interface changed, since
> it does create the error message you see.
I was sure that somebody on the SC staff would be aware of what's going on in OpenSolaris (aka Nevada) and would send me a hint. Maybe my post didn't get to the right person, or maybe these two areas have no connection.
>
> So I understand that zfs works faster/better for you,
> but you trade it against stability. This might be
> ok for development, but I would not choose it for
> production.
I know you wouldn't.
>
> Of course feel free to do it anyway - as long as you
> know what you are doing.
Anyway, any idea on finding the problem would be appreciated.
Thanks,
-- leon
>
> Greets
> Thorsten
# 7
Leon,
I've passed the thread on to the engineering group in the hope that someone might respond.
Tim
# 8
Of course we build Sun Cluster 3.next internally on Nevada. So in order to get this working for you, you would need the Sun Cluster binaries built for Nevada.
But as of today we do not publish engineering builds of Sun Cluster. Both are naturally moving targets until they are released.
Thus I am not sure what hint you expect :)
The code currently does a lookup for module nfs, name nfs_server, instance 0 and looks for the calls counter.
On Nevada there is no instance 0 for nfs_server. Instead there are instances 2, 3 and 4.
But the agent code as delivered with SC 3.2 checks just instance 0, and that fails, leading to the messages you see.
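If you want to confirm this on your own node, here is a minimal sketch. It assumes that kstat's parseable output (-p) prints one line per statistic in the form module:instance:name:statistic followed by the value. The helper name nfs_server_instances is made up for illustration and is not part of Sun Cluster.

```shell
# Hypothetical helper (not part of SC 3.2): reads `kstat -p` output on stdin
# and prints the instance numbers that expose the nfs_server "calls" counter.
nfs_server_instances() {
    # Keep only nfs:<instance>:nfs_server:calls lines, then cut out field 2,
    # which is the instance number in module:instance:name:statistic.
    grep '^nfs:[0-9]*:nfs_server:calls' | cut -d: -f2 | sort -n
}

# On the cluster node, leave out "-i 0" so kstat reports every instance:
#   /usr/bin/kstat -p -m nfs -n nfs_server -s calls | nfs_server_instances
```

If the list does not contain 0, the probe's lookup of instance 0 has nothing to find, which matches the "Unable to lookup nfs:nfs_server from kstat" message.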
Greets
Thorsten