Hardware Monitoring on Sun Fire X2100 M2
Hello all,
I have an annoying problem with a brand new Sun Fire X2100 M2 server.
At power on, the fans come on at full speed (sounds like a jet). Normally they slow down to idle speed within about 15 seconds, but I believe that because hardware monitoring is disabled (sensor readings are 0.00), the system fans never slow down.
The Service Processor says that all Hardware Monitoring is disabled:
Blower Fan 0(disabled)
Blower Fan 1(disabled)
Axial Fan 0(disabled)
Temperature Status: CPU Temp(disabled)
Ambient Temp(disabled)
Voltage Status: Vcc 12V(disabled)
DDRP1 1.8V(disabled)
Vcc 3.3V(disabled)
Vcc 5V(disabled)
Vcc 3.3V STB(disabled)
/SP -> show SystemInfo/Fan/Fan1
...
Status = disabled
CurrentValue = 0.000;
...
/SP -> show SystemInfo/Temperature/Temperature1
...
Status = disabled
CurrentValue = 0.000;
...
/SP -> show SystemInfo/Voltage/Voltage1
...
Status = disabled
CurrentValue = 0.000;
...
So... Does anyone know how to enable hardware monitoring?
Has anyone experienced this before?
No LEDs indicate a hardware error, and all cables seem to be connected.
I also tried upgrading the BIOS & BMC (1.80/S40_3A05), without luck.
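For anyone who wants to script a check for this state, here is a minimal Python sketch. It assumes the Service Processor summary has been captured as plain text in the `Name(status)` form listed above. The function names and the regular expression are illustrative only and are not part of any Sun tool.

```python
import re

# Matches lines such as "Blower Fan 0(disabled)" or
# "Temperature Status: CPU Temp(disabled)"; the "Group:" prefix is optional.
SENSOR_LINE = re.compile(
    r"^(?:[^:()]+:\s*)?(?P<name>[^()]+?)\((?P<status>[^()]+)\)\s*$"
)

def parse_sensor_summary(text):
    """Return a {sensor name: status} dict from the pasted summary text."""
    sensors = {}
    for line in text.splitlines():
        match = SENSOR_LINE.match(line.strip())
        if match:
            sensors[match.group("name").strip()] = match.group("status").strip()
    return sensors

def all_disabled(sensors):
    """True when at least one sensor was parsed and every one is disabled."""
    return bool(sensors) and all(s == "disabled" for s in sensors.values())

# The summary exactly as posted above.
SAMPLE = """\
Blower Fan 0(disabled)
Blower Fan 1(disabled)
Axial Fan 0(disabled)
Temperature Status: CPU Temp(disabled)
Ambient Temp(disabled)
Voltage Status: Vcc 12V(disabled)
DDRP1 1.8V(disabled)
Vcc 3.3V(disabled)
Vcc 5V(disabled)
Vcc 3.3V STB(disabled)
"""

if __name__ == "__main__":
    sensors = parse_sensor_summary(SAMPLE)
    for name, status in sensors.items():
        print(f"{name}: {status}")
    if all_disabled(sensors):
        print("All sensors report 'disabled'; monitoring appears to be off.")
```

How the text is captured from the SP (serial line, SSH session log, web GUI copy) is left open here.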
snejk at 2007-11-26 11:34:39

# 1
Hi Snejk,
We just received 5 Sun X2100 M2's and all of them have monitoring enabled by default. Which version of the Embedded LOM are you running (the latest one is 1.80)?
Nico
Nico at 2007-7-7 3:50:28 >

# 2
Hi Nico,
Yes, running ELOM v1.80 without luck. I've been in contact with Sun, who will replace the main system board. The funny thing is that everything except the hardware monitoring seems to be OK, i.e. no LED indicates a hardware error, and the server is installable and usable. But who wants a server without monitoring capabilities?
/Snejk
snejk at 2007-7-7 3:50:28 >

# 3
I think the quality of the X2100 M2 isn't up to "Sun-standard" (or what I believed to be "Sun-standard" :-/ ). I just got 5 X2100 M2's, with 2 of them I am unable to reach the SP over the network (tried all obvious things like using another network cable/connection, checking settings, resetting...). I have a call open for that too but didn't get any useful response yet.
Nico
Nico at 2007-7-7 3:50:28 >

# 4
Hmm. Weird... I have no problem connecting to the SP over the network. The SP IP is 192.168.1.2 by default. Are you connecting via SSH or HTTPS?
By the way, replacing the system board fixed the monitoring problem...
snejk at 2007-7-7 3:50:28 >

# 5
Even ping doesn't work. I set the IP address by hand via the serial line (as none of the v1.20 systems were able to get a DHCP address). I used exactly the same commands as on the other servers, but for these it didn't work.
Conversation with SUN support:
me: I am unable to connect to the SP via the network, I tried....
Sun: please upgrade to the latest lom/bios version
me: how can I do that without a local cdrom and without network activity
Sun: euh... what did you do exactly to configure it
me: set /SP/AgentInfo IpAddress .....
Sun: ah so I see you were able to configure the network so please use the webgui...
me: aaargh, that's the whole point it doesn't work
I wonder whether they actually read the initial call reports :-/
The call has now been escalated to an expert. My best guess: I need a new motherboard :-)
Nico
Nico at 2007-7-7 3:50:28 >

# 6
Too bad, it sounds like the BMC software is corrupted. Does the NIC indicate link (green)? Are you using original memory sticks from Sun? I heard you can run into trouble if you're not.
Maybe you can reinstall the SP software by following the procedure "Recovering from a corrupt SP", Chapter 5, page 77 in the ELOM Administration Guide.
Hope it works :-)
/Snejk
snejk at 2007-7-7 3:50:28 >

# 7
The network connector itself seems fine. I configured the pre-installed Solaris 10 using a locally connected screen and keyboard. I can reach the network via bge1 (I believe that's the one which doubles up as the net management interface) without problems. Very strange. I was actually planning on doing the recovery procedure today as the service call at Sun doesn't seem to be going anywhere :-(
I'll let you know if it worked.
Nico
Nico at 2007-7-7 3:50:28 >

# 8
Hey Nico,
Did you get this thing to work yet?
There's a new BIOS and firmware out, maybe it will help. I'm still having random problems with the Hardware Sensors.. sometimes they go into disabled mode. Resetting the BMC helps this time.
/Snejk
snejk at 2007-7-7 3:50:28 >

# 9
I just upgraded BIOS & BMC (1.91/S40_3A07).
These servers have a serious issue with the hardware monitoring. Now Axial Fan 0 has disappeared from monitoring:
/SP -> show SystemInfo/Fan/Fan3
/SP/SystemInfo/Fan/Fan3
Targets:
Properties:
Designation = Axial Fan 0
Status = not present
CurrentValue = 0.000;
LowWarningValue = 0.000;
LowCriticalValue = 1952.000;
The rest seem ok for now.. but who knows?
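A rough Python sketch of how `show` output like the above could be checked automatically. It assumes the output is available as plain text in `Key = value;` form. The function names, the classification labels, and the `ok` status string used in the usage note are my own assumptions, not something the SP documents.

```python
def parse_properties(text):
    """Collect 'Key = value;' lines from SP 'show' output into a dict of strings."""
    props = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skips the prompt, the target path, and section headers
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip().rstrip(";").strip()
    return props

def classify_fan(props):
    """Return 'unmonitored', 'critical', 'ok', or 'unknown' (labels are hypothetical)."""
    status = props.get("Status", "").lower()
    if status in ("disabled", "not present"):
        return "unmonitored"
    try:
        current = float(props["CurrentValue"])
        low_critical = float(props["LowCriticalValue"])
    except (KeyError, ValueError):
        return "unknown"
    return "critical" if current < low_critical else "ok"

# The Fan3 output exactly as posted above.
SAMPLE = """\
/SP/SystemInfo/Fan/Fan3
Targets:
Properties:
Designation = Axial Fan 0
Status = not present
CurrentValue = 0.000;
LowWarningValue = 0.000;
LowCriticalValue = 1952.000;
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    print(props["Designation"], "->", classify_fan(props))
```

With a made-up healthy reading, e.g. `{"Status": "ok", "CurrentValue": "3000.000", "LowCriticalValue": "1952.000"}`, the same function compares the reading against the low critical threshold instead of stopping at the status field.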
snejk at 2007-7-7 3:50:28 >

# 10
Hey Nico,
About your networking issue: make sure you've configured the Gateway correctly in AgentInfo (Gateway=x.x.x.x).
I had a similar problem to yours. I noticed the Gateway was set to 255.255.255.0, so it didn't work remotely. Probably my fault... Remote network access works OK now.
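A quick sanity check for this kind of mistake, sketched in Python with the standard `ipaddress` module. The function name is made up for illustration. In the example call, 192.168.1.2 is the default SP address mentioned earlier in the thread, while the 255.255.255.0 netmask and the 192.168.1.1 gateway are assumed values.

```python
import ipaddress

def check_gateway(ip, netmask, gateway):
    """Return a list of human-readable problems with the gateway setting."""
    problems = []
    # strict=False lets us build the network from a host address plus netmask.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    gw = ipaddress.ip_address(gateway)
    if gw == network.netmask:
        problems.append("gateway is identical to the netmask")
    if gw not in network:
        problems.append(f"gateway {gw} is outside {network}")
    elif gw in (network.network_address, network.broadcast_address):
        problems.append("gateway is the network or broadcast address")
    if gw == ipaddress.ip_address(ip):
        problems.append("gateway is the same as the SP's own address")
    return problems

if __name__ == "__main__":
    # The misconfiguration described above: netmask value entered as gateway.
    for problem in check_gateway("192.168.1.2", "255.255.255.0", "255.255.255.0"):
        print("WARNING:", problem)
    # An assumed, plausible gateway on the same subnet reports nothing.
    print(check_gateway("192.168.1.2", "255.255.255.0", "192.168.1.1"))
```

This only catches obviously inconsistent values; it cannot tell whether the gateway address is actually the right router.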
snejk at 2007-7-7 3:50:28 >

# 11
Are the interfaces on the X2100 LOM 10Mb like some other Sun servers? If you follow the old-fashioned practice of hardcoding speed and duplex on your switch ports you may have trouble.
That being said, we test-drove some X4100s a few months ago and the LOMs crashed repeatedly. Since there's no power switch on the LOM we had to yank power to cycle the LOMs and get them to come back to life.
And on our V210s the LOM would crash periodically (but still be pingable) until we upgraded to V1.5.5.
# 12
I upgraded the servers to 1.91 using a bootable USB stick but that didn't make any difference. The Sun guy promised he would send somebody to replace the motherboards but I haven't heard anything from Sun since.
I also upgraded my 3 other more-or-less-working X2100 M2's and guess what: one of them is suddenly claiming its axial fan is failing. On one of the other servers I also see the fans do strange things. The temperature readings seem OK so I guess/hope it's just the SP messing up.
Sigh...
Nico at 2007-7-7 3:50:29 >

# 13
I'm very sure all network settings are correct (I checked that a number of times and compared them with a working system). The NICs are gigabit and are connected to a gigabit switch (which is set to autonegotiate). I tried with a cable from one of the working servers but that didn't make any difference. So unless the port was fixed to a certain speed when it came from the factory, it shouldn't be a problem. I don't think there is a way to fix the speed from the SP anyway.
I vaguely remember there is a way to boot an X4200's SP to a Linux prompt instead of the lousy SP console. I wonder whether this is possible for the X2100 M2. After all it's just booting Linux anyway. Of course I can't find anything about it anymore so I may just have dreamed it :-).
Nico
Nico at 2007-7-7 3:50:29 >

# 14
"I also upgraded my 3 other more-or-less-working x2100M2's and guess what: one of them is suddenly claiming it's axial fan is failing. On one of the other servers I also see the fans do strange things."
Axial fan 0 broke (fail state) when upgrading to 1.91. I also got Blower Fan 0 to fail after running about 30 minutes with only power on (no OS booted). It started making a "whining" noise, and I really don't like this.
I also have some weird stuff in the event logs, some CPU drop warnings and a bad CMOS warning (since the last upgrade).
snejk at 2007-7-7 3:50:29 >

# 15
I finally convinced Sun to replace the motherboards of the 2 systems where the SP was unreachable and guess what... the network worked without a problem with the new motherboards.
Still waiting for a solution for the virtual cdrom. I guess this may be an issue on my side (all 5 servers now have that problem). A Sun guy will come by on Friday to look at it.
Nico
Nicoa at 2007-7-21 15:33:09 >

# 16
Regarding the Axial Fan 0 indication: I got the same thing when upgrading to v1.91. Plugging the fan into the white connector (designated for Axial Fan 1) clears up the issue. Despite being plugged into the connector for Axial Fan 1, it shows Axial Fan 0 as OK. Go figure. Sun tech support is trying to track it down for me. Did anyone get this resolved?
wwf_jr
# 17
wwf_jr,
Did you try the latest firmware?
Firmware: BIOS - 3A15, SP - 2.91
I didn't try it yet, but I hope it will solve it.
snejka at 2007-7-21 15:33:09 >

# 18
Firmware Version: 2.7
Release Date: 1/08/2007
18852 When firmware is updated to 2.50/3A11 Axial fan is fail
It should be fixed. Let's try it.
snejka at 2007-7-21 15:33:09 >

# 19
Still not fixed!
Axial fan 0 is still failing.
Anyone else have this problem?
snejka at 2007-7-21 15:33:09 >
