dead sun blade 1000
Hi all,
I appreciate the solution to this might be "go get a support contract", but
suggestions short of that would be very welcome.
One of our SunBlade 1000s fails on power-up with "Data Access Error".
set-defaults, reset-all, setenv diag-switch? true -- these and others
all result in a "Fast Data Access MMU Miss".
Boot fails:
ok boot
FATAL: system is not bootable, boot command is disabled
The only commands that work are help, printenv, banner and
viewing & navigating the device tree (dev, cd, ls, words).
I've pulled all the cards + disks, swapped RAM from a working system,
and swapped the CPU from a working system. No benefit.
banner shows
--
ok banner
, No Keyboard
Copyright 1998-2004 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.13.0, 0 MB memory installed, Serial #0.
Ethernet address 0:0:0:0:0:0, Host ID: 00000000.
The IDPROM contents are invalid
--
(what's supposed to be to the left of the comma in the first line
above, BTW?)
Swapping PROM from a working system does not correct this.
Removing the PROM entirely allows extended POST to complete,
and after lots of RAM-scrubbing, it all looks fine.
printenv shows some invalid entries, specifically
security-mode          12 (invalid value)    No default
security-password                            No default
security-#badlogins    205287023             No default
#power-cycles          205287023             No default
The mkp and mkpl commands that google shows up for fixing the
PROM are unknown to the OBP, and I can't see them in the output of
'words' either.
I'm guessing the motherboard is knackered, but would be very
grateful, not to mention impressed, if anyone has any fixes to offer.
Thanks in advance,
kidari
[1871 byte] By [kidari..] at [2007-11-26 11:09:57]

# 1
Hello,
the Blade 1000 uses an SEEPROM. Inserting (and programming!) this chip in a system with OBP 3.x (e.g. an Ultra 5) won't work; it's a completely different type of chip.
This is an excerpt from the Sun System Handbook:
"Beginning with the Sun Blade 1000, the IDPROM functionality is controlled by a Serial EEPROM. Separate circuitry controls the Time of Day clock."
I would suggest that you try the SEEPROM from the malfunctioning system in a good system (pull the disks to prevent the system from trying to boot). If it shows the same behaviour, the SEEPROM might be the cause.
(what's supposed to be to the left of the comma in the first line
above, BTW?)
Because the IDPROM contents are invalid, the OBP can't determine the system type (Blade 1000 or Blade 2000). Whether it's a workstation or a server (F280R) is determined by the presence of an RSC card (none detected = workstation, present = server). The information to the left of the comma is the system type (and number of CPUs).
Sample from the Service Manual:
Sun Blade 1000 2 (2 X UltraSPARC-III), Keyboard Present
OpenBoot 4.0, 256 MB memory installed, Serial #12134241.
Ethernet address 8:0:20:b9:27:61, Host ID: 80b92761.
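In the Service Manual sample above, the three identity values line up: the serial number (12134241) is 0xb92761 in hex, which matches both the low three bytes of the Host ID (80b92761) and the last three bytes of the Ethernet address (b9:27:61). That is an observation from this one sample, not a documented rule, so treat the check below as a rough sketch. The function names are made up for illustration; the only inputs are the banner lines as quoted in this thread.

```python
import re

def parse_banner(text):
    """Pull serial number, Ethernet address bytes, and host ID out of banner text."""
    serial = int(re.search(r"Serial #(\d+)", text).group(1))
    mac = re.search(r"Ethernet address ([0-9a-fA-F:]+)", text).group(1)
    hostid = int(re.search(r"Host ID: ([0-9a-fA-F]+)", text).group(1), 16)
    mac_bytes = [int(b, 16) for b in mac.split(":")]
    return serial, mac_bytes, hostid

def banner_is_consistent(text):
    """Assumption (taken from the sample only): the low 24 bits of the host ID
    equal the serial number and the last three Ethernet address bytes."""
    serial, mac_bytes, hostid = parse_banner(text)
    low24 = hostid & 0xFFFFFF
    mac_low24 = (mac_bytes[3] << 16) | (mac_bytes[4] << 8) | mac_bytes[5]
    return hostid != 0 and low24 == serial and low24 == mac_low24

good = """OpenBoot 4.0, 256 MB memory installed, Serial #12134241.
Ethernet address 8:0:20:b9:27:61, Host ID: 80b92761."""

bad = """OpenBoot 4.13.0, 0 MB memory installed, Serial #0.
Ethernet address 0:0:0:0:0:0, Host ID: 00000000."""

print(banner_is_consistent(good))
print(banner_is_consistent(bad))
```

With the all-zero values from the failing machine's banner, the check fails on the zero host ID alone, which fits the "IDPROM contents are invalid" message.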
A member of a German Sun Homeuser site got a spare IDPROM for under 100 Euro ($120). The part is pre-programmed but has a different HOSTID. If you use node-locked software, you have to decide whether it's less expensive to obtain a new license key from the vendor or to have Sun program the HOSTID.
One last question: There are at least 3 kidari
http://forum.java.sun.com/profile.jspa?userID=101836
http://forum.sun.com/jive/profile.jspa?userID=41992
http://forum.sun.com/jive/profile.jspa?userID=82779
plus the one from the "old" Sun Forums (supportforum.sun.com).
Is this one person?
Michael
# 2
Hi Michael,
Thank you for the helpful reply.
Yes, all the kidari with different amounts of trailing full stops are me --
well spotted!
I've been 'kidari' (no dots) since 1999, but periodically LDAP consolidation
at Sun has forced me to choose a new screen name "because the one
you are using is already in use"! The last time I pointed out that it
was in use by _me_, SDN support actually paid attention, so there
shouldn't be any more kidari around.
Anyhow, back to the question at hand. SEEPROM and IDPROM mean
the same thing, right?
Sun UK list new IDPROMs for GBP28 (Euro 18?), so I will just buy another
one if your suggestion about trying the 'bad' one in a good system confirms
that it _is_ the bad one. The price is very different to the one you quoted,
though, so I'm not sure it's the same part. Part no. 525-1788 is what I
have in mind, listed here under IDPROM/SEEPROM in the system handbook:
http://sunsolve.sun.com/handbook_private/Systems/SunBlade1000/components.html
What I've been swapping with another machine is about
10x13mm, with a yellow barcoded sticker on top.
When it's out, the system certainly complains about no PROM.
We've no licences to worry about: it's just a jumpstart server.
Thanks again for the suggestion of 'one last swap'! :-)
cheers,
kidari
kidari.
kidari..
&c
# 3
Hello,
525-1788 is the correct part (inserts into U201).
Because the OBP lacks the commands to re-program the "fixed" part of the IDPROM data, the part must be pre-programmed.
Sun has forced me to choose a new screen name "because the one
you are using is already in use"!
I registered on November 11, 1999 on the Sun Developer Forums as maalatft. On the Sun Forums (supportforum.sun.com) I used maal. When these forums were migrated into the Developer Forums, this shorter screen name was already taken too.
Michael
# 4
The PROM from the bad machine works fine in the good machine,
and the PROM from the good machine does not work (same
behaviour as before) in the bad machine.
Given that we've already swapped CPUs + RAM, I guess it's
the motherboard/PROM cradle that's frazzled. <sigh>
Last call for other suggestions?
Thanks,
kidari
# 5
Did you run the diagnostics with a serial console (to view the diag output)?
From the Blade 1000 Service Manual:
Stop-D Functionality
The Stop-D (diags) key sequence is not supported on systems with USB keyboards, however, the Stop-D functionality can be closely emulated by using the power button double-tap (see Stop-N Functionality), since this temporarily sets diag-switch? to true.
Stop-N Functionality
1. After turning on the power to your system, wait until the front panel power button LED begins to blink and you hear an audible beep.
2. Quickly press the front panel power button twice (similar to the way you would double-click a mouse).
A screen similar to the following is displayed to indicate that you have successfully reset NVRAM contents to the default values.
Michael
# 6
Hi Michael,
Yes, I attempted that, but the problem with the new USB 'double-click'
functionality is that it only works if you get as far as the "boot beep".
The Blade was failing long before that stage with the PROM in,
and although it ran extensive POST with no PROM, it never got as
far as the "boot beep".
Oh well.
Thank you for all of your suggestions,
cheers/tchuess,
kidari