Another Disk Disappeared!

This thing is killing me!

Sun Blade 150 1.25GB RAM / 2x 120GB HD, Solaris 10

While installing Tarantella on my system I got a "/var file system full" error and the system crashed. I tried to boot -s, but the system can't find the boot device.

Last time I was able to get access to the drive again but no matter what I tried it would NOT boot. So I replaced the drive, restored from my backup disk and all was ok.

I tried to load Tarantella again and the same thing happened!

After this last time I ran a couple monitor mode commands and got the following results:

ok test-all

Testing /pci@1f,0/SUNW,m64B@13

Testing hardware registers - passed Ok

Testing RamDAC - passed Ok

Testing Frame buffer - passed Ok

Testing /pci@1f,0/ide@d

ERROR: IDE Primary Command Block register4

SUMMARY : Obs=0xff Exp=0x01 XOR=0xfe Addr=0xfff50a04

DEVICE : /pci@1f,0/ide@d

SUBTEST : selftest:pri-cmd-blk-reg-test4

MACHINE : Sun Blade 150 (UltraSPARC-IIe 650MHz)

SERIAL# : 57174777

DATE: 02/21/2006 11:24:34GMT

CONTROLS: diag-level=max test-args=

/pci@1f,0/ide@d selftest failed, return code = 1

Testing /pci@1f,0/usb@c,3

Testing /pci@1f,0/firewire@c,2

Testing /pci@1f,0/network@c,1

Testing /pci@1f,0/pmu@3

Testing /pci@1f,0/ebus@c

Testing /pci@1f,0/usb@c,3/keyboard@4

Testing /pci@1f,0/isa@7/serial@0,2e8

Testing /pci@1f,0/isa@7/serial@0,3f8

Testing /pci@1f,0/isa@7/dma@0,0/parallel@0,378

Testing /pci@1f,0/isa@7/dma@0,0/floppy@0,3f0

Testing /pci@1f,0/ebus@c/flashprom@0,0

ok

ok probe-ide

Device 0 ( Primary Master )

Not Present

Device 1 ( Primary Slave )

Not Present

Device 2 ( Secondary Master )

ATA Model: WDC WD1200JB-00GVC0

Device 3 ( Secondary Slave )

Not Present

ok

Does anyone know of any issues with loading Tarantella (Solaris 8 build) on a Solaris 10 system?

I just loaded this product at work and was going to load it at home to get a bit smarter on the product. But at $80 a pop I'm not going to try again! And I won't be getting any smarter on Tarantella at home...

Any ideas would be appreciated.

Thanks,

Mike

[4924 byte] By [msargent] at [2007-11-25 23:08:01]
# 1

Mike has cross-posted this to the Solaris 10 forum.

http://forum.sun.com/thread.jspa?threadID=29176

The question probably fits better over there.

It appears to be a software issue, not hardware.

... a Solaris configuration question ...

-- issue only appears when installing a particular program.

-- uncertain whether that program is compatible with Solaris 10.

-- /var filesystem filled during both attempts to install and the system crashed.

-- insufficient information on whether the partition layout was modified (to make more room) before the second install attempt.

Bill at 2007-7-5 17:58:55 > top of Java-index,Sun Hardware,Workstations - General Discussion...
# 2

The main reason I posted to the hardware forum, in addition to the Sol 10 forum, was because I now have 2 dead hard drives (hardware).

I was curious whether something in the error messages I posted may indicate a problem with the IDE controller on the motherboard. The drives were plugged into different connections - the first dead drive was at c0t0d0 and the second at c0t2d0 -- both c0.

In the past I have seen software that will destroy hardware in Sun systems, and I am curious whether this may be the case here as well.
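For anyone reading the selftest output, the SUMMARY line can be checked by hand. The sketch below assumes, from the field names only, that Obs is the value read back, Exp is the value the test expected, and XOR is their bitwise difference; the arithmetic agrees with that reading, since 0xff XOR 0x01 is 0xfe. The helper name parse_summary is made up for this example. A read-back of all ones is often what an unresponsive bus looks like, so this may point at the controller, cable, or connector rather than the disk, but it is not proof on its own.

```python
import re

def parse_summary(line):
    """Parse an OBP selftest SUMMARY line into a dict of integers.

    Assumes fields of the form NAME=0xHEX, as in the posted output.
    """
    fields = dict(re.findall(r"(\w+)=(0x[0-9a-fA-F]+)", line))
    return {name: int(value, 16) for name, value in fields.items()}

summary = parse_summary("SUMMARY : Obs=0xff Exp=0x01 XOR=0xfe Addr=0xfff50a04")

# Recompute the difference between observed and expected values.
diff = summary["Obs"] ^ summary["Exp"]

# List which of the 8 register bits differ (bit 0 is the least significant).
differing_bits = [bit for bit in range(8) if (diff >> bit) & 1]

print(hex(diff), differing_bits)  # prints: 0xfe [1, 2, 3, 4, 5, 6, 7]
```

Every bit except bit 0 came back set when it should have been clear, which is why the register reads as 0xff instead of 0x01.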

Thanks

Mike

msargent at 2007-7-5 17:58:55
# 3

Additionally:

Before running the install, both times, I confirmed more than enough space available.

After I was able to access the drive (not boot from it) I checked, and the /var partition was NOT full and the inode usage was less than 20%.

So I do not believe space was the real issue with the system crash.

Thanks

msargent at 2007-7-5 17:58:55
# 4
Bill, just so you know... Sun bought Tarantella last summer... I imagine he is talking about the Sun Secure Global Desktop Software 4.2... looks interesting...

haroldkarl at 2007-7-5 17:58:55
# 5

msargent wrote on Tue, 21 February 2006 23:41:

"Additionally:

Before running the install, both times, I confirmed more than enough space available.

After I was able to access the drive (not boot from it) I checked, and the /var partition was NOT full and the inode usage was less than 20%.

So I do not believe space was the real issue with the system crash.

Thanks"

Can you run a format>analyze>read on the drives to confirm there is a hardware issue?

Bill, I'm just going to throw in a couple of questions to try to see if there is a hardware issue.

mlennon at 2007-7-5 17:58:55
# 6

Thanks, H.K.

I couldn't remember when the acquisition took place.

I've subsequently found the press release: http://www.sun.com/smi/Press/sunflash/2005-05/sunflash.20050510.1.html

Good, Martin.

Mike still didn't tell us how much was available in the /var partition.

Both failures were described in the initial post as happening right after an error message of "filesystem full".

Temporary files can magically disappear if their EOF isn't written.

It will be interesting to see if there is any solution.

Bill at 2007-7-5 17:58:55
# 7

The /var partition was configured with 20GB and when I checked it prior to the install it was only at 20% in use and i-node usage was only about 7%.

After I rebooted from the backup disk and mounted the partition it is still only at about 21% in use. And the i-node count is still well under 10%.
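For reference, the two percentages quoted above (space in use and i-nodes in use) are the kind of figures df reports. A minimal sketch of the same calculation, assuming a Python 3 interpreter is available on the machine; the function name usage_percent is made up for this example, and on the system itself df would be the usual tool.

```python
import os

def usage_percent(path):
    """Return (block usage %, inode usage %) for the filesystem holding path."""
    st = os.statvfs(path)

    # Blocks in use, measured against the space an ordinary user can reach
    # (used + available to non-root), which is how df computes its percentage.
    blocks_used = st.f_blocks - st.f_bfree
    reachable = blocks_used + st.f_bavail
    block_pct = 100.0 * blocks_used / reachable if reachable else 0.0

    # Some filesystems report zero total inodes; treat that as 0% used.
    inodes_used = st.f_files - st.f_ffree
    inode_pct = 100.0 * inodes_used / st.f_files if st.f_files else 0.0

    return block_pct, inode_pct

# Check /var if it exists as its own directory, otherwise fall back to /.
target = "/var" if os.path.isdir("/var") else "/"
block_pct, inode_pct = usage_percent(target)
print("%s: %.0f%% of space in use, %.0f%% of inodes in use"
      % (target, block_pct, inode_pct))
```

Running it before and after an install attempt would show whether /var ever actually approaches 100% on either measure.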

And yes SGD = Tarantella. I was trying to install v4.2.

Disk is still non-bootable. So my next thought was to redo the partitioning to see if writing the bootblock in a different location might help reclaim the drive.

Will also try the format->analyze->read to check the drive prior to the re-partitioning.

msargent at 2007-7-5 17:58:55
# 8

Could I ask which version of Solaris 10 you are using?

Is it Solaris 10 3/05, Solaris 10 3/05 HW1, or Solaris 10 1/06?

Have you installed the latest OS patch?

What is the OBP version of your machine?

What is the firmware version of your hard disk?

Lastly, what is the partition size of your /usr?

Try to maximize the /usr filesystem (try making it 10 GB or BIGGER), since you have a huge disk.

Good luck!

NdRosario at 2007-7-5 17:58:55