Ultra 60 won't boot after kernel patch installation: Fast Data Access MMU Miss

Hi people,

It seems I have a big problem here.

I was installing kernel patch 118558-16 when a sudden kernel panic (the patch was not fully installed yet) automatically sent the server to sync and reboot.

Now at boot I get a "Fast Data Access MMU Miss" just after the OS banner saying "Solaris 9 118558-16 ....".

Now I'm re-downloading the Solaris 9 installation CD to try to rescue the system.

I'm planning to boot from CD in single-user mode and try to back out the patch.

Any advice about this? Is there a chance to chroot and back out the patch, or maybe to re-install it while booted from CD?

Can someone please help?

Thank you

Roberto

[762 byte] By [Sirius] at [2007-11-25 23:06:34]
# 1

I would try the following first:

1. Try and boot the system into single user mode from the disks (boot -s from the ok prompt).

2. Try and boot the server into single user mode from the Solaris 1 of 2 cdrom (boot cdrom -s from the ok prompt).

If both of the above fail (which I expect they will), then it is likely to be a memory- or CPU-related issue. In that case it is a long, drawn-out process of taking the system down to a minimum config of 1 bank of memory and 1 CPU and building it up from there until you find the failing component(s).

stumoor at 2007-7-5 17:57:45 > top of Java-index,Sun Hardware,Workstations - General Discussion...
# 2
Of course, make sure you follow the correct ESD procedures while doing so.
stumoor at 2007-7-5 17:57:45
# 3

Tomorrow I'll go to where the server is hosted and try that, but what about the half-installed kernel patch?

If the system boots from CD-ROM in single-user mode, is there a quick way to reinstall the patch (maybe something like a chroot environment)?

Will the OBP test-all command tell me if there is a hardware problem?

I remember I had the same error with a Solaris 8 install CD, and the problem was that the kernel files were corrupt. So I thought this time the problem is that the kernel patch is only half applied.

Well, we will see tomorrow.

Thank you for the help!

If something else comes to mind, just post it here!

Thank you

Roberto

Sirius at 2007-7-5 17:57:45
# 4
You are right that kernel problems can cause this error, but it is more often caused by faulty CPUs or memory. Either way, you first need to find out whether the server will boot into single-user mode from either the internal disks or the CD-ROM.
stumoor at 2007-7-5 17:57:45
# 5

Hi,

The system booted in single-user mode from CD-ROM. I mounted and chrooted all the partitions and started transferring important files through scp, with no panics or problems in about 4 hours of work.

Oh, if anyone else needs to chroot the system for rescue, remember to do this:

umount /etc/mnttab

mkdir /mnt

mount /dev/dsk/[rootfilesystem] /mnt

mount /dev/dsk/[usr filesystem] /mnt/usr

chroot /mnt /bin/bash

mount -F mntfs mnttab /etc/mnttab

mount -o remount /

mount -o remount /usr

mount /proc

mountall

The /etc/mnttab step is really important, as without it the pkg* and patch* commands won't work.

Well, let's get back to my current problem.
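
For anyone who wants to review the sequence above before typing it on a live console, here is a minimal sketch that only prints the steps with the device names filled in. It does not mount or chroot anything. The function name print_rescue_steps is made up for this example, the device paths in the usage note are hypothetical placeholders for the elided [rootfilesystem] and [usr filesystem], and it uses mkdir -p where the post uses plain mkdir.

```shell
#!/bin/bash
# Sketch only: prints the chroot rescue steps from the post for review.
# print_rescue_steps is a hypothetical helper, not a Solaris tool.

print_rescue_steps() {
    root_dev=$1          # slice holding the root filesystem
    usr_dev=$2           # slice holding /usr
    target=${3:-/mnt}    # mount point used for the chroot (default /mnt)

    if [ -z "$root_dev" ] || [ -z "$usr_dev" ]; then
        echo "usage: print_rescue_steps ROOT_DEV USR_DEV [MOUNTPOINT]" >&2
        return 1
    fi

    # Nothing below is executed; the here-document is just printed.
    cat <<EOF
# On the system booted from CD:
umount /etc/mnttab
mkdir -p $target
mount $root_dev $target
mount $usr_dev $target/usr
chroot $target /bin/bash
# Inside the chroot:
mount -F mntfs mnttab /etc/mnttab
mount -o remount /
mount -o remount /usr
mount /proc
mountall
EOF
}
```

Example call, assuming (hypothetically) that root is on c0t0d0s0 and /usr on c0t0d0s6: print_rescue_steps /dev/dsk/c0t0d0s0 /dev/dsk/c0t0d0s6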

/var/sadm/install/contents was corrupted but I recovered it easily. Then I re-installed the 118558-16 patch, but I still get the MMU error.

Then I tried booting from the disks again and BANG > Fast Data Access MMU Miss

Now I'm planning to remove all patches applied on the day of the crash and see what happens...

Oh, does anybody know a way to boot from the disks while loading the kernel from CD-ROM?

I'll keep you updated...
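
One possible way to find the patches from the day of the crash is sketched below. It assumes that each installed patch has its own directory under var/sadm/patch on the damaged root and that the directory's modification time reflects when the patch was applied; both are assumptions, so check the result against showrev -p or the patch logs before removing anything. The function name patches_touched_on is made up for this example, and the script needs bash because of the -nt test.

```shell
#!/bin/bash
# Sketch only: list patch directories last modified on a given day.
# Assumes one directory per patch under PATCH_DIR (for example
# /mnt/var/sadm/patch) whose mtime matches the install time.

patches_touched_on() {
    patch_dir=$1    # directory holding one subdirectory per patch ID
    day=$2          # date as YYYYMMDD

    # Reference files marking the start and end of the day.
    # The window runs from 00:00 to 23:59 local time.
    start=/tmp/patchday.$$.start
    end=/tmp/patchday.$$.end
    touch -t "${day}0000" "$start" || return 1
    touch -t "${day}2359" "$end" || return 1

    for d in "$patch_dir"/*; do
        [ -d "$d" ] || continue
        # Keep directories newer than the start but not newer than the end.
        if [ "$d" -nt "$start" ] && [ ! "$d" -nt "$end" ]; then
            basename "$d"
        fi
    done

    rm -f "$start" "$end"
}
```

Each ID it prints is only a candidate. It can then be backed out with patchrm from inside the chroot, or with patchrm -R pointing at the mounted root when working from the CD without a chroot. The order of removal may matter if the patches depend on each other.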

Roberto

Sirius at 2007-7-5 17:57:45
# 6

You say that you were able to boot the system from CD-ROM into single-user mode. Were you also able to boot the server into single-user mode from the disks? Or, like the full boot from the disks that you mentioned, did it return a Fast Data Access MMU Miss?

The steps you are taking to remove the patches that were applied that day seem logical to me, and I would be interested to find out whether the server will boot after you have removed them.

You said that you recovered the corrupt /var/sadm/install/contents file. How did you go about doing this?

It is possible, however, that the OS image is just too far gone to be recovered, and it might be worth considering a reinstall. But until you do so, don't rule out a possible hardware issue.

Keep us updated.

stumoor at 2007-7-5 17:57:45