URGENT: Mirror Problem.

Our /var partition filled up last night and crashed the machine, with various errors writing to the mirrored disk. I was able to bring it back up and clear some space on /var, but metastat now reports Needs maintenance on both submirrors, as shown below. This is the case for all the mirrors.

Also, on boot I get the following errors:

WARNING: forceload of misc/md_trans failed

WARNING: forceload of misc/md_raid failed

WARNING: forceload of misc/md_hotspares failed

WARNING: forceload of misc/md_sp failed

I do not recall getting these before, but I am not overly familiar with this machine. Any ideas on what is going on with the submirrors? Is it a bad disk, or do I just need to fix something caused by the full partition?

Any help would be great!

Thanks,

Darren

d0: Mirror

Submirror 0: d10

State: Needs maintenance

Submirror 1: d20

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 1027776 blocks

d10: Submirror of d0

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d0 c1t0d0s0 <new device>

Size: 1027776 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s0    0            No     Last Erred

d20: Submirror of d0

State: Needs maintenance

Invoke: metareplace d0 c1t1d0s0 <new device>

Size: 1027776 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s0    0            No     Maintenance

[1570 byte] By [DarrenLCCa] at [2007-11-27 5:35:31]
# 1

Hi

The following messages are NORMAL, do not worry about them:

WARNING: forceload of misc/md_trans failed

WARNING: forceload of misc/md_raid failed

WARNING: forceload of misc/md_hotspares failed

WARNING: forceload of misc/md_sp failed

It simply means that those modules were not loaded because the features they support are not configured.
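To see where these warnings come from, it can help to list the forceload entries the system tries to honor at boot. The sketch below assumes the entries live in /etc/system, which is where SVM normally records them when the root file system is on a metadevice; the function name list_forceloads is made up for illustration.

```shell
#!/bin/sh
# list_forceloads FILE
# Print the module named on each "forceload:" line of an
# /etc/system-style file. (Hypothetical helper name.)
list_forceloads() {
    awk '$1 == "forceload:" { print $2 }' "$1"
}

# Typical use on the affected machine (assumption: default file location):
#   list_forceloads /etc/system
```

If the list includes misc/md_trans, misc/md_raid, misc/md_hotspares, or misc/md_sp, and metastat shows none of those metadevice types in use, the warnings match the harmless case described above.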

Do you have a valid backup of /var?

goSolarisa at 2007-7-12 15:04:42 > top of Java-index,Solaris Operating System,Solaris 10 Features...
# 2

I would have to confirm this but I believe we do.

Should I run the metareplace command as it says?

I am reading a post that suggests running:

# metareplace -e d6 c1t0d0s6

# metareplace -e d6 c1t1d0s6

Of course, this would be for the d6 mirror shown below. I am not sure what the metareplace command does, nor what the -e option is for.

d6: Mirror

Submirror 0: d16

State: Needs maintenance

Submirror 1: d26

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 17339904 blocks

d16: Submirror of d6

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d6 c1t0d0s6 <new device>

Size: 17339904 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s6    0            No     Last Erred

d26: Submirror of d6

State: Needs maintenance

Invoke: metareplace d6 c1t1d0s6 <new device>

Size: 17339904 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s6    0            No     Maintenance

DarrenLCCa at 2007-7-12 15:04:42
# 3

Just to confirm: I am getting this error for all the mirrors, as shown below. I just used the ones above as examples. Here is the complete metastat output.

raven:root bash / # metastat

d0: Mirror

Submirror 0: d10

State: Needs maintenance

Submirror 1: d20

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 1027776 blocks

d10: Submirror of d0

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d0 c1t0d0s0 <new device>

Size: 1027776 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s0    0            No     Last Erred

d20: Submirror of d0

State: Needs maintenance

Invoke: metareplace d0 c1t1d0s0 <new device>

Size: 1027776 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s0    0            No     Maintenance

d1: Mirror

Submirror 0: d11

State: Needs maintenance

Submirror 1: d21

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 32776896 blocks

d11: Submirror of d1

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d1 c1t0d0s1 <new device>

Size: 32776896 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s1    0            No     Last Erred

d21: Submirror of d1

State: Needs maintenance

Invoke: metareplace d1 c1t1d0s1 <new device>

Size: 32776896 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s1    0            No     Maintenance

d3: Mirror

Submirror 0: d13

State: Needs maintenance

Submirror 1: d23

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 30721344 blocks

d13: Submirror of d3

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d3 c1t0d0s3 <new device>

Size: 30721344 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s3    0            No     Last Erred

d23: Submirror of d3

State: Needs maintenance

Invoke: metareplace d3 c1t1d0s3 <new device>

Size: 30721344 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s3    0            No     Maintenance

d4: Mirror

Submirror 0: d14

State: Needs maintenance

Submirror 1: d24

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 30721344 blocks

d14: Submirror of d4

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d4 c1t0d0s4 <new device>

Size: 30721344 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s4    0            No     Last Erred

d24: Submirror of d4

State: Needs maintenance

Invoke: metareplace d4 c1t1d0s4 <new device>

Size: 30721344 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s4    0            No     Maintenance

d5: Mirror

Submirror 0: d15

State: Needs maintenance

Submirror 1: d25

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 30721344 blocks

d15: Submirror of d5

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d5 c1t0d0s5 <new device>

Size: 30721344 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s5    0            No     Last Erred

d25: Submirror of d5

State: Needs maintenance

Invoke: metareplace d5 c1t1d0s5 <new device>

Size: 30721344 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s5    0            No     Maintenance

d6: Mirror

Submirror 0: d16

State: Needs maintenance

Submirror 1: d26

State: Needs maintenance

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 17339904 blocks

d16: Submirror of d6

State: Needs maintenance

Invoke: after replacing "Maintenance" components:

metareplace d6 c1t0d0s6 <new device>

Size: 17339904 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t0d0s6    0            No     Last Erred

d26: Submirror of d6

State: Needs maintenance

Invoke: metareplace d6 c1t1d0s6 <new device>

Size: 17339904 blocks

Stripe 0:

Device      Start Block  Dbase  State        Hot Spare

c1t1d0s6    0            No     Maintenance

d7: Mirror

Submirror 0: d30

State: Okay

Submirror 1: d40

State: Okay

Pass: 1

Read option: roundrobin (default)

Write option: parallel (default)

Size: 20972736 blocks

DarrenLCCa at 2007-7-12 15:04:42
# 4

Well, assuming your machine is still up and running, it seems unlikely that all the disks have failed. So presumably SVM has just gotten confused.

The way to clear the errors is

metareplace -e d0 c1t0d0s0

and so on for each errored slice in each mirror.
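With this many mirrors it may be easier to generate the commands than to type them. The sketch below reads metastat output on stdin and only prints the metareplace -e commands; it does not run them. It lists slices in Maintenance before slices in Last Erred, which is the order the SVM documentation recommends. The function name plan_metareplace is made up, and the slice pattern assumes cNtNdNsN names like the ones in this thread.

```shell
#!/bin/sh
# plan_metareplace: read `metastat` output on stdin and print (not run)
# one `metareplace -e` command per errored slice.
# Slices in "Maintenance" come first, then those in "Last Erred".
# (Hypothetical helper; assumes slice names of the form cNtNdNsN.)
plan_metareplace() {
    awk '
        # "d10: Submirror of d0" -> remember the parent mirror (d0)
        $2 == "Submirror" && $3 == "of" { mirror = $4; next }

        # Component lines start with a slice name such as c1t0d0s0
        $1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ && mirror != "" {
            if ($0 ~ /Last Erred/)       last[++nl]  = mirror " " $1
            else if ($0 ~ /Maintenance/) maint[++nm] = mirror " " $1
        }

        END {
            for (i = 1; i <= nm; i++) print "metareplace -e " maint[i]
            for (i = 1; i <= nl; i++) print "metareplace -e " last[i]
        }
    '
}

# Typical use (on Solaris, nawk may be needed in place of awk):
#   metastat | plan_metareplace
```

Review the printed list before running anything, and let each resync finish (watch metastat) before enabling the Last Erred side of the same mirror.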

Of course, if there's some hardware error that actually triggered the problems, they will just come back.

It could be a flaky controller, SCSI cable, etc.

You might want to examine iostat -En to see if you're getting hard or transport errors.

And looking through /var/adm/messages is also useful.
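The iostat -En check can be narrowed to just the disks that show trouble. This is a rough sketch that assumes the usual summary line of the form "c1t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0"; the function name errored_disks is made up.

```shell
#!/bin/sh
# errored_disks: read `iostat -En` output on stdin and print each device
# whose summary line shows a non-zero hard or transport error count.
# (Hypothetical helper; assumes the standard "Soft Errors: ..." line.)
errored_disks() {
    awk '
        /Soft Errors:/ {
            hard = 0; transport = 0
            for (i = 1; i < NF; i++) {
                if ($i == "Hard" && $(i+1) == "Errors:")      hard = $(i+2)
                if ($i == "Transport" && $(i+1) == "Errors:") transport = $(i+2)
            }
            if (hard + 0 > 0 || transport + 0 > 0)
                print $1 ": hard=" hard " transport=" transport
        }
    '
}

# Typical use:
#   iostat -En | errored_disks
```

No output means no disk reported hard or transport errors. These counters are kept since boot, so a clean result after last night's crash and reboot does not rule out a hardware fault.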

robert.cohena at 2007-7-12 15:04:42