URGENT: Mirror Problem.
Our /var partition filled up last night and crashed the machine with various errors writing to the mirrored disk. I was able to bring it back up and clear some space on /var, but now metastat is reporting "Needs maintenance" on both submirrors, as shown below. This is present for all the mirrors.
Also, on boot I get the following warnings:
WARNING: forceload of misc/md_trans failed
WARNING: forceload of misc/md_raid failed
WARNING: forceload of misc/md_hotspares failed
WARNING: forceload of misc/md_sp failed
I do not recall seeing these before, but I am not overly familiar with this machine. Any ideas on what is going on with the submirrors? Is it a bad disk, or do I just need to fix something that was caused by the full partition?
Any help would be great!
Thanks,
Darren
d0: Mirror
Submirror 0: d10
State: Needs maintenance
Submirror 1: d20
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 1027776 blocks
d10: Submirror of d0
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d0 c1t0d0s0 <new device>
Size: 1027776 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s0   0            No     Last Erred
d20: Submirror of d0
State: Needs maintenance
Invoke: metareplace d0 c1t1d0s0 <new device>
Size: 1027776 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s0   0            No     Maintenance
By DarrenLCCa at 2007-11-27 5:35:31

# 2
I would have to confirm this but I believe we do.
Should I run the metareplace command as it says?
I am reading a post that suggests running:
# metareplace -e d6 c1t0d0s6
# metareplace -e d6 c1t1d0s6
Of course, this would be for the d6 mirror shown below. I am not sure what the metareplace command does, nor what the -e option does.
d6: Mirror
Submirror 0: d16
State: Needs maintenance
Submirror 1: d26
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 17339904 blocks
d16: Submirror of d6
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d6 c1t0d0s6 <new device>
Size: 17339904 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s6   0            No     Last Erred
d26: Submirror of d6
State: Needs maintenance
Invoke: metareplace d6 c1t1d0s6 <new device>
Size: 17339904 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s6   0            No     Maintenance
# 3
Just to confirm: I am getting this state for all the mirrors, as shown below. I only used the ones above as examples. Here is the complete metastat output.
raven:root bash / # metastat
d0: Mirror
Submirror 0: d10
State: Needs maintenance
Submirror 1: d20
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 1027776 blocks
d10: Submirror of d0
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d0 c1t0d0s0 <new device>
Size: 1027776 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s0   0            No     Last Erred
d20: Submirror of d0
State: Needs maintenance
Invoke: metareplace d0 c1t1d0s0 <new device>
Size: 1027776 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s0   0            No     Maintenance
d1: Mirror
Submirror 0: d11
State: Needs maintenance
Submirror 1: d21
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 32776896 blocks
d11: Submirror of d1
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d1 c1t0d0s1 <new device>
Size: 32776896 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s1   0            No     Last Erred
d21: Submirror of d1
State: Needs maintenance
Invoke: metareplace d1 c1t1d0s1 <new device>
Size: 32776896 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s1   0            No     Maintenance
d3: Mirror
Submirror 0: d13
State: Needs maintenance
Submirror 1: d23
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 30721344 blocks
d13: Submirror of d3
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d3 c1t0d0s3 <new device>
Size: 30721344 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s3   0            No     Last Erred
d23: Submirror of d3
State: Needs maintenance
Invoke: metareplace d3 c1t1d0s3 <new device>
Size: 30721344 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s3   0            No     Maintenance
d4: Mirror
Submirror 0: d14
State: Needs maintenance
Submirror 1: d24
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 30721344 blocks
d14: Submirror of d4
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d4 c1t0d0s4 <new device>
Size: 30721344 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s4   0            No     Last Erred
d24: Submirror of d4
State: Needs maintenance
Invoke: metareplace d4 c1t1d0s4 <new device>
Size: 30721344 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s4   0            No     Maintenance
d5: Mirror
Submirror 0: d15
State: Needs maintenance
Submirror 1: d25
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 30721344 blocks
d15: Submirror of d5
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d5 c1t0d0s5 <new device>
Size: 30721344 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s5   0            No     Last Erred
d25: Submirror of d5
State: Needs maintenance
Invoke: metareplace d5 c1t1d0s5 <new device>
Size: 30721344 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s5   0            No     Maintenance
d6: Mirror
Submirror 0: d16
State: Needs maintenance
Submirror 1: d26
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 17339904 blocks
d16: Submirror of d6
State: Needs maintenance
Invoke: after replacing "Maintenance" components:
metareplace d6 c1t0d0s6 <new device>
Size: 17339904 blocks
Stripe 0:
Device     Start Block  Dbase  State       Hot Spare
c1t0d0s6   0            No     Last Erred
d26: Submirror of d6
State: Needs maintenance
Invoke: metareplace d6 c1t1d0s6 <new device>
Size: 17339904 blocks
Stripe 0:
Device     Start Block  Dbase  State        Hot Spare
c1t1d0s6   0            No     Maintenance
d7: Mirror
Submirror 0: d30
State: Okay
Submirror 1: d40
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 20972736 blocks
# 4
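Well, assuming your machine is still up and running, it seems unlikely that all the disks have failed, so presumably SVM has just got confused.
The way to clear the errors is to re-enable each errored component in place with metareplace -e, which also starts a resync of that component: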
metareplace -e d0 c1t0d0s0
etc.
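
With this many mirrors it may be easier to generate the command list than to type it. Below is a rough sketch, not a tested procedure: a dry-run helper that reads metastat output on stdin and only prints the matching metareplace -e commands, so you can review them before running anything. The function name suggest_metareplace is made up for illustration. It assumes the "Invoke:" hints look like the ones in the output posted above, and it lists "Maintenance" components before "Last Erred" ones, which is the order I recall the SVM documentation recommending. On Solaris, nawk may be a safer choice than the stock awk.

```shell
#!/bin/sh
# Dry run: read `metastat` output on stdin and print (never execute)
# one `metareplace -e` command per component that metastat flags.
# suggest_metareplace is a hypothetical helper name, not an SVM tool.
suggest_metareplace() {
    awk '
        # metastat puts this hint in front of "Last Erred" components;
        # the metareplace suggestion follows on the same or the next line.
        /after replacing/ { defer = 1 }
        {
            for (i = 1; i + 2 <= NF; i++) {
                if ($i == "metareplace") {
                    cmd = "metareplace -e " $(i + 1) " " $(i + 2)
                    # Hold "Last Erred" components back until the end.
                    if (defer) { late[++n] = cmd } else { print cmd }
                    defer = 0
                }
            }
        }
        END { for (j = 1; j <= n; j++) print late[j] }
    '
}
```

Usage would be along the lines of `metastat | suggest_metareplace`, then running the printed commands by hand one at a time and checking metastat for the resync to finish in between.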
Of course, if there's some hardware error that actually triggered the problems, they will just come back.
It could be a flaky controller, SCSI cable, etc.
You might want to examine iostat -En to see if you're getting hard or transport errors.
Looking through /var/adm/messages is also useful.
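
To pick the interesting lines out of iostat -En, something like the following might help. It is only a sketch: disks_with_errors is a made-up name, and it assumes each device summary line has the form "c1t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0", so check that against what your system actually prints.

```shell
#!/bin/sh
# Read `iostat -En` output on stdin and print only the devices that
# report a non-zero hard or transport error count.
# disks_with_errors is a hypothetical helper name; the line layout it
# expects is an assumption (see the note above).
disks_with_errors() {
    awk '
        /Hard Errors:/ {
            hard = ""; transport = ""
            for (i = 1; i < NF; i++) {
                if ($i == "Hard" && $(i + 1) == "Errors:") hard = $(i + 2)
                if ($i == "Transport" && $(i + 1) == "Errors:") transport = $(i + 2)
            }
            # Adding 0 forces a numeric comparison.
            if (hard + 0 > 0 || transport + 0 > 0)
                print $1 " hard=" hard " transport=" transport
        }
    '
}
```

Run it as `iostat -En | disks_with_errors`. No output means no device reported hard or transport errors, in which case the full /var is the more likely culprit.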