State database replicas

Hello all

I have a Solaris 9 server with 3 disks:

The first one holds the OS in several slices.

The other two make up a mirror I created using the management console.

I created 3 state database replicas, one on each disk: the master on c1t0d0s5 (OS disk, unused slice) and the other two on c1t1d0s0 and c1t2d0s0 (the two disks that make up the mirror).

I then created a file system on c1t0d0s5 using newfs and started copying some files there, at which point the system crashed. When it came back up I noticed the database replica there was damaged, so I deleted it, created a new one, and stopped using this slice altogether!

My questions:

- In case of a disk failure on the OS disk, is it possible to replace it, reinstall Solaris and then "plug" the array back in using the database replicas from the other 2 disks?

- Shouldn't the kernel itself protect the db replicas so that they don't get damaged when using the slice?

- Is it possible to create another database replica on the slice holding the / filesystem without damaging it?

Thanks in advance,

Billy

[1132 byte] By [billypg] at [2007-11-25 23:02:01]
# 1

This configuration sounds a bit unusual. Planning what I call a "root mirror" would involve creating a separate slice 5 and/or 6, used for state database replicas only, on all disks that participate in the mirrored volume. An ideal configuration might be to create a VTOC on the primary disk, including small slices (dedicated to metadbs), and copy the VTOC to the remaining drives that will take part in the volume. Of course this could prove tricky if the system is configured with just one drive and the additional drives are intended to add availability in a second phase. In that scenario I would tend to dump the slices and implement a new strategy for the mirror, starting from the beginning (but this may not be a practical approach on a live system).

On the first question: SVM would sync the data between the mirrored drives after configuration, and OBP is then programmed to boot in sequence (rootdisk, rootmirr, etc.). So if a disk fails and crashes the system, or begins to echo errors in the log, the system can be recovered from the next drive in the boot sequence configured in OBP, manually or automatically.

On the second point: the kernel needs at least 50% of the metadbs to be available to keep running, so if you spread them across a number of slices on separate drives you can achieve better availability.

On the third: I don't think you can create a metadb on a mounted slice.

I am about to quit for today, so I'll root around and see if I can find a good reference tomorrow. There are some really good hits on Google if you search for Solaris Volume Manager (SVM).

<a href="http://www.idevelopment.info/data/Unix/Solaris/UNIX_Solaris_home.shtml" target="_blank">http://www.idevelopment.info/data/Unix/Solaris/UNIX_Solaris_home.shtml</a>

mlennon at 2007-7-5 17:51:27 > top of Java-index,Storage Forums,Storage General Discussion...
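
To make the "50%" remark above concrete, here is a minimal sketch of the replica rules as I understand them from the Solaris Volume Manager documentation: the system keeps running while at least half of the replicas are available, and it needs a majority (half + 1) to reboot into multiuser mode. The function name `replica_status` is made up for illustration; it is not part of any SVM tool.

```python
def replica_status(total, available):
    """Return (keeps_running, can_boot_multiuser) for a replica count.

    Assumed rules (my reading of the SVM docs, not taken from this thread):
      - the system keeps running with at least half of the replicas
      - a reboot into multiuser mode needs a majority (half + 1)
    """
    if total <= 0 or not 0 <= available <= total:
        raise ValueError("need 0 <= available <= total and total > 0")
    keeps_running = available * 2 >= total
    can_boot_multiuser = available >= total // 2 + 1
    return keeps_running, can_boot_multiuser


if __name__ == "__main__":
    # Billy's layout: 3 replicas, one per disk, with one replica lost.
    print(replica_status(3, 2))
    # An even count split over two disks, with one disk lost.
    print(replica_status(4, 2))
```

The second call is the awkward case: with exactly half of the replicas left, the box stays up but would not come back unattended after a reboot, which is one reason to avoid splitting replicas evenly across only two disks.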
# 2

Thanks for replying.

I didn't know I could make a software array even for the root partition, so that's why I only have one mirror. Here's my setup again:

1 disk for regular unix partitions such as / /etc /var /usr and so on.

2 disks in a mirror for a /data partition.

3 database replicas: one on each disk.

My first question was about the event of having to replace the first disk and reinstall Solaris from scratch. The root partition is not mirrored, and the array is still intact on the 2 other disks (with 2 replicas of the state db). How would I go about using this array on a newly installed system? That is: how do I re-create the device to mount the existing array on the new system?

Thanks again

billypg at 2007-7-5 17:51:27
# 3

Let me get this straight

root disk + metadb

datadisk + metadb

datadisk + metadb

You then want to be able to recreate DiskSuite should you lose the root disk, by taking the data from the metadbs still in existence... 'fraid not.

Lose the root disk and you have to recreate the DiskSuite configuration after you have re-installed the OS. I would be looking at about 3 hours' work in the office for this, but we have scripts to help us. After you have recreated the SDS setup you will need to mount your data again. The old metadbs are about as useful as a wet paper bag; delete them and create new ones. Also, it is useful to have three on each disk, just in case the block a metadb has been written to actually gets corrupted.

Best solution, find an external disk of the right size and mirror the root disk to this. Root disk without a mirror is just asking for trouble.

bmacdo at 2007-7-5 17:51:27
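
A rough way to see why several replicas per disk help: the sketch below compares one replica per disk with three per disk when a data disk dies and one more replica elsewhere has a bad block. The majority rule (half + 1) is my understanding of SVM behaviour, and the helper names are hypothetical; the disk names follow the layout described in the thread.

```python
def surviving_replicas(layout, failed_disks=(), corrupted=0):
    """Count usable replicas.

    layout maps a disk name to the number of replicas placed on it.
    corrupted is the number of extra replicas lost on otherwise healthy disks.
    """
    total = sum(layout.values())
    alive = sum(n for disk, n in layout.items() if disk not in failed_disks)
    return max(alive - corrupted, 0), total


def has_majority(alive, total):
    # Assumed SVM rule: a majority means half of the replicas plus one.
    return alive >= total // 2 + 1


if __name__ == "__main__":
    one_each = {"c1t0d0": 1, "c1t1d0": 1, "c1t2d0": 1}
    three_each = {"c1t0d0": 3, "c1t1d0": 3, "c1t2d0": 3}
    for label, layout in (("one each", one_each), ("three each", three_each)):
        alive, total = surviving_replicas(layout, {"c1t1d0"}, corrupted=1)
        print(label, alive, "of", total, "majority:", has_majority(alive, total))
```

With one replica per disk, a dead disk plus a single bad block leaves 1 of 3; with three per disk the same failures leave 5 of 9, which is still a majority.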
# 4
Thanks Brian, I didn't have time to follow up on this one earlier today.
mlennon at 2007-7-5 17:51:27
# 5
Been a long time Martin mate, will try and be more active this year, possibly if I stay out of the pub long enough to make a post that will happen :-)
bmacdo at 2007-7-5 17:51:27
# 6

bmacdo wrote on Thu, 05 January 2006 15:19: "possibly if I stay out of the pub long enough to make a post that will happen :-)"

I was worried for a while that you had been off retraining to support Opteron hardware and Windows...

mlennon at 2007-7-5 17:51:27
# 7

bmacdo wrote on Thu, 05 January 2006 15:19: "possibly if I stay out of the pub long enough to make a post that will happen :-)"

m-lennon wrote on Thu, 05 January 2006 14:22: "I was worried for a while that you had been off retraining to support Opteron hardware and Windows..."

Reboot, wait 10 minutes, ping box. Yeah had that four month training course :-)

bmacdo at 2007-7-5 17:51:27
# 8
He he!
mlennon at 2007-7-5 17:51:27
# 9

It doesn't matter if it takes 3 hours. I need to know if it's possible to recreate the array and preserve the data already there after reinstalling the OS.

Why are those state db replicas useless? In fact, once my primary state db replica got corrupted and the system switched fine to one of the replicas on another disk.

Thanks

billypg at 2007-7-5 17:51:27
# 10

I think this is the solution for reattaching the array should I have to reinstall Solaris on the first disk. Let me know if I'm missing something and whether this would work:

1. Restore the file /etc/lvm/md.cf (one can create a copy of it by typing metastat -p and redirecting the output to a file such as /etc/opt/SUNWmd/md.tab before the crash)

2. Use metainit -a to make the configuration active.

3. Mount the array again.

Does step 2 recreate the devices /dev/md/dsk and /dev/md/rdsk?

Thanks.

billypg at 2007-7-5 17:51:27
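
If a copy of the `metastat -p` output is kept somewhere off the root disk, as described in step 1 above, a small script can at least confirm which physical slices each metadevice sits on before anything is re-created. The sketch below parses only the two simplest line forms (a mirror line and a one-component stripe/concat line); the sample text, metadevice names and slice numbers are hypothetical, since the thread never shows the real output. As far as I know, Solaris 9 keeps the SVM files under /etc/lvm, and the /etc/opt/SUNWmd path comes from older DiskSuite releases.

```python
import re

# Hypothetical `metastat -p` output; names and slices are made up.
SAMPLE = """\
d0 -m d1 d2 1
d1 1 1 c1t1d0s1
d2 1 1 c1t2d0s1
"""

SLICE = re.compile(r"c\d+t\d+d\d+s\d+")


def parse_md_tab(text):
    """Map each metadevice name to (kind, direct members)."""
    volumes = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments and blank lines
        if not fields:
            continue
        name, rest = fields[0], fields[1:]
        if rest[:1] == ["-m"]:
            # Mirror line: submirror names, then an optional pass number.
            members = [f for f in rest[1:] if re.fullmatch(r"d\d+", f)]
            volumes[name] = ("mirror", members)
        else:
            # Stripe/concat line: keep only the cXtYdZsN components.
            members = [f for f in rest if SLICE.fullmatch(f)]
            volumes[name] = ("stripe", members)
    return volumes


def slices_for(volumes, name):
    """Resolve a metadevice down to the physical slices underneath it."""
    kind, members = volumes[name]
    if kind == "stripe":
        return list(members)
    found = []
    for sub in members:
        found.extend(slices_for(volumes, sub))
    return found


if __name__ == "__main__":
    vols = parse_md_tab(SAMPLE)
    print(slices_for(vols, "d0"))
```

This only reads the saved text; it does not touch the metadevices, so it is safe to run on a laptop or PC where the backup copy lives.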
# 11

Been away so that's why the late response, sorry.

Right, putting the replica db on your root disk is not going to work; it would have to be on your data disks. The reason is that the worst-case scenario for you is to lose your root disk, at which point you cannot get any data off it. No point in the copy being there.

You could put a copy on another system, but that really is the same thing as we discussed before, getting redundancy for your root disk. If you want it on the same system, put it on the data disks.

The scripts are only in essence text files you make executable, so keep them on a laptop/pc if you want, I couldn't care less where they are kept to be fair. You cannot however keep them on a disk that is toast.

I will look into the other metadb's being useful, but I would strongly suggest you take the time after three hours work to just nuke the buggers and re-create new ones. Think I can recreate three more on each disk in about 10-15 seconds.

YOUR DATA IS STILL SAFE (assuming the mirror was fine before the crash), you are just recreating SVM with a new Solaris build. As far as the /dev/dsk pathing goes, it will be the same as before because Solaris will pick up the disks in the same order (assuming you do not do anything like changing card positions within the server and adding new disks in the wrong place).

Think that should cover it.

bmacdo at 2007-7-5 17:51:27