Controlling HADB - thoughts
I'm looking for opinions on how people control the startup/shutdown of their HADB databases in Appserver 8.x EE.
By default, the installer will install a startup script (/etc/init.d/ma-initd) which is invoked at various runlevels to start/stop the management agent for HADB.
However, this seems like only part of the problem. The other part is ensuring the database is started/stopped cleanly (on all nodes) when a system starts/stops.
We've been in the situation whereby an unclean shutdown of one node (or both) can cause the HADB to become corrupt. When this happens, it doesn't start properly, and the only way I've found to remedy it is to remove it altogether from the appserver config, which is extremely time consuming given that you need to bounce all instances to do this. Last time it took me upwards of three hours to restore a broken HADB.
Given this, it seems to me very important that the database is well maintained. So I'm trying to figure out what others do to stop/restart it, in terms of startup scripts or SMF, i.e. not the ma component, but rather the hadbm commands, e.g. hadbm stop <>
Problem is, you don't want one server to issue a "stop" when it's shut down, because presumably you want the other HADB server to continue to run. Also, when one server is brought up, issuing an hadbm start won't be appropriate if the database is already running.
Or is this a problem which is simply too hard to script? Should it all be manual, i.e. an administrator issuing "hadbm stopnode <>" on the node which is being stopped?
Any feedback or ideas on this would be appreciated. I'm finding HADB great when it works, but when it doesn't (or is corrupt) it becomes a real nightmare.
thanks
[1795 byte] By [tourtecha] at [2007-11-27 4:29:49]

# 1
You can shut down HADB gracefully by following these steps:
- List the databases using the "hadbm list" command. Note that Appserver creates one database per cluster, so there will be as many databases as appserver clusters.
- Execute "hadbm stop <dbname>" for each database.
- Then shut down MA by executing "/etc/init.d/ma-initd stop"
Note: The above steps can be easily scripted to automate the graceful shutdown.
- thava
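The steps above could be sketched as a small shell function. This is only an illustration under some assumptions that are not from the thread: that "hadbm list" prints one database name per line (if your version prints a header line, filter it out), and that any password or non-interactive options your installation needs are supplied through the hypothetical HADBM_OPTS variable. HADBM and MA_INIT are also just names chosen for this sketch.

```shell
#!/bin/sh
# Sketch: stop every HADB database, then stop the management agent (MA).
# HADBM, MA_INIT and HADBM_OPTS are hypothetical variable names for this example.
HADBM="${HADBM:-hadbm}"
MA_INIT="${MA_INIT:-/etc/init.d/ma-initd}"

stop_all_hadb() {
    # Assumption: "hadbm list" prints one database name per line.
    dbs=$("$HADBM" list ${HADBM_OPTS:-}) || return 1

    for db in $dbs; do
        echo "Stopping HADB database: $db"
        # Give up on the first failure so MA is not stopped under a live database.
        "$HADBM" stop ${HADBM_OPTS:-} "$db" || return 1
    done

    # Only stop the management agent once all databases are down.
    "$MA_INIT" stop
}
```

To use it, call stop_all_hadb at the end of the script. Note that this stops the whole database on all nodes, so it suits a planned full shutdown rather than the shutdown of a single machine.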
# 2
It is correct that you cannot (or should not) script the shutdown of an HADB database from one computer. Shutting down a database should be a rare event, and can thus be a manual task.
However, HADB is designed so that you should be able to shut down a machine with a single HADB node without corrupting the database. If properly scripted, the computer should automatically rejoin the database when it is restarted. You may stop the node before shutting down the computer, but it is not strictly required.
For best availability, do maintenance on one computer at a time, and make sure that the downtime is as short as possible. For improved availability, you may configure spare nodes that automatically take over when an active node is shut down. This requires two additional computers, though (one per DRU).
# 3
Thanks for the replies guys.
It looks like the safest thing to do is probably a combination of both: do manual shutdowns whenever scheduled maintenance is required, but as a safety net, script a basic database stop on each node in /etc/rcX.d.
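A rough sketch of what such an rc script might look like is below. It is not a tested recipe. It swaps the full "hadbm stop" for the node-level "hadbm stopnode" / "hadbm startnode" commands, so that shutting down one machine does not take the database down on the other. NODE_NO, DB_NAME, HADBM and HADBM_OPTS are hypothetical names and placeholder values for this example; the real node number and database name have to come from your own configuration.

```shell
#!/bin/sh
# Sketch of an rc script that stops/starts only the local HADB node.
# All variable names and default values below are placeholders for this example.
HADBM="${HADBM:-hadbm}"
NODE_NO="${NODE_NO:-0}"        # hypothetical: node number of the local HADB node
DB_NAME="${DB_NAME:-mydb}"     # hypothetical: database name

hadb_node_rc() {
    case "$1" in
        stop)
            # Stop only this node; the other node keeps the database running.
            "$HADBM" stopnode ${HADBM_OPTS:-} "$NODE_NO" "$DB_NAME"
            ;;
        start)
            # Bring this node back so it rejoins the running database.
            "$HADBM" startnode ${HADBM_OPTS:-} "$NODE_NO" "$DB_NAME"
            ;;
        *)
            echo "Usage: $0 {start|stop}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when an action was passed, as the rc framework does.
if [ "$#" -gt 0 ]; then
    hadb_node_rc "$@"
fi
```

Since hadbm talks to the management agent, the links in /etc/rcX.d would presumably need to be ordered so that this script's stop runs before "ma-initd stop" and its start runs after "ma-initd start".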
Really appreciate the feedback as this is one area where the Sun documentation is clearly lacking.
As a side-note, isn't it odd that both the ma-initd and appserver have scripts as opposed to SMF manifests for Solaris 10? Given the push behind Sol10 that Sun has at the moment, I would have thought this would be a given.