HADB corruption + recovery
We have a dual-node HADB installation that has become corrupted and refuses to be cleared or deleted, even after both servers were rebooted.
After the reboot, the ma (management agent) processes came back online on both nodes. The problem is that the database appears corrupted and the usual procedure to reinitialize (clear) it isn't working: both nodes complain that the clear can't be performed because the management agent is about to do repository recovery:
$ ./hadbm status test
hadbm:Error 22012: The management agent at host localhost is not ready to execute the operation, since it is about to do repository recovery. Please make sure that a majority of the management agents in the domain are running, and retry the operation later.
$ ./hadbm clear test
Please enter the password for the database system user:*********
Please retype the password for database system user:*********
WARNING: The --dbpassword option is deprecated since it is insecure. Using this option can compromise your password. Please use either the command prompt or the --dbpasswordfile option.
hadbm:Error 22012: The management agent at host localhost is not ready to execute the operation, since it is about to do repository recovery. Please make sure that a majority of the management agents in the domain are running, and retry the operation later.
Trouble is, the recovery -never- happens. The agents seem eternally stuck in this state, and I'm not sure what to do next. Do I need to stop the ma processes and then manually blow away the devices, history and configuration files? That seems like a pretty poor solution, or else a bug in HADB that should be fixed.
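To rule out simply not waiting long enough, the "retry the operation later" advice from the error message can be scripted. This is only a minimal sketch: wait_for_recovery is a hypothetical helper name, the try count and delay are arbitrary, and it assumes nothing about hadbm beyond the Error 22012 text shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of HADB): re-run a command until its output
# no longer contains "Error 22012", or give up after a number of tries.
# Arguments: <max_tries> <delay_seconds> <command> [args...]
wait_for_recovery() {
    max_tries=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$max_tries" ]; do
        # Capture stdout and stderr; ignore the exit status, we match on text.
        out=$("$@" 2>&1) || true
        case $out in
            *"Error 22012"*)
                # Agent still claims it is about to do repository recovery.
                ;;
            *)
                printf '%s\n' "$out"
                return 0
                ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "still getting Error 22012 after $max_tries tries" >&2
    return 1
}

# Example (assumes hadbm is in the current directory and the database is
# named "test", as in the transcript above):
#   wait_for_recovery 30 60 ./hadbm status test
```

If the function returns 1 after a long enough window, that at least confirms the agents are not making progress on their own.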
For reference, both nodes are running HADB 4.4.2-20 on Solaris 10 (SPARC).

