Disabling Alarms? - include group actions checkbox
What's the easiest way to tell if this is checked/unchecked? This is how we disable host-down notifications for specific boxes when we need downtime (if you have a better way, PLEASE let me know... other than renaming email.sh, breaking mail, or setting up a duplicate IP). The problem is that joe admin will uncheck the box and forget to re-enable it, and then we go a week or more before I come across it, unless I go through 170+ agents' menus every day. Is this info stored in the db, and if so, in which table(s)?
[506 byte] By [hurtnn] at [2007-11-26 9:12:47]

# 1
Haven't heard anything yet, so I thought I might give this a bump. Am I missing something simple in temporarily disabling host-down notifications, or am I going about it entirely the wrong way? I'm at the point where, with a large number of clients, it seems almost impossible to tell what is actually being monitored and notified upon for host up/down status and what is not. Please inform me otherwise.
# 2
Hi hurtnn ,
Well, this isn't a free way, but it is a better way: use EventAction:
http://www.halcyoninc.com/products/EventAction/index.php
It will give you one place to enable/disable all your notifications. Turning off a host is as simple as adding a "-hostname" to a dialog box. Then you can easily check all the systems turned off in one screen, or check for -hostname entries in the EventAction configuration file (HALEventAction.dat).
You don't have to believe me, since I work for the company :) . The link above lets you download a 30-eval version to try out... you can run it in parallel to your existing opt-out-of-group-actions workflow and see if it makes things easier.
Regards,
Mike.Kirk@HalcyonInc.com
http://www.HalcyonInc.com
# 3
I've recently created a case with Sun on this and it seems I'm back to square one. What I have got from them is that it's pretty much a manual process. In my opinion this puts this product on the chopping block when it comes to any type of scalability. Imagine trying to manage 1000, 500, or even 200 nodes with SunMC and no way of knowing what you will be notified upon without right-clicking on each agent on a regular basis. 200 right clicks/open menus? You can always go the route of never disabling them, but that destroys any type of action to take on an event, any type of integration with a ticket system, and the overall integrity of the alarms themselves. Remembering to re-enable them is not feasible either if you're going to provide any type of delegation of control.
Mike, I do appreciate the response though and I'm sure Event Action would probably do the job, but there is no way that I can go before my peers asking for that kind of money to purchase 2 licenses for add-on products to provide basic functionality that SunMC should already have by default (we have 2 SunMC environments). I would get laughed at and I can't say I would blame them.
I've pretty much concluded that if I am following the correct procedure to temporarily disable alarms (and I think I am, according to the documentation), then this was either missed or disregarded during the development of this product. I pray that this gets addressed in a future release. Again, I don't understand how nobody has run into this before... unless I am just totally in the dark here.
I'm still not going to give up looking for the solution to this though. Since the master server is made aware of the configuration change on the agent then I should be able to capture that somehow.
Has anyone else had experience with this issue?
# 4
I haven't got an answer to this question... but have you tried to do something with the CLI function of SunMC?
# 5
Yeah, if it is in the cli..I couldn't find it. I played around with it a bit when I originally started looking for a solution to this problem to no avail.
What I have found so far is this. Upon unchecking the checkboxes in the Alarm Action menu which are ....
"Include Group Actions"
[ ] Host Not Responding
[ ] Host Responding
[ ] Agent Not Responding
[ ] Agent Responding
this will make entries in `ls -t /var/opt/SUNWsymon/cfg/topology+view* | head -1`. So with that you can....
ls -t /var/opt/SUNWsymon/cfg/topology+view* | head -1 | xargs grep '"0"' | sed -e 's/^.*-\(.*\)).*/\1/' | sort | uniq
which will give you the entity_id of each host that has had an "Include Group Actions" box unchecked. Then just translate your entity_id and that will give you the hostname that notifications were just disabled on.
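To make it clearer what the filter chain above actually pulls out, here is a minimal sketch that runs the same grep/sed/sort/uniq steps against made-up input. The real layout of the topology+view* files is not shown in this thread, so the sample lines (and the names alarmaction, hostdown, etc.) are purely an assumption chosen so that the entity_id sits between the last "-" and the closing ")".

```shell
#!/bin/sh
# HYPOTHETICAL sample lines: the real topology+view* format is not shown in
# the thread; the shape '...-<entity_id>) ... "0"' is an assumption.
sample='alarmaction(hostdown-1234) = "0"
alarmaction(hostup-1234) = "0"
alarmaction(hostdown-5678) = "1"
alarmaction(agentdown-9012) = "0"'

# Same filter chain as in the post, reading stdin instead of the real file:
#   grep '"0"'  keeps only the lines carrying an unchecked ("0") value
#   sed         keeps the text between the last '-' and the last ')'
#   sort | uniq collapses the several entries that belong to one entity_id
extract_ids() {
    grep '"0"' | sed -e 's/^.*-\(.*\)).*/\1/' | sort | uniq
}

ids=$(printf '%s\n' "$sample" | extract_ids)
printf '%s\n' "$ids"
```

With this sample, 5678 is dropped because its value is "1", and 1234 appears once even though two of its boxes are unchecked.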
This may not be a feasible workaround at all, since I don't know exactly how the topology+view* files work. It seems like they are rotated each day as well... not sure though, since I just discovered this yesterday. I'm going to work on this a bit and eventually throw it into a webpage. The rotating logs may cause issues though... dunno at this point.
If anyone knows of a better way please let me know.
# 6
Copy of email sent to sun...
"I found a "working" solution to this problem. I discovered that when the include group boxes are checked/unchecked, an entry is made in a corresponding /var/opt/SUNWsymon/cfg/topology+view* file on the master server. Armed with this information, I created a cron job that runs a script which monitors those files for changes and retrieves the entity_id of the host. The script then simply translates the IDs to hostnames and writes them out to the messages file, where the SunMC agent can pick them up (and alarm on them). This process seems to be working OK so far.
(depending on database consistency it may require /usr/bin/head after the first | if there are duplicate entries in the database. One of our masters required it...the other did not)
ls -t /var/opt/SUNWsymon/cfg/topology+view* | xargs grep '"0"' | sed -e 's/^.*-\(.*\)).*/\1/' | sort | uniq"
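One way the "monitors those files for changes" part could be done is to save the ID list from the previous run and compare it with the current one. The sketch below is only an illustration of that idea, not the script from the email: STATE_DIR, the file names, and the current_ids stand-in (which takes the place of the real ls/grep/sed pipeline) are all hypothetical.

```shell
#!/bin/sh
# Sketch: report entity_ids that are newly disabled since the previous run.
# STATE_DIR is a throwaway directory here; a real cron job would need a
# persistent location (hypothetical, not from the original post).
STATE_DIR=$(mktemp -d)
prev="$STATE_DIR/disabled.prev"
curr="$STATE_DIR/disabled.curr"

# Stand-in for the ls | xargs grep | sed | sort | uniq pipeline; it just
# prints its arguments as a sorted, de-duplicated list.
current_ids() { printf '%s\n' "$@" | sort -u; }

report_new() {
    # comm -13 prints lines that appear only in the second (current) file.
    comm -13 "$prev" "$curr"
    # Remember the current list for the next run.
    cp "$curr" "$prev"
}

: > "$prev"                              # first run: nothing seen yet
current_ids 1234 9012 > "$curr"
first=$(report_new)
current_ids 1234 9012 5678 > "$curr"     # later run: 5678 got unchecked
second=$(report_new)

echo "first run: $first"
echo "second run: $second"
rm -rf "$STATE_DIR"
```

Swapping the arguments to comm -23 would list IDs that were re-enabled since the last run instead.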
-
I'm in the process of closing this case with Sun (if it isn't already), but I thought I would reply to this thread one more time with a semi-working solution. There are still a couple of flaws, btw.
#!/usr/bin/ksh
# Collect the entity_ids with an unchecked "Include Group Actions" box,
# taken from the 20 most recent topology+view* files.
set -A varArray `ls -t /var/opt/SUNWsymon/cfg/topology+view* | head -20 | xargs grep '"0"' | sed -e 's/^.*-\(.*\)).*/\1/' | sort | uniq`
varMesg="Servers with notifications disabled - "
# Nothing unchecked, nothing to report.
if [ ${#varArray[*]} -lt 1 ] ; then
exit 0
fi
# Translate each entity_id to a hostname via the eid_trans lookup file.
i=0
while [ $i -lt ${#varArray[*]} ] ; do
varMesg="$varMesg `grep "${varArray[$i]}=" /usr/local/scripts/eid_trans | awk -F= '{print $2}'`"
((i=i+1))
done
# 38 characters is the bare prefix plus newline, i.e. no hostname was resolved.
if [ `print $varMesg | wc -m` = 38 ] ; then
exit 0
else
#print $varMesg
logger -p daemon.notice $varMesg -
fi
exit
eid_trans is in the form of
eid=hostname
eid=hostname
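One thing to watch with a lookup file in this form is that an unanchored grep for "12=" would also hit an entry such as "112=...". A minimal sketch of an exact-match lookup is below; the function name eid_to_host and the IDs/hostnames in the sample file are made up for illustration. It uses awk -v, which older Solaris /usr/bin/awk may not accept (nawk should).

```shell
#!/bin/sh
# Throwaway lookup file in the eid=hostname form; IDs and hostnames are made up.
map=$(mktemp)
cat > "$map" <<'EOF'
12=alpha
112=bravo
1234=charlie
EOF

# Print the hostname whose entity_id matches $1 exactly.
# Appending "" forces a string comparison, so 012 and 12 stay distinct.
eid_to_host() {
    awk -F= -v id="$1" '($1 "") == (id "") { print $2 }' "$2"
}

host=$(eid_to_host 12 "$map")
echo "exact match for 12: $host"

# For comparison: an unanchored grep also picks up 112=bravo.
loose=$(grep '12=' "$map" | awk -F= '{print $2}' | tr '\n' ' ')
echo "unanchored grep for 12=: $loose"
rm -f "$map"
```

Anchoring the grep pattern with a leading ^ would be another way to get the same effect.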
--
A new row was added to each master's agent to scan the /var/adm/messages file for Servers.*-
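As a quick sanity check of that pattern, the sketch below runs Servers.*- through grep against two made-up log lines. The timestamp, host name, and message ID in the samples are hypothetical; only the "Servers with notifications disabled - ... -" text comes from the script above. How the agent itself applies the pattern is not covered here.

```shell
#!/bin/sh
# HYPOTHETICAL syslog lines; timestamp, host, and message ID are made up.
line='Nov 26 09:15:00 master root: [ID 702911 daemon.notice] Servers with notifications disabled - hosta hostb -'
quiet='Nov 26 09:16:00 master sshd[123]: Accepted publickey for root'

# Count how many lines of the argument match the scan pattern.
matches() { printf '%s\n' "$1" | grep -c 'Servers.*-'; }

hit=$(matches "$line")
miss=$(matches "$quiet")
echo "alert line matches: $hit, unrelated line matches: $miss"
```

Note that the pattern would also match the bare prefix with no hostnames after it, which is presumably why the script exits before calling logger when nothing was resolved.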
