spam
hello i have:
Sun Java(tm) System Messaging Server 6.2-6.01 (built Apr 3 2006)
libimta.so 6.2-6.01 (built 11:20:35, Apr 3 2006)
SunOS mail 5.9 Generic_118558-21 sun4u sparc SUNW,Ultra-60
some users get a LOT more spam than others - is there something that can be done in the JES to increase the sensitivity of the spamfilter on a per user basis?
do you recommend lowering the threshold for everyone ... to lower than 5?
i am using clamav w/ clamsmtpd and libspamass.so integration.
also what is JES 6? is it available for testing? when will the libclamav integration be available in a patch to the public on sunsolve?
thanks,
s7
[710 byte] By [starman7] at [2007-11-26 10:35:55]

# 1
First, Messaging Server does no spam scanning itself. Whatever scanner you're using is where you'd change settings. I find that some of my users get much more spam than others, but the percentage of what I get vs what comes in the door is pretty consistent. I do train SpamAssassin, though, and I upgrade it pretty routinely, as new versions come out.
The SpamAssassin folks are experimenting with an OCR (Character recognition) module, to combat image-only spams. I've not tried this, myself, yet.
Ah. You're using SA, too. What version? Do you train it? Do you have any additional spam detection modules loaded?
The current version, Messaging 6.2, is also known as "jes4", 2005q4. Jes 5 is due out around the end of this calendar year. It's in beta, now, but the beta is now closed to new testers.
JES6 is future. . .
# 2
thanks jay - i'm using a pretty recent version of SA, i upgraded it maybe 2-weeks ago (along with a new clamav engine). i don't train it, although i believe it is auto-learning by default.
what additional add-ons or tweaks might i try? or do you recommend ...
any idea if JES will support per/user SA attributes (like individual thresholds) in a user's LDAP record?
is there any decent way to do this now?
do you leave the threshold at 5?
i have seen jes6 referenced here recently.
# 3
Per-user handling can only be used with Messaging 6 and later, and only by reading the header that gets written from SA's response to the server. Sending a message through SA once for each addressee in a list would cause unacceptable performance.
You might want to hit the "Rules Du Jour" website, and get some more rules.
Also, "auto train" isn't something I've actually ever seen SA do. I manually trained it with ham (at least 500 such), and feed it un-detected spam, pretty much daily. That helps.
More rules help, too.
# 4
thanks for the rules du jour info.
what does your rules du jour config look like?
what rules are the most effective/changing, and benefit from daily updates?
initially i included all in trusted_rules but SA ground to a halt.
thanks,
s7
# 5
I only added a few rules, ones that looked interesting. Some of the rules were for old versions of SA, and are now redundant. Don't use "big evil", as it will cause SA to grind to a halt.
Strongly recommend joining the SA newsgroup. Lots of good information there.
# 6
Hi,
Tuning SpamAssassin to not consume too many resources can take a fair bit of time and, like all tuning efforts, it should be done slowly rather than just turning on a whole bunch of rules at once.
The secret is to disable all those rules which contribute very little but cost a lot. These can include most of the RBL checks (if you have them enabled).
RBL checks due to their dynamic nature are VERY good at improving Spam matching but they are also expensive (CPU/delay wise) due to the number of DNS checks.
Ensure that you have a local caching name-server daemon running on your system to reduce the DNS lookup impact.
With regards to which rules to include, I can recommend the following :
SARE_HTML0
SARE_HEADER0
SARE_SPECIFIC
SARE_ADULT
SARE_GENLSUBJ0
SARE_URI0
SARE_OBFU0
SARE_STOCKS
You will note that I only include the "0" variant; that is because these are rules that hit a lot of spam and no ham. Try these out and see how it goes.
Regards,
Shane.
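For anyone who wants to check which rules "hit a lot of spam and no ham" on their own mail, here is a minimal Python sketch (not from the posts above). It assumes you have already pulled the X-Spam-Status header values out of a hand-sorted spam pile and a ham pile. The function names rules_in_status and rank_rules are made up for illustration, and the header layout (a tests= field holding comma-separated rule names) is the usual SpamAssassin one, but check it against your own headers.

```python
import re
from collections import Counter


def rules_in_status(header):
    """Return the rule names listed in one X-Spam-Status header value."""
    # Folded headers put whitespace after a comma; join those pieces back up.
    unfolded = re.sub(r",\s+", ",", header)
    match = re.search(r"tests=(\S*)", unfolded)
    if not match:
        return []
    return [r for r in match.group(1).split(",") if r and r.lower() != "none"]


def rank_rules(spam_headers, ham_headers):
    """Return (rule, spam_hits, ham_hits) rows, best candidates first.

    Rules with the fewest ham hits come first; ties are broken by
    the most spam hits, then by rule name.
    """
    spam_hits = Counter(r for h in spam_headers for r in rules_in_status(h))
    ham_hits = Counter(r for h in ham_headers for r in rules_in_status(h))
    rows = [(rule, spam_hits[rule], ham_hits[rule])
            for rule in set(spam_hits) | set(ham_hits)]
    rows.sort(key=lambda row: (row[2], -row[1], row[0]))
    return rows
```

Rules near the top of the list with zero ham hits are the ones worth keeping; rules that hit ham, or hit almost nothing, are candidates for dropping. A small sample can mislead, so treat the ranking as a hint only.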
# 7
Following on from Jay's comments, the newsgroups are a very good resource for new rules/new ideas on spam matching methods etc.
You can find archives of the newsgroups at the following:
http://wiki.apache.org/spamassassin/MailingLists
I personally only look at the Users list.
Regards,
Shane.
# 8
Thank you both for your suggestions. Here are my trusted rulesets:
i added yours, and kept the differing ones from my original config (i got from a how-to) - other than tripwire, the differing ones differ only in their level e.g. uri0, uri1, etc.
TRUSTED_RULESETS="TRIPWIRE SARE_HTML0 SARE_HEADER0 SARE_SPECIFIC SARE_ADULT SARE_GENLSUBJ0 SARE_URI0 SARE_URI1 SARE_OBFU0 SARE_OBFU1 SARE_STOCKS"
it seems to work pretty well. Shane, do you also update regularly with rules_du_jour? My script seems to fail at restarting SA - i get an email with 'command spamassassin not found.' i edited /etc/default/cron and added:
SUPATH=$PATH:/usr/local/bin
but it doesn't seem to help.
the rules get copied to /etc/mail/spamassassin and /etc/mail/spamassassin/RulesDuJour/
in any event, on the caching dns - do you recommend dnscache from djbdns? would i also need to install tinydns - or is dnscache enough? how will i notice the improvement? i currently tail mail.log_current.
w/ dns caching, will i notice things scrolling by quicker?
thanks,
s7
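On the 'command spamassassin not found' mail from cron: cron jobs typically run with a much shorter PATH than a login shell, so a quick check is to ask which commands are visible under a given search path. The Python sketch below is only an illustration. The function name missing_commands is made up, and the search path in the usage note is an assumption you should replace with whatever PATH your cron job really gets.

```python
import shutil


def missing_commands(names, search_path):
    """Return the command names that cannot be found on search_path.

    search_path is a PATH-style string, e.g. "/usr/sbin:/usr/bin".
    """
    return [name for name in names
            if shutil.which(name, path=search_path) is None]
```

For example, missing_commands(["spamassassin", "spamd"], "/usr/sbin:/usr/bin") lists whichever of the two is not reachable with that PATH. If a command is reported missing, one option is to set PATH explicitly at the top of the script that cron runs, rather than relying on cron's defaults.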
# 9
For starting SA, what most of us use is "spamd", not "spamassassin". The "spamassassin" command is used for scanning a single message. "spamd" is the daemon we connect to.
# 10
yes - my rulesdujour config file tries to restart it with kill -HUP spamd - but, in addition to the other rules update notification emails, i still also receive an email about rules_du_jour not finding the spamassassin command
i've setup a local dnscache (no tinydns) using djbdns on the mail server
edited my resolv.conf to use only nameserver 127.0.0.1 and restarted the mail servers -
what kind of performance increase should i see? how can i see/quantify the increase ...
would i definitely see an increase? is it conceivable that my previous, non-local resolvers would be faster?
thanks, s7
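One rough way to put a number on the DNS question is to time the same batch of lookups more than once. With a local cache, the first round is "cold" and later rounds should be answered from the cache. The Python sketch below is only an illustration under that assumption: time_lookups is a made-up name, the resolver and clock are parameters so they can be swapped out, and the OS or other caches may blur the result.

```python
import socket
import time


def time_lookups(names, resolve=socket.gethostbyname, rounds=2,
                 clock=time.perf_counter):
    """Resolve every name in `names` `rounds` times.

    Returns a list with the elapsed seconds for each round, so the
    first (cold) round can be compared with later (cached) rounds.
    """
    elapsed = []
    for _ in range(rounds):
        start = clock()
        for name in names:
            try:
                resolve(name)
            except OSError:
                # A failed lookup (e.g. "not listed" on a blacklist zone)
                # is a normal answer here and still costs a round trip.
                pass
        elapsed.append(clock() - start)
    return elapsed
```

Calling time_lookups with a list of hostnames taken from recent mail, once with resolv.conf pointing at the old resolvers and once at 127.0.0.1, gives two sets of timings to compare. If the later rounds are not clearly faster than the first, the cache is probably not the bottleneck.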
# 11
Hi,
> i added yours, and kept the differing ones from my
> original config (i got from a how-to) - other than
> tripwire, the differing ones differ only in their
> level e.g. uri0, uri1, etc.
I tried out the tripwire ruleset when I was reviewing the various rulesets. I found it to be confusing and prone to false-positives - and fairly expensive to evaluate - not to mention clogging up the spam logs. As always - results may vary.
> Shane do you also
> update regularly with rules_du_jour?
I never trust automated cron-based spam updates: too much chance of something going wrong and causing all sorts of mess - one broken ruleset can ruin your day.
The systems I was running processed ~450K emails per day, so I had to be careful not to introduce a changed ruleset that upset the balance too much. Better to just check for new rulesets every few weeks unless there happens to be a new flood of Spam - in which case I would review the SpamAssassin mailing lists for hints/tips/rule update notices.
I would also run sa-update every few weeks just in case there were a few fixes to the base rulesets. I would always run this in a dev environment first though.
> in any event, on the caching dns - do you recommend
> dnscache from djbdns? would i also need to install
> tinydns - or is dnscache enough. how will i notice
> the improvement? i currently tail mail.log_current.
> w/ dns caching, will i notice things scrolling by
> quicker?
I actually used bind9 configured to act as a caching proxy only - I haven't used the other ones you referred to.
We were also using rbldnsd to keep a cache of DNS BlackList information from various sources to speed up lookups - which required the use of bind9 to redirect requests to the local caching repository. I wouldn't bother going to this extent unless you are processing > 100K emails / day.
You will not necessarily notice massive improvements. What the DNS caching will help with is floods of email and any slowdown in your DNS servers - plus the admins of the dns systems will thank you for the reduced traffic. Reduce/reuse/recycle - good for the environment, good for DNS lookups, good for your messaging server install ;)
Shane.
# 12
thanks for the info and suggestions - i didn't know there was an sa-update. (needs gpg though). good point about cron jobs which depend on third party sites. not that i got rules_du_jour to work from cron anyway.
as for getting a new flood of spam - it's sort of happening, mostly with images - with little specks in them (like the paper US currency is printed on ...). the specks in the image seem to be randomly animated ... i understand this makes them harder to checksum (and therefore harder for SA to recognize).
again several users are still getting flooded by spam much more than others. other than lowering the threshold for everyone, training spam, and updating rules, is there anything else that can be done for these folks?
btw - dnscache from djbdns is pretty cool - very light weight, easy to install, seemingly optimized/performant code, and supposedly more secure than bind.
s7
# 13
(just happened to find this thread when searching "rules du jour")
For what it's worth, RDJ will never leave you hanging with syntactically broken rulesets, as it always runs spamassassin --lint first and rolls back all changes if that fails. However, it obviously can't stop an inefficient rule from slowing your system down.
Chris T.
# 14
Hi,
Assuming that the RDJ script works correctly when it hits a --lint failure, then you are correct, it shouldn't allow you to use a broken ruleset. As the spam rulesets in question aren't updated that often (maybe once or twice a month at most), running the script manually is no big inconvenience.
I usually run a few sample ham/spam emails through the new rules to verify that some badly written rule isn't causing performance/score issues.
Regards,
Shane.
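For a manual update, the "lint first, roll back on failure" idea discussed above can be sketched in a few lines of Python. This is not RDJ's actual code, just an illustration of the same pattern. The names lint_ok and install_rules are made up, and it assumes spamassassin --lint exits non-zero when the rules fail to parse.

```python
import shutil
import subprocess
from pathlib import Path


def lint_ok(cmd=("spamassassin", "--lint")):
    """Return True if the lint command runs and exits with status 0."""
    try:
        result = subprocess.run(list(cmd), stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        # Command not on PATH: treat as a failed check so we roll back.
        return False
    return result.returncode == 0


def install_rules(new_file, rules_dir, lint=lint_ok):
    """Copy new_file into rules_dir, keeping it only if lint() passes.

    Returns True if the new file was kept, False if it was rolled back.
    """
    new_file = Path(new_file)
    target = Path(rules_dir) / new_file.name
    backup = target.with_name(target.name + ".bak")
    had_old = target.exists()
    if had_old:
        shutil.copy2(target, backup)   # save the current ruleset
    shutil.copy2(new_file, target)     # put the new ruleset in place
    if lint():
        return True
    # Lint failed: restore the previous state.
    if had_old:
        shutil.copy2(backup, target)
    else:
        target.unlink()
    return False
```

A passing lint only says the rules parse. It says nothing about speed or scoring, so running a few sample ham/spam messages through the new rules afterwards is still worthwhile.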