Mailing List Archive: SpamAssassin: users

spam score not counted correctly

 

 



benedict.verheyen at gmail

Oct 8, 2008, 3:03 AM

Post #1 of 30 (455 views)
spam score not counted correctly

Hi,

I'm using Debian stable and SpamAssassin v3.2.3.
Recently I noticed a few spam mails getting through although their
combined scores should be high enough.
The email is flagged as not being spam: the score is set to 3.9 but
should actually be much higher.
I also encountered something similar when the result of one of the tests
was "nan"; the score was a string instead of a number, and that also
resulted in a spam message being flagged as not spam.

Here is the header report:
X-Spam-Flag: NO
X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on loki.x.y
X-Spam-Level:
X-Spam-Status: "No, score=3.9 required=4.0 tests=FORGED_HOTMAIL_RCVD2,
FORGED_MUA_AOL_FROM,FROM_ILLEGAL_CHARS,MIME_BOUND_DD_DIGITS,MISSING_MIMEOLE,
RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_NJABL_RELAY,SPF_SOFTFAIL,
SUBJECT_NEEDS_ENCODING,SUBJ_ILLEGAL_CHARS,UNPARSEABLE_RELAY autolearn=no
version=3.2.3
X-Spam-Report: * 4.2 MIME_BOUND_DD_DIGITS Spam tool pattern in MIME boundary
* 4.0 FROM_ILLEGAL_CHARS From: contains too many 'raw' characters
* 0.7 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail)
* 1.5 SUBJ_ILLEGAL_CHARS Subject: contains too many 'raw' characters
* 1.1 FORGED_HOTMAIL_RCVD2 hotmail.com 'From' address, but no 'Received:'
* 0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
* 1.8 RCVD_IN_NJABL_RELAY RBL: NJABL: sender is a confirmed open
* relay
* [58.211.230.39 listed in combined.njabl.org]
* 4.0 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay found
* in bl.spamcop.net
* [Blocked - see <http://www.spamcop.net/bl.shtml?58.10.84.108>]
* 1.3 SUBJECT_NEEDS_ENCODING SUBJECT_NEEDS_ENCODING
* 1.3 FORGED_MUA_AOL_FROM Forged mail, pretends to be from
* AOL (via From)
* 0.0 MISSING_MIMEOLE Message has an X-MSMail-Priority, but no
* X-MimeOLE


I ran the message through spamassassin again with the -D flag, and this
is what I got. Notice the nan score now. Maybe that score is the reason
why adding up the scores didn't work?

[32465] dbg: learn: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1
[32465] dbg: learn: auto-learn: message score: nan, computed score for autolearn: 18.646
[32465] dbg: learn: auto-learn? ham=0.1, spam=12, body-points=18.646, head-points=18.646, learned-points=3.5
[32465] dbg: learn: auto-learn? no: scored as ham but autolearn wanted spam
[32465] dbg: check: is spam? score=nan required=4
...
X-Spam-Flag: NO
X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on loki.x.y
X-Spam-Level:
X-Spam-Status: "No, score=3.9 required=4.0 tests=BAYES_99,
DNS_FROM_SECURITYSAGE,FORGED_HOTMAIL_RCVD2,FORGED_MUA_AOL_FROM,
FROM_ILLEGAL_CHARS,MIME_BOUND_DD_DIGITS,MISSING_MIMEOLE,RCVD_IN_NJABL_RELAY,
SPF_SOFTFAIL,SUBJECT_NEEDS_ENCODING,SUBJ_ILLEGAL_CHARS,UNPARSEABLE_RELAY
autolearn=no version=3.2.3
X-Spam-Report:
* 3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
* [score: 1.0000]
* 1.5 MIME_BOUND_DD_DIGITS Spam tool pattern in MIME boundary
* nan FROM_ILLEGAL_CHARS From: contains too many 'raw' characters
* 0.6 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail)
* 1.6 SUBJ_ILLEGAL_CHARS Subject: contains too many 'raw' characters
* 1.5 FORGED_HOTMAIL_RCVD2 hotmail.com 'From' address, but no 'Received:'
* 1.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
* 2.7 RCVD_IN_NJABL_RELAY RBL: NJABL: sender is a confirmed open
* relay
* [58.211.230.39 listed in combined.njabl.org]
* 2.5 DNS_FROM_SECURITYSAGE RBL: Envelope sender in
* blackholes.securitysage.com
* 0.5 SUBJECT_NEEDS_ENCODING SUBJECT_NEEDS_ENCODING
* 3.3 FORGED_MUA_AOL_FROM Forged mail, pretends to be from
* AOL (via From)
* 1.0 MISSING_MIMEOLE Message has an X-MSMail-Priority, but no
* X-MimeOLE

Why is the score only at 3.9 and thus not flagged as spam?

Thanks,
Benedict
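
For illustration, here is a minimal Python sketch (not SpamAssassin's Perl code) of how a single nan hit could lead to a "No" verdict. Under IEEE 754 arithmetic, any sum that includes NaN is NaN, and every ordering comparison against NaN is false, which would be consistent with the debug line "is spam? score=nan required=4". The scores are copied from the second report above; the variable names are illustrative assumptions.

```python
import math

# Per-rule scores from the second X-Spam-Report above (FROM_ILLEGAL_CHARS is nan).
hits = {
    "BAYES_99": 3.5,
    "MIME_BOUND_DD_DIGITS": 1.5,
    "FROM_ILLEGAL_CHARS": float("nan"),
    "SPF_SOFTFAIL": 0.6,
    "SUBJ_ILLEGAL_CHARS": 1.6,
    "FORGED_HOTMAIL_RCVD2": 1.5,
    "UNPARSEABLE_RELAY": 1.0,
    "RCVD_IN_NJABL_RELAY": 2.7,
    "DNS_FROM_SECURITYSAGE": 2.5,
    "SUBJECT_NEEDS_ENCODING": 0.5,
    "FORGED_MUA_AOL_FROM": 3.3,
    "MISSING_MIMEOLE": 1.0,
}
required = 4.0  # threshold from the X-Spam-Status header

total = sum(hits.values())   # NaN propagates through addition
is_spam = total >= required  # any ordering comparison with NaN is False

# Sum of the hits that do have a usable score.
finite_total = sum(s for s in hits.values() if not math.isnan(s))

print(f"total={total} is_spam={is_spam} finite_total={finite_total:.1f}")
```

Leaving out the nan hit, the remaining scores add up to roughly 19.7, well above the required 4.0.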


benedict.verheyen at gmail

Oct 10, 2008, 4:20 AM

Post #2 of 30 (426 views)
Re: spam score not counted correctly [In reply to]

Benedict Verheyen wrote:
<snip>

I want to reply to my previous message with some more info but i'm not
able to do so, my messages keep getting flagged as spam.
Very annoying, first spamassassin doesn't work like it should here and i
can't even ask for help now :)
Who do i contact to solve the issue of mailing to the list?
I would hopefully then be able to post a message to the list again.

Benedict


mkettler_sa at verizon

Oct 10, 2008, 4:42 AM

Post #3 of 30 (425 views)
Re: spam score not counted correctly [In reply to]

Benedict Verheyen wrote:
> Benedict Verheyen wrote:
> <snip>
>
> I want to reply to my previous message with some more info but i'm not
> able to do so, my messages keep getting flagged as spam.
> Very annoying, first spamassassin doesn't work like it should here and i
> can't even ask for help now :)
> Who do i contact to solve the issue of mailing to the list?
> I would hopefully then be able to post a message to the list again.
>
>

You're certainly not the first one to ask, but that's unfortunately
unchangeable. SpamAssassin is one of many different projects on apache's
mail servers, and they all get the same treatment.

If you need to post a spam sample, use pastebin or a similar web-based
storage and email a link to it.


benedict.verheyen at telenet

Oct 10, 2008, 4:48 AM

Post #4 of 30 (425 views)
Re: spam score not counted correctly [In reply to]

Benedict Verheyen wrote:
> Benedict Verheyen wrote:
> <snip>
>
> I want to reply to my previous message with some more info but i'm not
> able to do so, my messages keep getting flagged as spam.
> Very annoying, first spamassassin doesn't work like it should here and i
> can't even ask for help now :)
> Who do i contact to solve the issue of mailing to the list?
> I would hopefully then be able to post a message to the list again.

Hi,


Hopefully this message arrives correctly, as it was apparently first
marked as spam.

I reinstalled SpamAssassin (from backports, version 3.2.5-1) and tried
spamassassin -D again with another spam message, but the result is the
same: again a "nan" score, and a clear spam message not being
marked as spam.

I found bug #3364 in the bug list, and according to it this seems to be a
Debian issue. It doesn't seem to occur on other systems, or at least it's
not reproducible.

The URIBL black rule is scored as nan again.
It's really annoying, as this is probably what is causing the score not
to be counted correctly.

I tried spamassassin in debug mode with this message a few times and
suddenly it was flagged as spam. Maybe some custom tests that are added
to Debian are responsible for this, or maybe when a test fails it puts
"nan" as the score?

I can also try to add a check in my mailfilter file to test for a nan
score and move such messages to the spam directory.

Regards,
Benedict
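
As a rough illustration of the mailfilter idea above, the following Python sketch (not maildrop filter syntax, and not part of SpamAssassin) looks for report lines whose score is nan. It assumes the report is stored in an X-Spam-Report header laid out like the one in the first post; the function name `nan_rules` and the sample message are hypothetical.

```python
import re
from email import message_from_string

# Matches a report line such as "* nan FROM_ILLEGAL_CHARS ..." at the start of a line.
NAN_HIT = re.compile(r"^\s*\*\s+nan\s+(\S+)", re.IGNORECASE | re.MULTILINE)


def nan_rules(raw_message: str) -> list:
    """Return the rule names that the X-Spam-Report header lists with a nan score."""
    msg = message_from_string(raw_message)
    report = msg.get("X-Spam-Report", "")
    return NAN_HIT.findall(report)


# Made-up sample; rule descriptions are elided.
sample = (
    "X-Spam-Flag: NO\n"
    "X-Spam-Report: * 3.5 BAYES_99 BODY: ...\n"
    "\t* nan FROM_ILLEGAL_CHARS From: ...\n"
    "\t* 0.6 SPF_SOFTFAIL SPF: ...\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(nan_rules(sample))
```

A filter could then move any message for which this list is non-empty to the spam folder for later inspection.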


Mark.Martinec+sa at ijs

Oct 10, 2008, 7:45 AM

Post #5 of 30 (422 views)
Re: spam score not counted correctly [In reply to]

Benedict,

> I found bug # 3364 in the buglist and according to this it seems like a
> Debian issue. It doesn't seem to occur on other systems or at least it's
> not reproducable.
>
> The uri bl black is scored as nan again.
> It's really annoying as this is what probably is causing the score not
> to be counted correctly.


Please try the following patch (to 3.2.5).
It should produce a warning on stderr when some plugin
would attempt to add a NaN to score:

--- lib/Mail/SpamAssassin/PerMsgStatus.pm (revision 703484)
+++ lib/Mail/SpamAssassin/PerMsgStatus.pm (working copy)
@@ -2141,6 +2138,12 @@
     return;
   }
 
+  # this should not happen; warn about NaN
+  if ($score != $score) {
+    warn "rules: score '$score' for rule '$rule' in '$area' '$desc'";
+    return;
+  }
+
   # Add the rule hit to the score
   $self->{score} += $score;



Mark
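
The test `$score != $score` in the patch relies on NaN being the only floating-point value that compares unequal to itself. A minimal Python sketch of the same guard follows; the class `MessageScore` and its method names are illustrative assumptions, not SpamAssassin's API.

```python
import warnings


class MessageScore:
    """Toy accumulator standing in for the per-message score (illustrative only)."""

    def __init__(self) -> None:
        self.score = 0.0
        self.hits = []

    def add_hit(self, rule: str, score: float) -> None:
        # NaN is the only float that is unequal to itself, so this is the
        # same test as `$score != $score` in the patch.
        if score != score:
            warnings.warn(f"rules: score {score!r} for rule {rule!r} ignored")
            return
        self.score += score
        self.hits.append(rule)


status = MessageScore()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    status.add_hit("BAYES_99", 3.5)
    status.add_hit("FROM_ILLEGAL_CHARS", float("nan"))
    status.add_hit("FORGED_MUA_AOL_FROM", 3.3)

print(status.score, status.hits, len(caught))
```

With the guard in place the NaN hit is reported and skipped, so the running total stays a finite number.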


guenther at rudersport

Oct 10, 2008, 8:17 AM

Post #6 of 30 (422 views)
Re: spam score not counted correctly [In reply to]

On Fri, 2008-10-10 at 16:45 +0200, Mark Martinec wrote:
> Benedict,
>
> > I found bug # 3364 in the buglist and according to this it seems like a
> > Debian issue. It doesn't seem to occur on other systems or at least it's
> > not reproducable.

Do you use customized headers? (Sorry, don't have the OP, but IIRC I
spotted some.) What are the results of the snippets in comment 4, and
what about comment 11?


> > The uri bl black is scored as nan again.
> > It's really annoying as this is what probably is causing the score not
> > to be counted correctly.
>
>
> Please try the following patch (to 3.2.5).
> It should produce a warning on stderr when some plugin
> would attempt to add a NaN to score:
>
> --- lib/Mail/SpamAssassin/PerMsgStatus.pm (revision 703484)
> +++ lib/Mail/SpamAssassin/PerMsgStatus.pm (working copy)
> @@ -2141,6 +2138,12 @@
> return;
> }
>
> + # this should not happen; warn about NaN
> + if ($score != $score) {
> + warn "rules: score '$score' for rule '$rule' in '$area' '$desc'";
> + return;
> + }
> +
> # Add the rule hit to the score
> $self->{score} += $score;

Nice. :)

Just thinking out loud, if Benedict would add another line in there, to
force the NaN score to be 0.01, he also could *temporarily* work around
the issue. Mail exceeding the spam threshold then should not slip by
uncaught. Of course, identifying the offending mail for later
investigation would be slightly harder, but less obtrusive for the user,
if he gets a lot of these.

Puzzling, how he gets NaN in the first place. Benedict, did you lint
your rc files? Did you carefully check all score definitions (possibly
including user_prefs) for that rule? (Like using a wrong decimal
separator, or some invisible stray chars.)

guenther
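
The temporary workaround described above could look roughly like this in Python (a sketch only; in SpamAssassin the change would go into the patched Perl subroutine). The helper name `sanitize` and the sample scores are assumptions; the 0.01 fallback is the value suggested in the post.

```python
import math

FALLBACK_SCORE = 0.01  # placeholder value suggested in the post


def sanitize(score: float) -> float:
    """Return a small fallback in place of NaN so the running total stays finite."""
    return FALLBACK_SCORE if math.isnan(score) else score


scores = [3.5, float("nan"), 1.6, 3.3]  # made-up hits, one of them NaN
total = sum(sanitize(s) for s in scores)
print(total >= 4.0)
```

The trade-off mentioned in the post still applies: silently substituting a value hides which rule produced the NaN unless a warning is logged as well.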


--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Mark.Martinec+sa at ijs

Oct 10, 2008, 10:02 AM

Post #7 of 30 (419 views)
Re: spam score not counted correctly [In reply to]

Guenther wrote:
> Do you use customized headers? (Sorry, don't have the OP, but IIRC I
> spotted some.) What are the results of the snippets in comment 4, and
> what about comment 11?

That question is for Benedict, I suppose.

> Puzzling, how he gets NaN in the first place. Benedict, did you lint
> your rc files? Did you carefully check all score definitions (possibly
> including user_prefs) for that rule? (Like using a wrong decimal
> separator, or some invisible stray chars.)

That's puzzling for me too. As far as I can tell, a NaN can only happen
as a result of floating point arithmetic (like (-3)**0.5), or when
directly specified in Perl code, e.g. $a = NaN; I don't think it can
result from simple string conversions and the like.

Catching which rule or plugin is trying to add it would help narrow
down the cause. I hope all score additions go through the now-instrumented
subroutine; otherwise someone more knowledgeable in SA internals may
indicate which additional code paths should add a test for a NaN.

Mark


benedict.verheyen at telenet

Oct 13, 2008, 2:35 AM

Post #8 of 30 (382 views)
Re: spam score not counted correctly [In reply to]

Guenther wrote:
>> Do you use customized headers? (Sorry, don't have the OP, but IIRC I
>> spotted some.) What are the results of the snippets in comment 4, and
>> what about comment 11?
>>
>
> A question is for Benedict I suppose.
>
>
>> Puzzling, how he gets NaN in the first place. Benedict, did you lint
>> your rc files? Did you carefully check all score definitions (possibly
>> including user_prefs) for that rule? (Like using a wrong decimal
>> separator, or some invisible stray chars.)
>>
>
> That's puzzling for me too. I far as I can tell, a NaN can only happen
> as a result of floating point arithmetics (like: (-3)**0.5 ), or when
> directly specified in Perl code, e.g. $a = NaN; I don't think is can
> result from simple string conversions and the like.
>
> Catching which rule or pluging is trying to add it would help narrowing
> down the cause. I hope all score additions go through the now instrumented
> subroutine, otherwise someone more knowledgable in SA internals may
> indicate what additional code paths should add a test for a NaN.
>
> Mark
>

Hi,

thanks Mark and Guenther.

I patched the scoring code as indicated in Mark's mail, and when I run
spamassassin in debug mode I do see a warning about a NaN score:
[6443] warn: !!!!!!!! rules: score 'nan' for rule 'AWL' in 'AWL: ' 'From: address is in the auto white-list' at /usr/share/perl5/Mail/SpamAssassin/PerMsgStatus.pm line 2146.

The message is now correctly marked as spam, and no nan reference
is printed in the spam report (in debug mode, that is).

When I run spamassassin with the lint option, no errors pop up, only
this message:
warn: Character in 'C' format wrapped in pack at
/usr/share/perl5/Mail/SpamAssassin/Util.pm line 800.
If I edit Util.pm and change the my_inet_aton function (more
specifically, C4 to U4), I don't get that warning. But testing revealed
that it doesn't influence the scoring or the nan.

As for custom rules, I reinstalled SpamAssassin (with the purge option,
which on Debian should remove all directories as well) and I'm not using
any custom rules in my /etc/spamassassin/local.cf.
I do however alter scores in my local.cf (I started doing so because of
the nans influencing the scores).
Is there a way to check whether custom rules are being loaded?

Thanks,
Benedict


benedict.verheyen at gmail

Oct 13, 2008, 6:07 AM

Post #9 of 30 (381 views)
Re: spam score not counted correctly [In reply to]

Benedict Verheyen wrote:
<snip>

> Hi,
>
> thanks Mark and Guenther.
>
> I patched the score part as indicated in Mark's mail and when i run
> spamassassin in
> debug mode, i do see a message popping up with results to a NaN score:
> [6443] warn: !!!!!!!! rules: score 'nan' for rule 'AWL' in 'AWL: '
> 'From: address is in the auto white-list' at /usr/share/perl5/Mail/
> SpamAssassin/PerMsgStatus.pm line 2146.
>
> The message is now correctly marked as spam and no nan reference
> is printed in the spam report (in debug mode that is)
>
> When i run spamassassin with the lint option, no errors pop up, only
> this message:
> warn: Character in 'C' format wrapped in pack at
> /usr/share/perl5/Mail/SpamAssassin/Util.pm line 800.
> If i edit Util.pm and change the sub my_inet_aton function (more
> specific C4 to U4) i
> don't get that warning. But testing revealed that it doesn't influence
> the scoring/nan.
>
> As for custom rules, i reinstalled spamassassin (with purge option in
> debian it should be
> removing all dirs as well) and i'm not using any custom rules in my
> /etc/spamassassin/local.cf.
> I do however alter scores (started because of the nan's influencing the
> scores) in my local.cf
> Is there a way to check if i have custom rules being loaded?
>
> Thanks,
> Benedict

I got a message that again scored a nan for MSOE_MID_WRONG_CASE
The mail is available here:
http://paste-it.net/public/r3df8b2/

Weird thing is that the lines i added to PerMsgStatus.pm weren't showing up.

Regards,
Benedict


benedict.verheyen at gmail

Oct 13, 2008, 8:20 AM

Post #10 of 30 (380 views)
Re: spam score not counted correctly [In reply to]

Benedict Verheyen wrote:
<snip>
>
> I got a message that again scored a nan for MSOE_MID_WRONG_CASE
> The mail is available here:
> http://paste-it.net/public/r3df8b2/
>
> Weird thing is that the lines i added to PerMsgStatus.pm weren't showing up.
>
> Regards,
> Benedict

My bad: I had reinstalled SpamAssassin again, so the change to the code
was gone. I applied Mark's patch again, and when I then run the message
in debug mode, it is flagged as spam.

The question remains: what is causing the nan?

Regards,
Benedict


Mark.Martinec+sa at ijs

Oct 13, 2008, 8:39 AM

Post #11 of 30 (378 views)
Re: spam score not counted correctly [In reply to]

Benedict,

> Thing is, what is causing the nan?

My guess is that a NaN somehow got into your AWL database.

I have reopened bug 3364, and attached a richer patch:
"Deal with NaN in AutoWhitelist and PerMsgStatus"
which includes my previous patch and also instruments
AutoWhitelist module to check for NaN on data entering
or leaving a database. It should produce warnings and
ignore such data. Please try it.

Mark


linux4bene at telenet

Oct 13, 2008, 11:05 AM

Post #12 of 30 (365 views)
Re: spam score not counted correctly [In reply to]

Mark Martinec wrote:
> Benedict,
>
>
>> Thing is, what is causing the nan?
>>
>
> My guess is that a NaN somehow got into your AWL database.
>
> I have reopened bug 3364, and attached a richer patch:
> "Deal with NaN in AutoWhitelist and PerMsgStatus"
> which includes my previous patch and also instruments
> AutoWhitelist module to check for NaN on data entering
> or leaving a database. It should produce warnings and
> ignore such data. Please try it.
>
> Mark
>

Mark,

I applied the new patch and ran a new debug session.
This is part of the result.
The log contains a reference to a NaN score that is now being ignored,
so that is good.
A few lines later, however, it calculates a new total score (totscore),
and that is "nan":
dbg: auto-whitelist: add_score: new count: 6, new totscore: nan
The end result is good, however, as the message is flagged as spam.
This is the excerpt from the debug run.

[1467] warn: !!!!!! rules: score 'nan' for rule 'MSOE_MID_WRONG_CASE' in '' 'MSOE_MID_WRONG_CASE' at /usr/share/perl5/Mail/SpamAssassin/PerMsgStatus.pm line 2147.
[1467] dbg: check: running tests for priority: 1000
[1467] dbg: rules: running head tests; score so far=7.116
[1467] dbg: rules: compiled head tests
[1467] dbg: config: using "/root/.spamassassin" for user state dir
[1467] dbg: locker: safe_lock: created /root/.spamassassin/auto-whitelist.mutex
[1467] dbg: locker: safe_lock: trying to get lock on /root/.spamassassin/auto-whitelist with 30 timeout
[1467] dbg: locker: safe_lock: link to /root/.spamassassin/auto-whitelist.mutex: link ok
[1467] dbg: auto-whitelist: tie-ing to DB file of type DB_File R/W in /root/.spamassassin/auto-whitelist
[1467] dbg: auto-whitelist: db-based edwinkldaniels[at]gmail.com|ip=84.126 scores 5/nan
[1467] warn: auto-whitelist: totscore for (edwinkldaniels[at]gmail.com, 84.126.65.162) is a NaN, ignored
[1467] dbg: auto-whitelist: AWL active, pre-score: 7.116, autolearn score: 7.116, mean: undef, IP: 84.126.65.162
[1467] dbg: auto-whitelist: add_score: new count: 6, new totscore: nan
[1467] dbg: auto-whitelist: DB addr list: untie-ing and unlocking
[1467] dbg: auto-whitelist: DB addr list: file locked, breaking lock
[1467] dbg: locker: safe_unlock: unlocked /root/.spamassassin/auto-whitelist.mutex
[1467] dbg: auto-whitelist: post auto-whitelist score: 7.116
[1467] dbg: rules: running body tests; score so far=7.116
[1467] dbg: rules: compiled body tests
[1467] dbg: rules: running uri tests; score so far=7.116
[1467] dbg: rules: compiled uri tests
[1467] dbg: rules: running rawbody tests; score so far=7.116
[1467] dbg: rules: compiled rawbody tests
[1467] dbg: rules: running full tests; score so far=7.116
[1467] dbg: rules: compiled full tests
[1467] dbg: rules: running meta tests; score so far=7.116
[1467] dbg: rules: compiled meta tests
[1467] dbg: plugin: Mail::SpamAssassin::Plugin::AutoLearnThreshold=HASH(0x910f440) implements 'autolearn_discriminator', priority 0
[1467] dbg: learn: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1
[1467] dbg: learn: auto-learn: message score: 7.116, computed score for autolearn: 7.199
[1467] dbg: learn: auto-learn? ham=0.1, spam=12, body-points=7.199, head-points=7.199, learned-points=1
[1467] dbg: learn: auto-learn? no: inside auto-learn thresholds, not considered ham or spam
[1467] dbg: check: is spam? score=7.116 required=4
[1467] dbg: check: tests=ADVANCE_FEE_2,BAYES_60,FORGED_MUA_OUTLOOK
[1467] dbg: check: subtests=__ANY_OUTLOOK_MUA,__CT,__CTE,__CTYPE_CHARSET_QUOTED,__CT_TEXT_PLAIN,__DOS_HAS_ANY_URI,__DOS_RCVD_MON,__DOS_RELAYED_EXT,__ENV_AND_HDR_FROM_MATCH,__FH_HAS_XMSMAIL,__FH_HAS_XPRIORITY,__FORGED_OE,__FRAUD_DBI,__FRAUD_IOU,__FRAUD_MCQ,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_MIMEOLE,__HAS_MSGID,__HAS_MSMAIL_PRI,__HAS_RCVD,__HAS_SUBJECT,__HAS_X_MAILER,__KAM_LOTTO3,__LAST_UNTRUSTED_RELAY_NO_AUTH,__MIMEOLE_MS,__MIME_VERSION,__MISSING_REF,__MSGID_OK_DIGITS,__MSOE_MID_WRONG_CASE,__NONEMPTY_BODY,__NO_INR_YES_REF,__OE_MUA,__SANE_MSGID,__TOCC_EXISTS,__TVD_BODY,__TVD_MIME_ATT_TP,__XM_MSOE6,__XM_MS_IN_GENERAL,__XM_OUTLOOK_EXPRESS
Return-Path: <edwinkldaniels[at]gmail.com>
X-Spam-Flag: YES
....

Thanks,
Benedict
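
The excerpt above shows a stored entry of "5/nan" (count 5, totscore nan) and, after the update, "new count: 6, new totscore: nan". The Python sketch below is a toy model of such a count/total pair, not the AutoWhitelist module's code or storage format; it illustrates why ignoring the NaN on read is not enough, since adding any finite score to a NaN total still yields NaN. The class name and the `add_score_repairing` method are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AwlEntry:
    """Toy model of a stored (count, totscore) pair; not SpamAssassin's data format."""

    count: int = 0
    totscore: float = 0.0

    def mean(self):
        # Treat an empty or NaN-poisoned entry as having no usable history.
        if self.count == 0 or math.isnan(self.totscore):
            return None
        return self.totscore / self.count

    def add_score(self, score: float) -> None:
        # Naive update: a NaN total stays NaN no matter what is added to it.
        self.count += 1
        self.totscore += score

    def add_score_repairing(self, score: float) -> None:
        # Hypothetical repair: restart the history when the stored total is NaN.
        if math.isnan(self.totscore):
            self.count, self.totscore = 0, 0.0
        self.add_score(score)


poisoned = AwlEntry(count=5, totscore=float("nan"))
poisoned.add_score(7.116)

repaired = AwlEntry(count=5, totscore=float("nan"))
repaired.add_score_repairing(7.116)

print(poisoned, repaired)
```

Under this model, a poisoned entry stays poisoned until it is reset or removed from the database.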


guenther at rudersport

Oct 13, 2008, 1:16 PM

Post #13 of 30 (376 views)
Re: spam score not counted correctly [In reply to]

On Mon, 2008-10-13 at 17:39 +0200, Mark Martinec wrote:

> > Thing is, what is causing the nan?
>
> My guess is that a NaN somehow got into your AWL database.

Things are much more complicated, or rather weird, than that.

According to Benedict's reports and pasted snippets, he got an NaN score
for at least 3 rules: FROM_ILLEGAL_CHARS, AWL, MSOE_MID_WRONG_CASE

Also, it appears that he actually is using custom headers. The initial
snippets showed a stray " char at the beginning of X-Spam-Status. The
last paste does not show that header at all.

It might be noteworthy, that X-Spam-Flag and the verbose X-Spam-Report
are either added unconditionally, or this is just a side effect when
comparing to the spam threshold (not exceeding the threshold, thus "no",
but yet not below the threshold, thus not excluded).


Benedict, since I asked about custom headers before, it might be a good
idea to carefully check the config and answer my previous question.
Since you're not using custom rules, but change scores, you likely
copied (read: inherited) that part from your previous cf files. How
carefully did you check this and *any* customization?

Also, please do check the config for invisible, stray chars, as I
pointed out before. This includes user_prefs, if any. On that note: does
this affect only a single user (if using per-user settings)?


A wild guess: Since the affected rule/score varies wildly, might the
culprit by any chance be bad RAM?

guenther




Mark.Martinec+sa at ijs

Oct 13, 2008, 5:03 PM

Post #14 of 30 (375 views)
Re: spam score not counted correctly [In reply to]

Guenther, Benedict,

> > My guess is that a NaN somehow got into your AWL database.
>
> Things are much more complicated, or rather weird, than that.
>
> According to Benedict's reports and pasted snippets, he got an NaN score
> for at least 3 rules: FROM_ILLEGAL_CHARS, AWL, MSOE_MID_WRONG_CASE

You are quite right, I wasn't paying attention. The NaN from AWL is
more likely than not just a side effect of a NaN produced previously
by other rules.

With some luck, the patch (my first or the second) could also trigger
a warning on a primary cause when it occurs, not just on AWL.

Mark


benedict.verheyen at telenet

Oct 14, 2008, 12:03 AM

Post #15 of 30 (372 views)
Re: spam score not counted correctly [In reply to]

Karsten Bräckelmann wrote:
> Benedict, since I asked about custom headers before, it might be a good
> idea to carefully check the config and answer my previous question.
> Since you're not using custom rules, but change scores, you likely
> copied (read: inherited) that part from your previous cf files. How
> carefully did you check this and *any* customization?
>
Hi,

As I said, to my knowledge I'm not using any custom headers, and I asked
how I could know for sure, as it's not clear to me how to check whether
I'm using custom headers. As I said before, that's the reason why I
reinstalled SpamAssassin: to make sure I don't have any leftovers
from earlier configs.
I posted my config here: http://www.heimdallit.be/download/local.cf

> Also, please do check the config for invisible, stray chars, as I
> pointed out before. This includes user_prefs, if any. At that note: Does
> this affect only a single user (if using per user settings)?

I'm the only one actively using the machine.
> A wild guess: Since the affected rule/score varies wildly, might the
> culprit by any chance be bad RAM?
>
Bad RAM? I seriously doubt it, but I could test it with a live CD that has
the memtest program.

Regards,
Benedict


benedict.verheyen at gmail

Oct 14, 2008, 2:20 AM

Post #16 of 30 (373 views)
Re: spam score not counted correctly [In reply to]

Mark Martinec wrote:
> Guenther, Benedict,
>
>>> My guess is that a NaN somehow got into your AWL database.
>> Things are much more complicated, or rather weird, than that.
>>
>> According to Benedict's reports and pasted snippets, he got an NaN score
>> for at least 3 rules: FROM_ILLEGAL_CHARS, AWL, MSOE_MID_WRONG_CASE
>
> You are quite right, I wasn't paying attention. The NaN from AWL is
> more likely than not just a side effect of a NaN produced previously
> by other rules.
>
> With some luck, the patch (my first or the second) could also trigger
> a warning on a primary cause when it occurs, not just on AWL.
>
> Mark
>

Hi,

i have tested with another spam message that has a combined score of
22.5 and it's not flagged as spam.
The full debug log is here:
http://www.heimdallit.be/download/spam_debug_1.txt

A few weird things about that debug session:

1. AWL is spitting out a nan.

[2385] dbg: auto-whitelist: AWL active, pre-score: 22.5, autolearn score: nan, mean: 11.5, IP: 201.173.156.158
[2385] warn: auto-whitelist: attempt to add a nan to AWL entry ignored
[2385] warn: !!!!!!!!! rules: score 'nan' for rule 'AWL' in 'AWL: ' 'From: address is in the auto white-list' at /usr/share/perl5/Mail/SpamAssassin/PerMsgStatus.pm line 2146.

2. Bayes doesn't autolearn from a message that scores 22.5. It says it
wants spam but that the message was ham...

[2385] dbg: learn: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1
[2385] dbg: learn: auto-learn: message score: 22.5, computed score for autolearn: 18.5
[2385] dbg: learn: auto-learn? ham=0.1, spam=12, body-points=18.5, head-points=18.5, learned-points=4
[2385] dbg: learn: auto-learn? no: scored as ham but autolearn wanted spam


Questions:

1. Can i reset the AWL database? I do have users listed on the whitelist
in /etc/spamassassin/local.cf (whitelist_from)
2. How do i find out if i have custom headers?
As i said, it's a new install of SpamAssassin so i haven't got a clue
where the custom (if they are used) headers are coming from

Anyway, thanks for the time you already put in. I appreciate it.

Regards,
Benedict


benedict.verheyen at gmail

Oct 14, 2008, 2:20 AM

Post #17 of 30 (372 views)
Re: spam score not counted correctly [In reply to]

Benedict Verheyen wrote:
<snip>

Some more interesting stuff from /var/log/syslog:

Oct 14 09:15:08 loki spamd[1274]: auto-whitelist: attempt to add a nan to AWL entry ignored
Oct 14 09:15:08 loki spamd[1274]: !!!!!! rules: score 'nan' for rule 'AWL' in 'AWL: ' 'From: address is in the auto white-list' at /usr/share/perl5/Mail/SpamAssassin/PerMsgStatus.pm line 2147.

Oct 14 09:16:02 loki spamd[2256]: !!!!!! rules: score 'nan' for rule 'MISSING_SUBJECT' in '' 'Missing Subject: header' at /usr/share/perl5/Mail/SpamAssassin/PerMsgStatus.pm line 2147.

Oct 14 10:10:23 loki spamd[1321]: plugin: eval failed: Sort subroutine didn't return a numeric value at /usr/share/perl5/Mail/SpamAssassin/AsyncLoop.pm line 278.

The first two were already seen in the debug runs, but the last one wasn't.

It has to do with this:

sub log_lookups_timing {
  my ($self) = @_;
  my $timings = $self->{timing_by_query};
  for my $key (sort { $timings->{$a} <=> $timings->{$b} } keys %$timings) {
    dbg("async: timing: %.3f %s", $timings->{$key}, $key);
  }
}

I run several UML instances, so could it be that the timing is causing
NaNs to appear?

Regards,
Benedict
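
In Perl, the `<=>` operator returns undef when either operand is NaN, which would be consistent with the "Sort subroutine didn't return a numeric value" failure if one of the timing values were NaN. The Python sketch below mirrors the function quoted above but sets NaN timings aside before sorting; the function name `sorted_timings` and the dictionary contents are made up for illustration.

```python
import math


def sorted_timings(timing_by_query: dict) -> tuple:
    """Return (timings sorted by duration, keys whose timing is NaN)."""
    finite = {k: v for k, v in timing_by_query.items() if not math.isnan(v)}
    broken = sorted(k for k, v in timing_by_query.items() if math.isnan(v))
    ordered = sorted(finite.items(), key=lambda kv: kv[1])
    return ordered, broken


# Hypothetical lookup timings, one of them NaN.
timings = {
    "dns:A:example.org": 0.142,
    "dns:TXT:example.net": float("nan"),
    "dns:A:example.com": 0.031,
}
ordered, broken = sorted_timings(timings)
for key, seconds in ordered:
    print(f"async: timing: {seconds:.3f} {key}")
```

Reporting the `broken` keys separately would also show which lookups produced the NaN timings in the first place.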


guenther at rudersport

Oct 14, 2008, 6:32 AM

Post #18 of 30 (363 views)
Re: spam score not counted correctly [In reply to]

On Tue, 2008-10-14 at 09:03 +0200, Benedict Verheyen wrote:
> Karsten Bräckelmann wrote:
> > Benedict, since I asked about custom headers before, it might be a good
> > idea to carefully check the config and answer my previous question.
> > Since you're not using custom rules, but change scores, you likely
> > copied (read: inherited) that part from your previous cf files. How
> > carefully did you check this and *any* customization?
>
> as i said, to my knowledge, i'm not using any custom headers and i
> asked how i could know for sure as it's not clear to me how to check

Ah, sorry, kind of forgot about that. Well, posting your cf files is one
option. ;) Another one is to read the configuration and check back with
the docs. [1] In particular, see Basic Message tagging Options.

Anyway, just as I suspected in a previous post, your custom headers from
local.cf:
clear_headers
add_header all Flag _YESNOCAPS_
add_header all Report _REPORT_

This might be relevant to bug 3364 [2]; it definitely matches the
summary. Can you still reproduce these NaN scores if you comment out
the above options?


> if i'm using customer headers. As i said before, that's the reason why
> i reinstalled SpamAssassin, to make sure i don't have any left overs
> from earlier configs. I posted my config here:
> http://www.heimdallit.be/download/local.cf

FWIW, your custom scores are "left overs from earlier configs".

I assume the first blob of score adjustments are the (previously set)
custom scores for which you've seen NaN results? That's quite a lot and
appears to affect rules randomly.


> > A wild guess: Since the affected rule/score varies wildly, might the
> > culprit by any chance be bad RAM?
> >
> Bad ram? I seriously doubt but i could test it with a live cd that has
> the memtest program.

I'd check that, yeah. Rule's scores are set to NaN randomly and
widespread. Plus the other kind of scary warnings and issues you've
mentioned in this thread...

guenther


[1] http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html
[2] https://issues.apache.org/SpamAssassin/show_bug.cgi?id=3364



guenther at rudersport

Oct 14, 2008, 7:01 AM

Post #19 of 30 (363 views)
Re: spam score not counted correctly [In reply to]

On Tue, 2008-10-14 at 10:04 +0200, Benedict Verheyen wrote:
> i have tested with another spam message that has a combined score of
> 22.5 and it's not flagged as spam.
> The full debug log is here:
> http://www.heimdallit.be/download/spam_debug_1.txt

Hmm, does that say that a bunch of major RBLs is timing out?


> 2. Bayes doesn't autolearn from a message that scores 22.5. It says it
> wants spam but that the message was ham...

Without reading the code, this might be because NaN doesn't exceed the
spam threshold.
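
A simplified Python sketch of that guess, under stated assumptions: the thresholds (ham=0.1, spam=12, required=4) are taken from the debug lines in this thread, while the function, its return labels, and the decision order are illustrative and not the AutoLearnThreshold plugin's code. It uses the values from the first debug run (message score nan, computed autolearn score 18.646).

```python
def autolearn_decision(message_score, autolearn_score, required=4.0,
                       ham_threshold=0.1, spam_threshold=12.0):
    """Simplified sketch of the decision suggested by the debug lines."""
    # False whenever message_score is NaN, because NaN >= x is always False.
    flagged_spam = message_score >= required
    if autolearn_score >= spam_threshold:
        # Refuse to learn as spam when the verdict itself was "not spam".
        return "learn-spam" if flagged_spam else "skip: verdict mismatch"
    if autolearn_score <= ham_threshold:
        return "skip: verdict mismatch" if flagged_spam else "learn-ham"
    return "skip: inside thresholds"


print(autolearn_decision(float("nan"), 18.646))
print(autolearn_decision(7.116, 7.199))
```

Under these assumptions a NaN message score always ends in a mismatch, whatever the recomputed autolearn score is.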

> [2385] dbg: learn: auto-learn: currently using scoreset 3, recomputing
> score based on scoreset 1
> [2385] dbg: learn: auto-learn: message score: 22.5, computed score for
> autolearn: 18.5
> [2385] dbg: learn: auto-learn? ham=0.1, spam=12, body-points=18.5,
> head-points=18.5, learned-points=4
> [2385] dbg: learn: auto-learn? no: scored as ham but autolearn wanted spam

Huh. body-points == head-points? Also, shouldn't the sum generally be in
the range of the total score or less, rather than almost twice that?
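
The arithmetic behind that question can be checked directly from the quoted debug lines. The sketch below only parses the numbers out of the log text and compares them; it makes no claim about how SpamAssassin computes body-points and head-points internally. The helper `field` is made up for this example.

```python
import re

# The two debug lines quoted above, joined back into single lines.
LOG = (
    "[2385] dbg: learn: auto-learn: message score: 22.5, computed score for "
    "autolearn: 18.5\n"
    "[2385] dbg: learn: auto-learn? ham=0.1, spam=12, body-points=18.5, "
    "head-points=18.5, learned-points=4\n"
)

def field(name: str, text: str) -> float:
    """Return the number that follows `name` and either '=' or ':' in the text."""
    match = re.search(re.escape(name) + r"(?:=|:\s*)(-?\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError(f"{name} not found")
    return float(match.group(1))

message_score = field("message score", LOG)
computed = field("computed score for autolearn", LOG)
body = field("body-points", LOG)
head = field("head-points", LOG)

print("body + head:", body + head)
print("message score:", message_score, "computed for autolearn:", computed)
print("sum exceeds message score:", body + head > message_score)
```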


> Questions:
>
> 1. Can i reset the AWL database?

You can reset the AWL by simply removing the file.
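
A slightly more cautious variant is to move the file aside instead of deleting it, so the old data can be restored later. This is a sketch only: `reset_awl` is a hypothetical helper, and the real file location depends on your setup (the per-user default is assumed to be `~/.spamassassin/auto-whitelist` unless `auto_whitelist_path` says otherwise). The demo at the bottom runs against a temporary file, not a real database.

```python
import shutil
import time
from pathlib import Path
from typing import Optional

def reset_awl(awl_path: Path) -> Optional[Path]:
    """Move the AWL file aside; return the backup path, or None if no file existed."""
    if not awl_path.exists():
        return None
    # Timestamped suffix so repeated resets do not overwrite earlier backups.
    backup = awl_path.with_name(awl_path.name + time.strftime(".bak-%Y%m%d-%H%M%S"))
    shutil.move(str(awl_path), str(backup))
    return backup

if __name__ == "__main__":
    import tempfile
    # Demonstrate on a throwaway file rather than touching a real AWL database.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "auto-whitelist"
        demo.write_bytes(b"dummy")
        print(reset_awl(demo))
```

It is probably wise to stop spamd before moving the file and start it again afterwards.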

> I do have users listed on the whitelist in /etc/spamassassin/local.cf
> (whitelist_from)

No, you don't. Not in the local.cf you pasted somewhere else in this
thread. (Also, FWIW, AWL is not related to whitelist_* options.)


Syslog output from a follow-up post:

> Oct 14 10:10:23 loki spamd[1321]: plugin: eval failed: Sort subroutine
> didn't return a numeric value at /usr/share/perl5/Mail/SpamAssassin/
> AsyncLoop.pm line 278.

This seems *really* odd. Clearly, something's broken.

> I run several uml's so could it be that the timing is causing nan's to
> appear?

UML?

guenther




benedict.verheyen at gmail

Oct 14, 2008, 7:19 AM

Post #20 of 30 (363 views)
Permalink
Re: spam score not counted correctly [In reply to]

Karsten Bräckelmann wrote:
> On Tue, 2008-10-14 at 10:04 +0200, Benedict Verheyen wrote:
>
>> i have tested with another spam message that has a combined score of
>> 22.5 and it's not flagged as spam.
>> The full debug log is here:
>> http://www.heimdallit.be/download/spam_debug_1.txt
>>
> Hmm, does that say that a bunch of major RBLs is timing out?
>
First, UML is a virtual machine infrastructure. See
http://user-mode-linux.sourceforge.net/
The timeouts happen because when i let SpamAssassin test whether dns is
available, the test apparently doesn't work.
When i specify in local.cf that dns is available, it works as it
should.
>> [2385] dbg: learn: auto-learn: currently using scoreset 3, recomputing
>> score based on scoreset 1
>> [2385] dbg: learn: auto-learn: message score: 22.5, computed score for
>> autolearn: 18.5
>> [2385] dbg: learn: auto-learn? ham=0.1, spam=12, body-points=18.5,
>> head-points=18.5, learned-points=4
>> [2385] dbg: learn: auto-learn? no: scored as ham but autolearn wanted spam
>>
> Huh. body-points == head-points? Also, shouldn't the sum generally be in
> the range of total score or less, rather than almost twice?
>
>
No idea, i thought it was strange but i can't reproduce it in my latest
tests with the same message.
>> I do have users listed on the whitelist in /etc/spamassassin/local.cf
>> (whitelist_from)
>>
>
> No, you don't. Not in the local.cf you pasted somewhere else in this
> thread. (Also, FWIW, AWL is not related to whitelist_* options.)
I do have them, but i removed them from the local.cf i posted because not
everybody's email address is publicly known.

>> Oct 14 10:10:23 loki spamd[1321]: plugin: eval failed: Sort subroutine
>> didn't return a numeric value at /usr/share/perl5/Mail/SpamAssassin/
>> AsyncLoop.pm line 278.
>>
>
> This seems *really* odd. Clearly, something's broken.
>
Yup very strange indeed. After rebooting it doesn't seem to occur anymore.
The only thing that occurs regarding nan is this:

Oct 14 15:15:13 loki spamd[1393]: auto-whitelist: totscore for
(benedict.verheyen[at]telenet.be, 78.22.1.208) is a NaN, ignored

This again suggests that something is broken with my AWL. I think i'd
better delete it.

As it seems now, the only thing strange left is the AWL & related NaN.

Regards,
Benedict


guenther at rudersport

Oct 14, 2008, 7:25 AM

Post #21 of 30 (363 views)
Permalink
Re: spam score not counted correctly [In reply to]

On Tue, 2008-10-14 at 16:00 +0200, Benedict Verheyen wrote:
> Karsten Bräckelmann wrote:

> > This might be relevant WRT to bug 3364 [2], it definitely matches the
> > summary. Can you still reproduce these NaN scores, if you comment out
> > the above options?

> As for reproducing, see last part of this message.

> >>> A wild guess: Since the affected rule/score varies wildly, might the
> >>> culprit by any chance be bad RAM?
> >>>
> >> Bad ram? I seriously doubt but i could test it with a live cd that has
> >> the memtest program.
> >
> > I'd check that, yeah. Rule's scores are set to NaN randomly and
> > widespread. Plus the other kind of scary warnings and issues you've
> > mentioned in this thread...
>
> Yup the behaviour seems inconsistent.
>
> Anyway, i've done 2 more tests. First i retried and now the message is
> flagged as spam but the MSOE_MID_WRONG_CASE part isn't in there anymore
> and thus no Nan.

Hold on. That's the very same message you checked before? How the...
can a static rule like MSOE_MID_WRONG_CASE vanish from the result? It
should be 100% consistently reproducible. If something like this matches
randomly, this is another hint that data in memory is changing.

Hmm, any chance there are multiple (old?) SA installations rotting on
your system?

As for the message being flagged as spam, that doesn't appear to be much
of a surprise. With Mark's patch in place, any occurrence of NaN should
not invalidate the total score.


> When i comment the custom headers and try again, it's also flagged as
> spam and MSOE_MID_WRONG_CASE is also not in there anymore.

That's not reproducing. :)

The question is not about a single run, but if you get the NaN warning
for any rule, with any message. Please keep the header options commented
out (restart spamd, if you use it) and then watch your results. Do you
still get NaN warnings for rules?

No, we can't prove the absence. Based on the frequency this used to
happen, it might take a while to be confident that the issue doesn't
occur without custom headers.
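
One way to do that watching without reading logs by hand is to count NaN mentions from spamd per day. The sketch below assumes classic syslog lines that start with a "Mon DD" date, as in the log excerpts posted in this thread. The exact wording of the per-rule NaN warning is not shown here, so the pattern simply looks for the word "NaN" in any case. `nan_warnings_per_day` is a made-up helper name.

```python
import re
from collections import Counter

# Matches "NaN", "nan", "Nan" as a whole word; "numeric" does not match.
NAN_RE = re.compile(r"\bnan\b", re.IGNORECASE)

def nan_warnings_per_day(lines):
    """Count spamd log lines mentioning NaN, keyed by the 'Mon DD' syslog date."""
    counts = Counter()
    for line in lines:
        if "spamd[" in line and NAN_RE.search(line):
            counts[line[:6]] += 1
    return counts

# Two lines taken from this thread, rejoined into single syslog lines.
SAMPLE = (
    "Oct 14 10:10:23 loki spamd[1321]: plugin: eval failed: Sort subroutine "
    "didn't return a numeric value at "
    "/usr/share/perl5/Mail/SpamAssassin/AsyncLoop.pm line 278.\n"
    "Oct 14 15:15:13 loki spamd[1393]: auto-whitelist: totscore for "
    "(benedict.verheyen[at]telenet.be, 78.22.1.208) is a NaN, ignored\n"
)

print(dict(nan_warnings_per_day(SAMPLE.splitlines())))
```

To use it on a real log, pass an open file object instead of `SAMPLE.splitlines()`. The path of the mail log differs between systems.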

guenther -- this is a nick only ;)




guenther at rudersport

Oct 14, 2008, 7:43 AM

Post #22 of 30 (363 views)
Permalink
Re: spam score not counted correctly [In reply to]

On Tue, 2008-10-14 at 16:19 +0200, Benedict Verheyen wrote:
> Karsten Bräckelmann wrote:
> > On Tue, 2008-10-14 at 10:04 +0200, Benedict Verheyen wrote:
> >
> >> i have tested with another spam message that has a combined score of
> >> 22.5 and it's not flagged as spam.
> >> The full debug log is here:
> >> http://www.heimdallit.be/download/spam_debug_1.txt
> >>
> > Hmm, does that say that a bunch of major RBLs is timing out?
> >
> First, UML is a virtual machine infrastructure. See
> http://user-mode-linux.sourceforge.net/

> The timeouts happen because when i let SpamAssassin test whether dns is
> available, the test apparently doesn't work. When i specify in local.cf
> that dns is available, it works as it should.

Unrelated. The debug output clearly shows DNS tests being run. So the
dns_available test and explicitly enabling DNS gave the same result.


> > Huh. body-points == head-points? Also, shouldn't the sum generally be in
> > the range of total score or less, rather than almost twice?
>
> No idea, i thought it was strange but i can't reproduce it in my latest
> tests with the same message.

"Can't reproduce" either is bad for debugging, or just plain scary.


> > I do have users listed on the whitelist in /etc/spamassassin/local.cf
> >> (whitelist_from)
> >
> > No, you don't. Not in the local.cf you pasted somewhere else in this
> > thread. (Also, FWIW, AWL is not related to whitelist_* options.)
>
> I do have them but i removed them from the local.cf i posted because not
> everybody's email is known publicly.

*sigh* Please, if you must remove confidential data, either just mask it
by overwriting the real address, or at the very least mention it. In NO
case, however unrelated it might seem, silently remove it and claim
that's your config.
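
Masking is easy to automate before pasting a config. The following is a rough sketch, not a complete anonymizer: it only touches lines that start with `whitelist_`, replaces anything that looks like `local@domain` (including globs such as `*@domain`) with a numbered placeholder, and leaves everything else alone. The helper name `mask_whitelist` and the sample addresses are invented.

```python
import itertools
import re

# Anything of the form something@something without whitespace.
ADDRESS_RE = re.compile(r"[^\s@]+@[^\s@]+")

def mask_whitelist(config_text: str) -> str:
    """Replace addresses on whitelist_* lines with numbered placeholders."""
    numbers = itertools.count(1)
    masked = []
    for line in config_text.splitlines():
        if line.lstrip().startswith("whitelist_"):
            line = ADDRESS_RE.sub(
                lambda m: f"masked{next(numbers)}@example.invalid", line
            )
        masked.append(line)
    return "\n".join(masked)

sample = "required_score 4.0\nwhitelist_from alice@example.org bob@example.net\n"
print(mask_whitelist(sample))
```

The line count and the option names stay visible, so anyone reading the config can still see that whitelist entries exist.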


> >> Oct 14 10:10:23 loki spamd[1321]: plugin: eval failed: Sort subroutine
> >> didn't return a numeric value at /usr/share/perl5/Mail/SpamAssassin/
> >> AsyncLoop.pm line 278.
> >
> > This seems *really* odd. Clearly, something's broken.
>
> Yup very strange indeed. After rebooting it doesn't seem to occur anymore.
> The only thing that occurs regarding nan is this:
>
> Oct 14 15:15:13 loki spamd[1393]: auto-whitelist: totscore for
> (benedict.verheyen[at]telenet.be, 78.22.1.208) is a NaN, ignored

Looks like NaN sneaked into your AWL before. This is not the same as
getting NaN as a rule's score.

> This again suggests that something is broken with my AWL. I think i'd
> better delete it.

Yup...

> As it seems now, the only thing strange left is the AWL & related NaN.

guenther




Mark.Martinec+sa at ijs

Oct 14, 2008, 8:04 AM

Post #23 of 30 (363 views)
Permalink
Re: spam score not counted correctly [In reply to]

Benedict,


> This again suggests that something is broken with my AWL. I think i'd
> better delete it.
> As it seems now, the only thing strange left is the AWL & related NaN.

Please don't delete your AWL. I'll provide a patch which will reset a
bad entry when it encounters one, so your db will be a good testground.

Guenther wrote:
> Looks like NaN sneaked into your AWL before. This is not the same as
> getting NaN as a rule's score.

Right. A NaN in AWL is just a consequence of your other rules
producing a NaN result. Let's concentrate on how this came to be.
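
Why a single NaN score is enough to spoil an AWL entry can be shown with a toy running total. This is only an illustration of NaN propagation, with a guarded variant that skips NaN scores and restarts a poisoned entry. It is not the patch mentioned above and does not reflect the real AWL storage format; both function names are hypothetical.

```python
import math

def add_to_total(total: float, count: int, score: float):
    """Hypothetical AWL-style update: keep a running total and a message count."""
    return total + score, count + 1

def add_to_total_guarded(total: float, count: int, score: float):
    """Same update, but skip NaN scores and start over if the stored total is NaN."""
    if math.isnan(score):
        return total, count          # ignore the bad score, keep the entry as is
    if math.isnan(total):
        total, count = 0.0, 0        # stored entry is poisoned, reset it
    return total + score, count + 1

# Unguarded: once a NaN is added, the total stays NaN for good.
total, count = 0.0, 0
for score in (3.9, float("nan"), 22.5):
    total, count = add_to_total(total, count, score)
print(total, count)

# Guarded: an already poisoned entry is reset by the next good score.
total, count = add_to_total_guarded(float("nan"), 3, 22.5)
print(total, count)
```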

Mark


benedict.verheyen at telenet

Oct 14, 2008, 8:16 AM

Post #24 of 30 (363 views)
Permalink
Re: spam score not counted correctly [In reply to]

Mark Martinec wrote:
> Benedict,
>
>> This again suggests that something is broken with my AWL. I think i'd
>> better delete it.
>> As it seems now, the only thing strange left is the AWL & related NaN.
>>
>
> Please don't delete your AWL. I'll provide a patch which will reset a
> bad entry when it encounters one, so your db will be a good testground.
>
I already deleted it, but i had a backup, so the original is already restored.
I don't know if it's an option, but i could always zip the auto-whitelist
and put it on my site so you can download it for tests?
>> Looks like NaN sneaked into your AWL before. This is not the same as
>> getting NaN as a rule's score.
>>
>
> Right. A NaN in AWL is just a consequence of your other rules
> producing a NaN result. Let's concentrate on how this came to be.
>
>
OK, i'll keep on monitoring my spam. Where's spam when you need it :)

Regards,
Benedict


benedict.verheyen at gmail

Oct 14, 2008, 8:20 AM

Post #25 of 30 (364 views)
Permalink
Re: spam score not counted correctly [In reply to]

Karsten Bräckelmann wrote:

>> as i said, to my knowledge, i'm not using any custom headers and i
>> asked how i could know for sure as it's not clear to me how to check
>
> Ah, sorry, kind of forgot about that. Well, posting your cf files is one
> option. ;) Another one is to read the configuration and check back with
> the docs. [1] In particular, see Basic Message tagging Options.
>
> Anyway, just as I suspected in a previous post, your custom headers from
> local.cf:
> clear_headers
> add_header all Flag _YESNOCAPS_
> add_header all Report _REPORT_
>
> This might be relevant WRT to bug 3364 [2], it definitely matches the
> summary. Can you still reproduce these NaN scores, if you comment out
> the above options?
>

Hi Guenther,


You are right. I previously had another custom header in there with
_HITS_ but i removed it because i first thought this was causing the error.
As for cf's, my local.cf can be found here:
http://www.heimdallit.be/download/local.cf

The other cf's in there are init.pre, v310.pre, v312.pre, v320.pre and
65_debian.cf. I didn't put any of them there; i suspect they are
installed by default by the Debian package.

As for reproducing, see last part of this message.

> FWIW, your custom scores are "left overs from earlier configs".
>
> I assume the first blob of score adjustments are the (previously set)
> custom scores for which you've seen NaN results? That's quite a lot and
> appears to affect rules randomly.

Yes indeed, but not all of them. I think i've tried for some 6 or 7
rules to get around the nan scores by putting my own score for them in
my local.cf. The rest is just a way of increasing the scores to ensure
that spam messages get a high enough score.


>>> A wild guess: Since the affected rule/score varies wildly, might the
>>> culprit by any chance be bad RAM?
>>>
>> Bad ram? I seriously doubt but i could test it with a live cd that has
>> the memtest program.
>
> I'd check that, yeah. Rule's scores are set to NaN randomly and
> widespread. Plus the other kind of scary warnings and issues you've
> mentioned in this thread...

Yup the behaviour seems inconsistent.

Anyway, i've done 2 more tests. First i retried and now the message is
flagged as spam but the MSOE_MID_WRONG_CASE part isn't in there anymore
and thus no Nan.
When i comment the custom headers and try again, it's also flagged as
spam and MSOE_MID_WRONG_CASE is also not in there anymore.


Regards,
Benedict
