MailBee.NET AntiSpam

 AfterLogic Forum : MailBee.NET AntiSpam
Topic: Antispam training
Loic (Newbie)
Joined: 29 November 2012
Posts: 5
Posted: 08 January 2013 at 7:12am

Hi!

For unit tests, I am trying the following steps:

1. Create 10 MailMessages and train the antispam filter to detect them as spam => OK
2. Create 10 MailMessages and train the antispam filter to detect them as NON-spam => OK
3. Test the score of each mail => OK (10 spam and 10 non-spam detected)
4. Train the antispam filter with the 10 MailMessages previously marked as NON-spam so that it detects them as spam => OK
5. Test the score of each mail again => PROBLEM: no mails are marked as spam.

I don't reload the antispam filter during the test.
I only use TrainFilter, SaveDatabase and ScoreMessage.
What is the logic behind training the antispam filter?
Igor (AfterLogic Support)
Joined: 24 June 2008
Location: United States
Posts: 6104
Posted: 08 January 2013 at 11:59pm

A Bayesian filter is probability-based, which is why it is usually trained with hundreds of mails to get decent spam detection results. We have no idea what the spam score will be if you train the filter with the same messages on both the "Spam" and "Not Spam" sides. We might assume that the number of non-spam mails was set to zero that way, and the filter of course needs both spam and non-spam training.
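
As a rough illustration of why step 5 can behave this way, here is a toy naive-Bayes-style filter in Python. It is not MailBee.NET code and does not reproduce its database format or scoring; the class ToyBayesFilter, its train and score methods, the 0 to 100 scale with 50 as neutral, and the sample messages are all assumptions made for this sketch. It also assumes that retraining only adds counts and never removes a message from the other side, which may differ from what the real filter does.

```python
import math
from collections import Counter


class ToyBayesFilter:
    """Toy naive-Bayes-style spam scorer (hypothetical, not MailBee.NET)."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Count each distinct token once per message; counts are only ever added.
        self.token_counts[label].update(set(text.lower().split()))
        self.message_counts[label] += 1

    def score(self, text):
        """Return a spam score in 0..100; 50 means no evidence either way."""
        log_odds = 0.0
        for token in set(text.lower().split()):
            # Laplace smoothing keeps both probabilities strictly between 0 and 1.
            p_spam = (self.token_counts["spam"][token] + 1) / (self.message_counts["spam"] + 2)
            p_ham = (self.token_counts["ham"][token] + 1) / (self.message_counts["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 100.0 / (1.0 + math.exp(-log_odds))


# Made-up sample messages standing in for the 10 + 10 MailMessages.
spam_mails = [f"cheap pills offer {i}" for i in range(10)]
ham_mails = [f"project meeting notes {i}" for i in range(10)]

flt = ToyBayesFilter()
for mail in spam_mails:          # step 1
    flt.train(mail, "spam")
for mail in ham_mails:           # step 2
    flt.train(mail, "ham")

before = [flt.score(mail) for mail in ham_mails]   # step 3

for mail in ham_mails:           # step 4: same messages, now also trained as spam
    flt.train(mail, "spam")

after = [flt.score(mail) for mail in ham_mails]    # step 5

print("before retraining:", [round(s, 1) for s in before])
print("after retraining: ", [round(s, 1) for s in after])
```

In this sketch the retrained messages still score below 50 after step 4, because their tokens now appear in every non-spam training message but in only half of the spam ones. Whether MailBee.NET behaves the same way is not confirmed here.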

Also, I wonder whether calling SaveDatabase and LoadDatabase before step 5 changes anything.

One more thing: if you keep getting a score of exactly 50, this would indicate some kind of problem with the database.
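
One way to see why a score of exactly 50 points to missing evidence: in a Graham-style Bayesian combination, per-token spam probabilities of 0.5 (or no known tokens at all) yield a neutral result. The sketch below is a generic Python illustration, not MailBee.NET's actual formula; the function name combined_score and the 0 to 100 scale are assumptions.

```python
import math


def combined_score(token_probs):
    """Combine per-token spam probabilities into a 0..100 score.

    Hypothetical helper. Assumes every probability is strictly
    between 0 and 1, so the denominator is never zero.
    """
    spam = math.prod(token_probs)                  # product of P(spam | token)
    ham = math.prod(1.0 - p for p in token_probs)  # product of P(ham | token)
    return 100.0 * spam / (spam + ham)


neutral = combined_score([0.5, 0.5, 0.5])  # tokens that carry no information
empty = combined_score([])                 # no known tokens at all
spammy = combined_score([0.9, 0.8])        # tokens seen mostly in spam
print(neutral, empty, spammy)
```

If a filter of this kind returns the neutral value for every message, a plausible reading is that the scorer found no usable token statistics, for example because the training data was not stored or loaded.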

We might be able to help you further with this if you provide us with a ZIP of the sample messages, so that we can reproduce the situation here and see whether anything goes wrong.

--
Regards,
Igor, AfterLogic Support
Loic (Newbie)
Joined: 29 November 2012
Posts: 5
Posted: 09 January 2013 at 6:19am

I think my test is irrelevant. Testing a probability-based algorithm is too random.