Penn Computing
Please note: This material is no longer current and appears online for archival purposes only.
Use the search and navigation tools above to locate more up-to-date materials, if they exist.

BogoFilter

Submitted by Don Roeber

* Vendor

Free product available from http://bogofilter.sourceforge.net/
The primary author is Eric S. Raymond. It uses the Bayesian filtering
techniques described by Paul Graham.


* Platform (Wintel, *nix, both)

UNIX. Currently tested to compile and run on Linux, FreeBSD, Solaris,
OS X, HP-UX, and AIX.

* Freeware/shareware/paid product

Free

* How does it function? (Does it rely on external blacklists a la
ORDB, manually-created black/grey/whitelists, MTA blocking, keyword
filtering, etc.)

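It builds two word databases, one from messages marked as good and one
from messages marked as bad, and classifies each incoming message by
its word content rather than by external blacklists or hand-maintained
keyword lists.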
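
As a rough illustration of the good/bad word database idea, here is a
minimal sketch in Python. It is not bogofilter's code. It loosely
follows the combining formula from Paul Graham's description and leaves
out several of his refinements. The class name WordLists, the tokenizer,
and the 0.01/0.99 clamps are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class WordLists:
    """Toy good/bad word databases (hypothetical, not bogofilter's format)."""

    def __init__(self):
        self.good = Counter()   # word -> number of good messages containing it
        self.bad = Counter()    # word -> number of bad messages containing it
        self.n_good = 0
        self.n_bad = 0

    def train(self, text, is_spam):
        """Record each distinct word of one message in the matching list."""
        words = set(tokenize(text))
        if is_spam:
            self.bad.update(words)
            self.n_bad += 1
        else:
            self.good.update(words)
            self.n_good += 1

    def word_prob(self, word):
        """Estimate how strongly a single word suggests spam (0.01 to 0.99)."""
        b = self.bad[word] / max(self.n_bad, 1)
        g = self.good[word] / max(self.n_good, 1)
        if b + g == 0:
            return 0.4  # never-seen word: mildly "good", as Graham suggests
        return min(0.99, max(0.01, b / (b + g)))

    def spam_score(self, text):
        """Combine per-word estimates into one score between 0 and 1."""
        prod_p = 1.0
        prod_q = 1.0
        for word in set(tokenize(text)):
            p = self.word_prob(word)
            prod_p *= p
            prod_q *= 1.0 - p
        return prod_p / (prod_p + prod_q)


lists = WordLists()
lists.train("cheap pills buy now", is_spam=True)
lists.train("meeting agenda for tomorrow", is_spam=False)
spam_like = lists.spam_score("buy cheap pills")
ham_like = lists.spam_score("agenda for meeting")
```

With only two training messages the scores are extreme. A real
database needs many more messages before the numbers mean much.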

* What is its administrative model? (Centrally administered with
Opt-in/Opt-out functionality, fully end-user-administered, etc.)

Fully end-user-administered.

* What options does it provide for disposition of SPAM, once it's been
identified? (Deletion, pre-pending "SPAM!" to the subject line,
generating an NDR, etc.)

It is called from within procmail to classify each message. Procmail
then decides what to do with the message based on the exit code that
bogofilter returns.
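
The decision step can be sketched in Python as a stand-in for what a
procmail recipe would do. The exit codes below (0 for spam, 1 for
non-spam, 2 for unsure, 3 for an error) are taken from the bogofilter
man page for later releases and may not match the version reviewed
here. The folder names and function names are hypothetical.

```python
import subprocess

# Exit codes as documented for bogofilter; anything else is treated as an error.
DISPOSITION = {0: "spam", 1: "ham", 2: "unsure"}


def classify_exit_code(code):
    """Translate a bogofilter exit code into a label."""
    return DISPOSITION.get(code, "error")


def choose_folder(code, spam_folder="spam", inbox="inbox"):
    """Fail open: only a definite spam verdict keeps mail out of the inbox."""
    if classify_exit_code(code) == "spam":
        return spam_folder
    return inbox


def classify_message(raw_message):
    """Pipe one raw message (bytes) to bogofilter and label the result.

    Assumes a bogofilter binary is on PATH and its word lists already exist.
    """
    result = subprocess.run(["bogofilter"], input=raw_message, check=False)
    return classify_exit_code(result.returncode)
```

Sending unsure and error results to the inbox is a deliberate choice in
this sketch, so that a filter failure never hides legitimate mail.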

* Ease of administration, server-side. (Installation, sysadmin
maintenance, system resources required, etc.)

Very easy, and seemingly low impact on system resources (it is written
in C).


* Ease of use, end-user-side. (Ease of configuration, "learning" to
recognize SPAM, etc.)

If the end user is comfortable with a UNIX shell, the basic commands
for updating the word databases are trivial.
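
For illustration, the update step can be wrapped in a few lines of
Python. This sketch assumes the -s (register as spam) and -n (register
as non-spam) options from the bogofilter man page, with the message
read from standard input. The function names are made up for this
example.

```python
import subprocess


def training_command(is_spam):
    """Build the bogofilter command line that registers one message."""
    # -s adds the message's words to the spam list, -n to the non-spam list.
    return ["bogofilter", "-s" if is_spam else "-n"]


def train_from_file(path, is_spam):
    """Feed one saved message to bogofilter and return its exit status.

    Assumes a bogofilter binary is on PATH.
    """
    with open(path, "rb") as message:
        completed = subprocess.run(
            training_command(is_spam), stdin=message, check=False
        )
    return completed.returncode
```

From a shell the same update is a single command with the message
redirected to standard input.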


* Effectiveness (false positives, misses, etc.)

Amazingly effective. In informal tests, out of 100 messages, 25 were
legitimate and were delivered to the end user, 70 were identified as
SPAM and acted upon accordingly, and 5 were SPAM that was not
identified. Those 5 were manually added to the bad wordlist database
so that similar messages are more likely to be identified in the
future.
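
The figures above can be turned into rates with a little arithmetic.
The sketch below assumes that the 5 unidentified messages were all
SPAM and that none of the 70 flagged messages was legitimate, which
the informal test implies but does not state outright. The function
name is hypothetical.

```python
def summarize(legit_delivered, spam_caught, spam_missed):
    """Compute simple rates from the informal test counts."""
    spam_total = spam_caught + spam_missed
    return {
        "total": legit_delivered + spam_total,
        "spam_total": spam_total,
        "detection_rate": spam_caught / spam_total,  # share of SPAM caught
        "miss_rate": spam_missed / spam_total,       # share of SPAM delivered
    }


stats = summarize(legit_delivered=25, spam_caught=70, spam_missed=5)
report = "caught {:.1%} of SPAM, missed {:.1%}".format(
    stats["detection_rate"], stats["miss_rate"]
)
```

Under those assumptions, 70 of 75 SPAM messages were caught, or about
93 percent. A sample of 100 messages is too small to treat this as
more than a rough indication.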

* Overall impressions & notes

Amazingly effective. It is also future-proof, because it learns from
each individual user's email usage patterns.

 



Information Systems and Computing
University of Pennsylvania