crb3's mail stuff

These are the programs I wrote to implement and administer our qmail-based local mail service.

allmailscan

allmailscan is an admin tool for seeing who's got what mail waiting for them in their Maildirs. Run from a user account, it lists that user's mail, showing From, Subject, and the timestamp part of the filename. Run as root, it scans all the Maildirs.
The first version was in Perl, but that was too slow for my tastes.
Here's the C version. It makes use of GNU features, so it'll take some work to port it to a non-GNU system. The Perl version, of course, works anywhere that Perl does.
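
For a feel of what such a scan involves, here is a minimal Python sketch. It is not the C or Perl allmailscan, just an illustration under assumptions: the function name scan_maildir and its subdirs parameter are made up for this example, and it reads only the header block of each message.

```python
from email.parser import BytesParser
from pathlib import Path


def scan_maildir(maildir, subdirs=("new", "cur")):
    """Return (timestamp, from, subject) for each message in the given subdirs."""
    rows = []
    parser = BytesParser()
    for sub in subdirs:
        folder = Path(maildir) / sub
        if not folder.is_dir():
            continue  # a missing subdirectory is simply skipped
        for path in sorted(folder.iterdir()):
            if not path.is_file():
                continue
            # Maildir filenames begin with the delivery time in seconds,
            # followed by a dot and a unique part.
            stamp = path.name.split(".", 1)[0]
            with path.open("rb") as fh:
                # headersonly=True stops parsing at the first blank line
                msg = parser.parse(fh, headersonly=True)
            rows.append((stamp,
                         str(msg.get("From", "")),
                         str(msg.get("Subject", ""))))
    return rows


if __name__ == "__main__":
    import sys
    for stamp, sender, subject in scan_maildir(sys.argv[1], subdirs=("qrm",)):
        print(stamp, sender, subject, sep="\t")
```

Scanning every user's Maildir is then a matter of looping this over the directories under the mail base, which needs read access to all of them.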

Here's a screenshot of the result of the command allmailscan -mqrm -b/var/vpopmail/users |less in an xterm, showing off all the spam just one user gets in ~/Maildir/qrm in one hour:

screenshot: allmailscan in an xterm

sprobe, qrm, etc.

Maildir with /chk and /qrm added

The next few programs form the spam-filtering solution for our subdomain's qmail server. They all work around some extra subdirectories added to the standard Maildirs, as shown here.

We use Charles Cazabon's getmail to fetch the subdomain's mail from a qmail POP box on the remote virtual host and deliver it to our local qmail server. Our getmail is modified to deposit new incoming mail in each local Maildir's /chk rather than in /new. Spam eventually ends up in /qrm (the Q-Signal 'QRM' means 'manmade or intentional interference'), where it gets aged out in a week or so, giving users time to go looking for mail that was wrongly flagged as spam before it gets deleted. (Typically, that means the admin goes looking when the user comes asking, "Have I gotten...?", which is when allmailscan comes in particularly handy.)

We use a two-stage process for spam-filtering. The first stage is a sudden-death blacklist and pattern-matching filter, qrm2, which iterates over the Maildirs, picking up global and per-user rcfiles with match-strings and regexes in them, and using those to push the most obvious spam from /chk to /qrm. It also does whitelisting, pushing those matches from /chk to /new where everyone expects new mail to go.
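
The first-stage idea can be sketched in a few lines of Python. This is not qrm2 itself: the function name first_stage is invented for the example, the patterns are passed in as plain lists rather than read from rcfiles (whose format isn't described here), and two choices are assumptions of the sketch: only the header block is matched, and the whitelist is checked before the blacklist.

```python
import re
from pathlib import Path


def first_stage(maildir, blacklist, whitelist):
    """Move whitelisted mail from chk/ to new/ and blacklisted mail to qrm/."""
    maildir = Path(maildir)
    flags = re.IGNORECASE | re.MULTILINE
    black = [re.compile(p, flags) for p in blacklist]
    white = [re.compile(p, flags) for p in whitelist]
    moved = {"new": [], "qrm": []}
    for path in sorted((maildir / "chk").iterdir()):
        if not path.is_file():
            continue
        # latin-1 maps every byte to a character, so decoding never fails
        text = path.read_text(encoding="latin-1")
        headers = text.split("\n\n", 1)[0]
        if any(rx.search(headers) for rx in white):
            dest = "new"          # assumption: whitelist wins over blacklist
        elif any(rx.search(headers) for rx in black):
            dest = "qrm"
        else:
            continue              # undecided: stays in chk/ for the next stage
        (maildir / dest).mkdir(exist_ok=True)
        path.rename(maildir / dest / path.name)
        moved[dest].append(path.name)
    return moved
```

Anything that matches neither list is left in /chk, which is what lets a second, slower filter pick up where this one stops.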

The second stage uses a downloaded Bayesian filter, SpamProbe, to filter out the garbage. SpamProbe itself is a tester, not a mail-passer, so for Maildir systems it needs a wrapper script, sprobe, to move the spam to /qrm and the good stuff to /new. sprobe is in Perl, but it iterates across all the Maildirs, so it only has a startup delay at the beginning of the sweep, not per-mail; SpamProbe itself is written in C++, so its only startup delay is the lengthy compile at installation. (I elected not to use SpamAssassin because I wanted to avoid both that per-mail Perl startup delay and the usual workaround for it, a Perl daemon sitting in scarce server memory.)
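
The shape of such a wrapper, one interpreter start followed by a sweep over every Maildir, might look like the Python sketch below. It is not sprobe: the name sweep, the <base>/<user>/Maildir layout, and the is_spam callable are all assumptions of the example, and the actual call out to SpamProbe is deliberately left to whatever is_spam does.

```python
from pathlib import Path


def sweep(base, is_spam):
    """File every message in <base>/<user>/Maildir/chk into qrm/ or new/.

    is_spam is any callable taking the raw message bytes and returning
    True for spam; in a real setup it would hand the message to the tester.
    """
    counts = {"new": 0, "qrm": 0}
    for chk in sorted(Path(base).glob("*/Maildir/chk")):
        maildir = chk.parent
        for path in sorted(chk.iterdir()):
            if not path.is_file():
                continue
            dest = "qrm" if is_spam(path.read_bytes()) else "new"
            (maildir / dest).mkdir(exist_ok=True)
            path.rename(maildir / dest / path.name)
            counts[dest] += 1
    return counts
```

Because the loop runs inside one process, the interpreter's startup cost is paid once per sweep rather than once per message.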

As with any Bayesian filter, some mistakes will creep through as the spammers probe the boundaries of the filters. dqrm moves a piece of mail between /qrm and /new, from where it was to where it wasn't, and also copies it into a special user's Maildir (sp2m) used for teaching SpamProbe about its mistakes. I usually run dqrm on tagged files in ytree just so I don't have to type in the whole filename.
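
The toggle-and-copy step could be sketched like this in Python. It is an illustration, not dqrm: the function name toggle is invented, and filing the teaching copy under a subdirectory named after the corrected class is an assumption of the sketch.

```python
import shutil
from pathlib import Path


def toggle(message_path, teach_maildir):
    """Flip a message between qrm/ and new/, keeping a copy for retraining."""
    src = Path(message_path)
    folder = src.parent.name
    if folder not in ("qrm", "new"):
        raise ValueError("message must live in qrm/ or new/")
    other = "new" if folder == "qrm" else "qrm"

    # Assumption: the teaching copy goes under the corrected class name.
    teach_dir = Path(teach_maildir) / other
    teach_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, teach_dir / src.name)

    # Move the original to the sibling subdirectory of the same Maildir.
    dest_dir = src.parent.parent / other
    dest_dir.mkdir(exist_ok=True)
    dest = dest_dir / src.name
    src.rename(dest)
    return dest
```

Copying before moving means a failure partway through leaves the original message where it was.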

Periodically, cron invokes sprobe_learn to go over the accumulated wrongly-judged emails (in /home/sp2m/Maildir/*) with SpamProbe and rub its nose in them so it won't make that mistake again.

Maildir lemonade

After all the spam is filtered out from your mail system, what do you do with it? There's an awful lot of it, after all, and more coming in all the time. Bulk post mail, you can at least put through the shredder and use as packing material, but spam's a bit thin for that.
Here are two ha-ha-only-serious uses for the stuff.

spam2wav

Strictly for fun (as in, being adventurous or being annoying, your choice), this is a Perl script to convert spam to audio noise by turning it into a WAV file. Actually, any text file, the longer the better, will do as input, so cat together an entire /qrm dir's contents and feed it to this script: the result is almost guaranteed not to be melodic.
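
One simple way to get noise out of text, sketched here in Python rather than Perl, is to treat each input byte as one 8-bit audio sample. That mapping, the function name text_to_wav, and the 8000 Hz default are assumptions of this example; the actual script may do something quite different.

```python
import wave


def text_to_wav(data, out_path, sample_rate=8000):
    """Write the bytes in data as unsigned 8-bit mono PCM; return seconds."""
    with wave.open(str(out_path), "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(1)           # one byte per sample
        wav.setframerate(sample_rate)
        wav.writeframes(data)         # each input byte becomes one sample
    return len(data) / sample_rate


if __name__ == "__main__":
    import sys
    # Read the concatenated spam from stdin, write the WAV named on the
    # command line.
    seconds = text_to_wav(sys.stdin.buffer.read(), sys.argv[1])
    print(f"wrote {seconds:.1f} seconds of noise")
```

At one byte per sample, a longer input simply plays for longer, which is why a whole directory's worth of spam makes the better source.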

fnames

Getting tired of having to invent new names for your fictional characters, especially the throwaways, the 'Redshirts' (to use a Trekkie term)? Thanks to those lovely people, the spammers, help is at hand.

If you've got spam filtered out into its own Maildir (/qrt, in my case, where all spams end up after they've been bitched), you can run this script (which in turn runs allmailscan), to harvest all those lovely normal-sounding names which the spammers invent to make you believe that their spew actually comes from a real person and not from a parasitic life form lower than pond-scum.

Those invented names were sent to you unsolicited and under false pretenses, so you don't have to send them back (as if you could); you're free to do with them what you will. This script harvests them from the emitted output of allmailscan. Once in hand, you can have endless fun devising new cruel and unusual punishments for the spammers they represent.

There are two scripts here, both intended to be cron-driven. One, fnames, collects the day's names into a date-named file. The other script, catnames, can be run by cron or at leisure: it collects and dedupes all those entries in all those files into one big file of donated names. They're both in the tarball; be sure to pick up a copy of dedupe too.
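
The harvest-and-dedupe idea can be sketched in Python as follows. This is not fnames or catnames: the function name harvest_names is invented, the input is assumed to be a list of From header values (the real scripts parse allmailscan's output instead), and the "at least two alphabetic words" test for a plausible name is a guess at a reasonable filter.

```python
from email.utils import parseaddr


def harvest_names(from_headers):
    """Return the sorted, de-duplicated display names that look like people."""
    names = set()
    for header in from_headers:
        display, _address = parseaddr(header)
        parts = display.split()
        # Assumption: a usable name is two or more purely alphabetic words.
        if len(parts) >= 2 and all(part.isalpha() for part in parts):
            names.add(" ".join(parts))
    return sorted(names)


if __name__ == "__main__":
    sample = [
        '"Brenda Kowalski" <bk@example.com>',
        "Brenda Kowalski <other@example.net>",
        "Viagra4U <deals@example.org>",
        "noreply@example.com",
    ]
    for name in harvest_names(sample):
        print(name)
```

Using a set gives the de-duplication for free; merging several days' files is then just a union of their sets.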

