smtp + sql = more than it seems so (part 7)

in the previous part of this howto we set up an autoresponder/vacation system.

i also promised to show how to filter users' mail.

first, let's assume we want exim to automatically store spam messages in some kind of spam folder, for the people who want it.

this means we have to do 2 things:

introduce some kind of spam protection
allow our system to store per-user message filters

the first thing, spam protection, will be done using standard spamassassin. so let's get it:

apt-get install spamassassin dcc-client pyzor razor libio-string-perl libio-socket-ssl-perl libnet-ident-perl libdbi-perl libmail-dkim-perl libmailtools-perl libhtml-format-perl spamc re2c libsys-syslog-perl gcc libc6-dev make

why so many packages? well, i did: apt-get install spamassassin, and then added all recommended and suggested packages 🙂

some of these packages are not strictly needed, but i'm not going to discuss that now.

then, in /etc/default/spamassassin i change "ENABLED=0" to "ENABLED=1".

after this i can:

/etc/init.d/spamassassin start

and verify that it's really working:

=> ps uw -C spamd
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     13227  2.4  1.5  35140 31628 ?        Ss   20:34   0:01 /usr/sbin/spamd --create-prefs --max-children 5 --helper-home-dir -d --pidfile=/var/run/spamd.pid
root     13231  0.0  1.4  35140 29728 ?        S    20:34   0:00 spamd child
root     13232  0.0  1.4  35140 29632 ?        S    20:34   0:00 spamd child
=> netstat -ntlp | grep spamd
tcp        0      0 127.0.0.1:783           0.0.0.0:*               LISTEN     13227/spamd.pid

ok, looks like it's working. now let's add the exim configuration for it.

first, just before "begin acl", we have to tell exim where spamd is:

spamd_address = 127.0.0.1 783

then we have to find the "acl_check_data" acl and add this config, just before the final "accept":

# always add the score header (only messages under 200k are scanned)
warn message = X-Spam-Score: $spam_score
    log_message = SPAM-score: $spam_score_int
    condition = ${if <{$message_size}{200k}{1}{0}}
    spam = nobody:true

# $spam_score_int is the score multiplied by 10, so 45 means 4.5 points
warn message = X-Spam-Report: $spam_report
    condition = ${if and {{>{$spam_score_int}{45}}{<{$message_size}{200k}}}{1}{0}}
    spam = nobody:true

warn message = X-Spam: Yes
    condition = ${if and {{>{$spam_score_int}{45}}{<{$message_size}{200k}}}{1}{0}}
    spam = nobody:true
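
to make the numbers concrete: $spam_score_int is the spamassassin score multiplied by 10, so the 45 in the conditions means 4.5 points. below is a small python sketch (not exim code) of what the three warn blocks do together. the function name spam_headers and the placeholder report text are made up for illustration, and "200k" is assumed to mean 200 * 1024 bytes.

```python
SIZE_LIMIT = 200 * 1024   # "200k" in the exim condition (assumed: 200 * 1024 bytes)
SPAM_THRESHOLD_INT = 45   # 4.5 points, expressed in $spam_score_int units


def spam_headers(spam_score_int, message_size, report="(report text)"):
    """Return the headers the three warn blocks would add (hypothetical helper)."""
    headers = []
    if message_size >= SIZE_LIMIT:
        # big messages fail the size condition in every block, so nothing is added
        return headers
    # first warn block: always add the score, e.g. 16 -> "1.6"
    headers.append(f"X-Spam-Score: {spam_score_int / 10:.1f}")
    if spam_score_int > SPAM_THRESHOLD_INT:
        # second and third warn blocks: only above the threshold
        headers.append(f"X-Spam-Report: {report}")
        headers.append("X-Spam: Yes")
    return headers


if __name__ == "__main__":
    print(spam_headers(16, 265))    # score and size of an innocent test mail
    print(spam_headers(244, 3787))  # score and size of a spam test mail
```

note that the comparison is "greater than", so a mail with exactly 4.5 points gets only the score header.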

so, now let's restart exim and check if our spam assassin works.

first let's send some innocent mail:

=> telnet 192.168.0.101 smtp
Trying 192.168.0.101...
Connected to 192.168.0.101.
Escape character is '^]'.
220 localhost ESMTP Exim 4.67 Sat, 01 Mar 2008 20:41:28 +0000
EHLO x
250-localhost Hello x [192.168.0.101]
250-SIZE 52428800
250-PIPELINING
250-AUTH LOGIN
250 HELP
MAIL FROM: <test@exim.depesz>
250 OK
RCPT TO: <depesz@exim.depesz>
250 Accepted
DATA
354 Enter message, ending with "." on a line by itself
Subject: this is innocent mail
 
some content. not much.
.
250 OK id=1JVYWY-0003Yj-1d
quit
221 localhost closing connection
Connection closed by foreign host.

ok. now, let's look at the logs:

2008-03-01 20:41:45 1JVYWY-0003Yj-1d H=(x) [192.168.0.101] Warning: SPAM-score: 16
2008-03-01 20:41:45 1JVYWY-0003Yj-1d <= test@exim.depesz H=(x) [192.168.0.101] P=esmtp S=265
2008-03-01 20:41:45 1JVYWY-0003Yj-1d => depesz <depesz@exim.depesz> R=pg_user T=pg_delivery
2008-03-01 20:41:45 1JVYWY-0003Yj-1d Completed

and how does the mail look in the inbox?

=> cat /mails/exim.depesz/depesz/maildir/new/1204404105.H875282P13698.localhost
Received: from [192.168.0.101] (helo=x)
        by localhost with esmtp (Exim 4.67)
        (envelope-from <test@exim.depesz>)
        id 1JVYWY-0003Yj-1d
        for depesz@exim.depesz; Sat, 01 Mar 2008 20:41:45 +0000
Subject: this is innocent mail
X-Spam-Score: 1.6
 
some content. not much.

looks ok.

now, let's copy/paste some nasty spam.

because of the length of that mail i will not show it here. sorry 🙂

logs:

2008-03-01 20:46:21 1JVYaJ-0003aa-Rs H=(x) [192.168.0.101] Warning: SPAM-score: 244
2008-03-01 20:46:21 1JVYaJ-0003aa-Rs <= test@exim.depesz H=(x) [192.168.0.101] P=esmtp S=3787 id=251101c073b4$393a6650$3a67573a@Abraham
2008-03-01 20:46:21 1JVYaJ-0003aa-Rs => depesz <depesz@exim.depesz> R=pg_user T=pg_delivery
2008-03-01 20:46:21 1JVYaJ-0003aa-Rs Completed

and how do the headers of the mail look? let's skip the unimportant parts and show just the spam-related headers:

X-Spam-Score: 24.4
X-Spam-Report: Spam detection software, running on the system "xxx", has
        identified this incoming email as possible spam.  The original message
        has been attached to this so you can view it (if it isn't spam) or label
        similar future email.  If you have any questions, see
        the administrator of that system for details.
        Content preview:  Grow that manhood in your pants today! http://cratuelis.com/
        Grow that manhood in your pants today! [...]
        Content analysis details:   (24.4 points, 5.0 required)
        pts rule name              description
        ---- ---------------------- --------------------------------------------------
        1.4 NO_DNS_FOR_FROM        DNS: Envelope sender has no MX or A DNS records
        2.3 DATE_IN_PAST_96_XX     Date: is 96 hours or more before Received: date
        0.0 HTML_MESSAGE           BODY: HTML included in message
        1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
        above 50%
        [cf: 100]
        1.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level
        above 50%
        [cf: 100]
        0.5 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
        0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
        [cf: 100]
        2.0 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
        [URIs: cratuelis.com]
        1.6 URIBL_AB_SURBL         Contains an URL listed in the AB SURBL blocklist
        [URIs: cratuelis.com]
        2.1 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
        [URIs: cratuelis.com]
        2.9 URIBL_JP_SURBL         Contains an URL listed in the JP SURBL blocklist
        [URIs: cratuelis.com]
        2.1 URIBL_OB_SURBL         Contains an URL listed in the OB SURBL blocklist
        [URIs: cratuelis.com]
        2.5 URIBL_SC_SURBL         Contains an URL listed in the SC SURBL blocklist
        [URIs: cratuelis.com]
        0.9 URIBL_RHS_DOB          Contains an URI of a new domain (Day Old Bread)
        [URIs: cratuelis.com]
        2.5 URIBL_SBL              Contains an URL listed in the SBL blocklist
        [URIs: cratuelis.com]
        0.1 RDNS_NONE              Delivered to trusted network by a host with no rDNS
X-Spam: Yes

the interesting part is the "X-Spam: Yes" header. it was added (just like X-Spam-Report) by our exim, because the score is above the threshold from our acl.

now, we'd like to make all such mails go directly to the spam folder.

first we will need a place to store filters:

ALTER TABLE accounts ADD COLUMN filter TEXT;

then, let's set a filter for our test user:

update accounts set filter = '#   Exim filter   <<== do not edit or remove this line!
 
if
        $header_x-spam: contains "Yes"
then
        save $home/maildir/.SPAM/
        finish
endif
' where id = 1;

if you don't know why there is this '# Exim filter' thing, please consult the manual.
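
in short, exim looks at the first line of the redirect data: if it starts with '# Exim filter', the rest is treated as a filter and not as a list of forwarding addresses. the python sketch below is only a simplified model of that check and of the single rule in our filter. the function names is_exim_filter and spam_folder_for are made up for this example, and the real parser in exim handles much more than this.

```python
def is_exim_filter(text):
    """True if the first line starts with the '# Exim filter' marker (simplified check)."""
    first_line = text.lstrip().split("\n", 1)[0]
    # collapse runs of spaces and ignore case before comparing
    normalized = " ".join(first_line.lower().split())
    return normalized.startswith("# exim filter")


def spam_folder_for(headers, home):
    """Model of our rule: return the .SPAM/ path when X-Spam contains 'Yes'.

    headers is a list of (name, value) pairs. None means "no filter action",
    i.e. the mail falls through to normal delivery.
    """
    for name, value in headers:
        # the test ignores case, which as far as i know is how a lowercase
        # "contains" behaves in exim filters
        if name.lower() == "x-spam" and "yes" in value.lower():
            return f"{home}/maildir/.SPAM/"
    return None


if __name__ == "__main__":
    marker = "#   Exim filter   <<== do not edit or remove this line!\n"
    print(is_exim_filter(marker))
    print(spam_folder_for([("X-Spam", "Yes")], "/mails/exim.depesz/depesz"))
```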

one more thing. you might wonder why the spam folder is named '.SPAM' (with a dot). this is not part of imap itself; it is the convention of courier's extended maildir format, where every subfolder is a directory whose name starts with ".".
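
the naming rule from the courier maildir page (linked in the comments under this post) can be sketched in a few lines of python. the function name maildirpp_dirname is hypothetical, and the sketch only covers the simple case shown in the comments: the folder path below the inbox, joined with dots, with a leading dot.

```python
def maildirpp_dirname(folder_path):
    """Map a folder path like 'Inbox/Spam/Good' to a maildir subdirectory name."""
    parts = [part for part in folder_path.split("/") if part]
    if parts and parts[0].lower() == "inbox":
        # the inbox itself is the top-level maildir, so it adds nothing to the name
        parts = parts[1:]
    return "".join("." + part for part in parts)


if __name__ == "__main__":
    for path in ("Inbox/Spam", "Inbox/Spam/Good", "Inbox/Spam/Bad"):
        print(path, "->", maildirpp_dirname(path))
```

so nested folders do not become nested directories: they all sit next to each other inside the main maildir.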

now to the exim configuration.

first, we need to find the original userforward router. it looks like this:

userforward:
  debug_print = "R: userforward for $local_part@$domain"
  driver = redirect
  domains = +local_domains
  check_local_user
  file = $home/.forward
  require_files = $local_part:$home/.forward
  no_verify
  no_expn
  check_ancestor
  allow_filter
  forbid_smtp_code = true
  directory_transport = address_directory
  file_transport = address_file
  pipe_transport = address_pipe
  reply_transport = address_reply
  skip_syntax_errors
  syntax_errors_to = real-$local_part@$domain
  syntax_errors_text = \
    This is an automatically generated message. An error has\n\
    been found in your .forward file. Details of the error are\n\
    reported below. While this error persists, you will receive\n\
    a copy of this message for every message that is addressed\n\
    to you. If your .forward file is a filter file, or if it is\n\
    a non-filter file containing no valid forwarding addresses,\n\
    a copy of each incoming message will be put in your normal\n\
    mailbox. If a non-filter file contains at least one valid\n\
    forwarding address, forwarding to the valid addresses will\n\
    happen, and those will be the only deliveries that occur.

the whole router can be removed, or modified to look like this:

pg_userforward:
  debug_print = "R: pg_userforward for $local_part@$domain"
  condition = ${lookup pgsql {SELECT get_account_homedir('${local_part}', '${domain}')}}
  router_home_directory=${lookup pgsql {SELECT get_account_homedir('${local_part}', '${domain}')}}
  driver = redirect
  data = ${lookup pgsql{ SELECT get_account_filter('${local_part}', '${domain}') }}
  no_verify
  no_expn
  check_ancestor
  user      = ${lookup pgsql{SELECT get_account_uid('${local_part}', '${domain}')}}
  group     = ${lookup pgsql{SELECT get_account_gid('${local_part}', '${domain}')}}
  allow_filter
  directory_transport = address_directory
  file_transport = address_file
  pipe_transport = address_pipe
  reply_transport = address_reply
  skip_syntax_errors
  syntax_errors_to = real-$local_part@$domain
  syntax_errors_text = \
    This is an automatically generated message. An error has\n\
    been found in your .forward file. Details of the error are\n\
    reported below. While this error persists, you will receive\n\
    a copy of this message for every message that is addressed\n\
    to you. If your .forward file is a filter file, or if it is\n\
    a non-filter file containing no valid forwarding addresses,\n\
    a copy of each incoming message will be put in your normal\n\
    mailbox. If a non-filter file contains at least one valid\n\
    forwarding address, forwarding to the valid addresses will\n\
    happen, and those will be the only deliveries that occur.

as you probably noticed, i used the function get_account_filter(), which we don't have yet. luckily, writing it is simple:

CREATE OR REPLACE FUNCTION get_account_filter(in_username TEXT, in_domain TEXT) RETURNS TEXT AS $BODY$
DECLARE
    -- normalize input: lowercase, strip surrounding spaces
    use_username TEXT := trim(both FROM lower(in_username));
    use_domain   TEXT := trim(both FROM lower(in_domain));
    temptext     TEXT;
BEGIN
    SELECT a.filter INTO temptext
      FROM accounts a
      JOIN domains d ON a.domain_id = d.id
     WHERE d.fullname = use_domain
       AND a.username = use_username;
    -- no such account: return NULL
    IF NOT FOUND THEN
        RETURN NULL;
    END IF;
    RETURN temptext;
END;
$BODY$ LANGUAGE plpgsql;
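
to see the lookup logic in isolation, here is a small python sketch that runs the same query against an in-memory sqlite database. this is not how exim or postgresql do it; sqlite3, the make_test_db helper and the two-table schema (only the columns used in the query) are assumptions made so that the example is self-contained.

```python
import sqlite3


def get_account_filter(conn, in_username, in_domain):
    """Python model of the plpgsql function: return the filter text or None."""
    # same normalization as the plpgsql version: lowercase, strip spaces
    use_username = in_username.lower().strip(" ")
    use_domain = in_domain.lower().strip(" ")
    row = conn.execute(
        "SELECT a.filter FROM accounts a JOIN domains d ON a.domain_id = d.id "
        "WHERE d.fullname = ? AND a.username = ?",
        (use_domain, use_username),
    ).fetchone()
    return row[0] if row is not None else None


def make_test_db():
    """Build a throwaway database with one domain and one account (test data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE domains (id INTEGER PRIMARY KEY, fullname TEXT);
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, domain_id INTEGER,
                               username TEXT, filter TEXT);
        INSERT INTO domains VALUES (1, 'exim.depesz');
        INSERT INTO accounts VALUES (1, 1, 'depesz', '# Exim filter');
        """
    )
    return conn


if __name__ == "__main__":
    db = make_test_db()
    print(get_account_filter(db, " Depesz ", "EXIM.DEPESZ"))
    print(get_account_filter(db, "nobody", "exim.depesz"))
```

the point is only that case and surrounding spaces in the address do not matter, and that an unknown account gives no filter at all.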

so, having it all ready, let's check it.

=> /etc/init.d/exim4 restart
 * Stopping MTA for restart           [ OK ]
 * Restarting MTA                        [ OK ]
=> find /mails/ -type f  -exec rm {} \;

(the second command makes sure that no mails are left over from previous tests.)

so, let's deliver an innocent mail:

2008-03-01 21:10:24 1JVYyF-0003oB-1B H=(x) [192.168.0.101] Warning: SPAM-score: 16
2008-03-01 21:10:24 1JVYyF-0003oB-1B <= test@exim.depesz H=(x) [192.168.0.101] P=esmtp S=265
2008-03-01 21:10:25 1JVYyF-0003oB-1B => depesz <depesz@exim.depesz> R=pg_user T=pg_delivery
2008-03-01 21:10:25 1JVYyF-0003oB-1B Completed

and the file got saved here:

=> find /mails/ -type f
/mails/exim.depesz/depesz/maildir/new/1204405825.H121338P14668.localhost

which looks fine.

now, i removed the file, and delivered a spam message.

what do the logs look like now?

2008-03-01 21:11:49 1JVYzZ-0003p9-Rv H=(x) [192.168.0.101] Warning: SPAM-score: 244
2008-03-01 21:11:49 1JVYzZ-0003p9-Rv <= test@exim.depesz H=(x) [192.168.0.101] P=esmtp S=3787 id=251101c073b4$393a6650$3a67573a@Abraham
2008-03-01 21:11:50 1JVYzZ-0003p9-Rv => /mails/exim.depesz/depesz/maildir/.SPAM/ <depesz@exim.depesz> R=pg_userforward T=address_directory
2008-03-01 21:11:50 1JVYzZ-0003p9-Rv Completed

yeah!

let's check it again at the filesystem level, just to be sure:

=> find /mails/ -type f
/mails/exim.depesz/depesz/maildir/.SPAM/new/1204405910.H37191P14715.localhost

great.

it should also be noted that exim created the .SPAM directory itself (it wasn't there before).
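
the log line above shows the address_directory transport doing the delivery, and the file ended up in .SPAM/new/. roughly, a maildir delivery means: make sure tmp/, new/ and cur/ exist, write the message into tmp/, then move it into new/. the python sketch below models just that. it is not exim's implementation; the function name deliver_to_maildir is hypothetical and the file name only imitates the pattern seen in the listings above.

```python
import os
import tempfile
import time


def deliver_to_maildir(folder, message, hostname="localhost"):
    """Write message into a maildir folder, creating the folder if needed."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(folder, sub), exist_ok=True)
    now = time.time()
    # imitation of the names seen above: seconds, microseconds, pid, host
    name = f"{int(now)}.H{int((now % 1) * 1_000_000)}P{os.getpid()}.{hostname}"
    tmp_path = os.path.join(folder, "tmp", name)
    with open(tmp_path, "w") as handle:
        handle.write(message)
    # the message becomes visible to mail readers only after this rename
    new_path = os.path.join(folder, "new", name)
    os.rename(tmp_path, new_path)
    return new_path


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    print(deliver_to_maildir(os.path.join(base, ".SPAM"), "Subject: test\n\nbody\n"))
```

writing to tmp/ first means that a reader looking at new/ never sees a half-written message.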

cool 🙂

as always, you can get the current versions of the exim4.conf.template and config.autogenerated files.

in the next part of this tutorial i will show you how to add another kind of filtering.

8 thoughts on “smtp + sql = more than it seems so (part 7)”

Robert Kruus says:

2008-03-12 at 07:03

The Spam directory is in maildir format (with the cur,new and tmp directories I assume) and it is following the standard naming convention for maildirs.

depesz says:

2008-03-12 at 10:20

@Robert Kruus:
sorry but i dont understand your comment.

are you saying that i should name the folder Spam and not SPAM?

Robert Kruus says:

2008-03-12 at 17:37

It doesn’t matter what you call it.
Like you said, the dot is a delimiter for subfolders on IMAP servers that use the courier extended maildir format.

Hierarchy           Folder name
Inbox/Spam      --> .Spam
Inbox/Spam/Good --> .Spam.Good
Inbox/Spam/Bad  --> .Spam.Bad

http://www.courier-mta.org/maildir.html

depesz says:

2008-03-12 at 17:41

@Robert Kruus:
sorry but i still dont understand the point of your comment.
you said:
“The Spam directory is in maildir format (with the cur,new and tmp directories I assume) and it is following the standard naming convention for maildirs.”

now. this is of course true, but this comes directly from the blogpost as well – there is no “added” information in here. or am i missing something?

Theory says:

2008-03-13 at 02:38

depesz,

I think that Robert is saying, in answer to your question in the post, that the dot in “.SPAM” is a maildir convention.

-- Theory

depesz says:

2008-03-13 at 11:14

Theory, ah, you’re right.

Robert, sorry – apparently my brain was not fully functional yesterday.

thanks.

Robert Kruus says:

2008-03-13 at 20:42

I think my brain was not fully functional when I posted as well.