Procmail Recipes
1.0 Document id
    1.1 General
    1.2 What is Procmail?
    1.3 Abbreviations and thanks
    1.4 Version information
    1.5 Document layout and maintenance
    1.6 About presented recipes
    1.7 Variables used in recipes
    1.8 About "useless use of cat award"

2.0 Procmail pointers
    2.1 Where is procmail developed
    2.2 Procmail resources
    2.3 Procmail mode for Emacs
    2.4 Procmail module library project
    2.5 Procmail code to filter UBE

3.0 Dry run testing
    3.1 What is dry run testing?
    3.2 Why the From field is not okay after dry run?
    3.3 Getting default value of a procmail variable

4.0 Things to remember
    4.1 Get the newest procmail
    4.2 Csh's tilde is not supported
    4.3 Be sure to write the recipe starting right
    4.4 Always set SHELL
    4.5 Check and set PATH
    4.6 Keep the log on all the time
    4.7 Never add a trailing slash for directories
    4.8 Remember what term DELIVERED means
    4.9 Beware putting comment in wrong places
    4.10 Brace placement
    4.11 Local lockfile usage
    4.12 Global lockfile
    4.13 Gee, where do I put all those ! * $ ??
    4.14 If you Send an automatic reply, use X-loop header
    4.15 Avoid extra shell layer and check command for SHELLMETAS
    4.16 Think what shell commands you use
    4.17 Using absolute paths when calling a shell program
    4.18 Disabling a recipe temporarily
    4.19 Keep message backup, no matter what
    4.20 Order of the procmail recipes

5.0 Procmail flags
    5.1 The order of the flags
    5.2 Flags HB at top of recipe (warning)
    5.3 Flag w and recipe with pipe(|)
    5.4 Flag w, lock file and recipe with pipe(|)
    5.5 Flag f and w together
    5.6 Flags h and b
    5.7 Flag h and sinking to /dev/null
    5.8 Flag i and pipe flag f
    5.9 Flag r
    5.10 Flag c's background
    5.11 Flag c before nested block forks a child
    5.12 Flag c and understanding possible forking penalty
    5.13 Flags before nested block
    5.14 Flags aAeE tutorial

6.0 Matching and regexps (regular expressions)
    6.1 Philosophy of abstraction in regexps
    6.2 Matches are not case-sensitive
    6.3 Procmail uses multi line matches
    6.4 Headers are unfolded before matching
    6.5 Improving Space-Tab syndrome
    6.6 Handling exclamation character
    6.7 Rules for generating a character class
    6.8 Matching space at the end of condition
    6.9 Beware leading backslash
    6.10 Correct use of TO Macro
    6.11 Procmail's regexp engine
    6.12 Procmail and egrep differences
    6.13 Understanding procmail's minimal matching (stingy vs. greedy)
    6.14 Explaining \/ and ()\/
    6.15 Explaining ^^ and ^
    6.16 ANDing traditionally
    6.17 ORing traditionally
    6.18 ORing and score recipe
    6.19 ORing by using De Morgan rules

7.0 Variables
    7.1 Setting and unsetting variables
    7.2 Variable initialization and sh syntax
    7.3 Testing variables
    7.4 What does $\VAR mean?
    7.5 Common pitfalls when using variables
    7.6 Quoting: Using single or double quotes
    7.7 Quoting: Passing values to an external program
    7.8 Passing values from an external program
    7.9 Incrementing a variable by a value N
    7.10 Comparing values
    7.11 Strings: How many characters are there in a given string?
    7.12 Strings: How to strip trailing newline.
    7.13 Strings: deriving the last N characters of a string.
    7.14 Strings: Getting partial matches from a string.
    7.15 Strings: Procmail string manipulation example
    7.16 How to raise a flag if the message was filed
    7.17 Dollar sign in condition lines.
    7.18 Finding mysterious foo variable
    7.19 Storing code to variable
    7.20 Getting headers into a variable
    7.21 Converting value to lowercase

8.0 Suggestions and miscellaneous
    8.1 Speeding up procmail
    8.2 See the procmail installation's examples
    8.3 Printing statistics of your incoming mail
    8.4 Storing UBE mailboxes outside of quota
    8.5 Using first 5-30 lines from the message
    8.6 Using cat or echo in scripts?
    8.7 How to run an extra shell command as a side effect?
    8.8 Forcing "ok" return status from shell script
    8.9 Using grep with file lists to mach messages
    9.1 Keep simple header log
    9.2 Gzipping messages
    9.3 Emergency stop for your .procmailrc

10.0 Scoring
    10.1 Using scores by an example
    10.2 Brief Score tutorial
    10.3 Score's scope
    10.4 Counting length of a string
    10.5 Counting lines in a message (Adding Lines: header)
    10.6 Determining if body is longer than header
    10.7 Matching last Received header
    10.8 Testing value range with scoring (bogofilter)
    10.9 How to add Content-Length header
    10.10 Testing message size or number of lines
    10.11 Counting commas with recursive includerc

11.0 Formail usage
    11.1 Fetching fields with formail -x
    11.2 Always use formail's -r switch
    11.3 Rewriting the From address
    11.4 Formail -r and Resent-From header
    11.5 Quoting the message
    11.6 Without quoting the message
    11.7 How to include headers and body to the reply message
    11.8 Adding text to the beginning of message
    11.9 Adding text to the end of message
    11.10 Adding text before quoted message
    11.11 How to truncate headers (save filing space)
    11.12 Adding extra headers from file
    11.13 Splitting digest
    11.14 Mailbox: Splitting to individual files
    11.15 Mailbox: Extracting all From addresses from mailbox
    11.16 Mailbox: Applying procmail recipe on whole mailbox
    11.17 Mailbox: run series of commands for each mail (split mailbox)
    11.18 Option -D and cache
    11.19 Option -D and message-id in the body
    11.20 Reducing formail calls (conditionally adding fields)
    11.21 Formail -A -a options
    11.22 Formail -e -s options

12.0 Saving mailing list messages
    12.1 Using subroutine pm-jalist.rc to detect mailing lists
    12.2 Using plus addressing [email protected]
    12.3 Using RFC comment trick for additional information
    12.4 Simple mailing list handling
    12.5 Archiving according to TO
    12.6 Using Return-Path to detect mailing lists

13.0 Procmail, MIME and HTML
    13.1 Mime content type application/ms-tnef
    13.2 Trapping HTML mime messages
    13.3 Complaining about HTML messages
    13.4 Converting HTML body to plain text
    13.5 Getting rid of unwanted mime attachments (HTML, vcard)
    13.6 Sending contents of a HTML page in plain text to someone

14.0 Simple recipe examples
    14.1 Saving: MH folders -- numbered messages
    14.2 Saving: to monthly folders
    14.3 Modifying: Filtering basics
    14.4 Modifying: Squeezing empty lines around message body
    14.5 Modifying: shuffling headers always to same order
    14.6 Service: Auto answerer to empty messages
    14.7 Service: Ping responder
    14.8 Service: simple vacation with procmail
    14.9 Service: vacation code example
    14.10 Service: Auto-forwarding
    14.11 Service: forward only specific messages
    14.12 Service: Making digests
    14.13 Kill: killing advertisement headers and footers
    14.14 Kill: simple kill file recipe with procmail
    14.15 Kill: duplicate messages
    14.16 Kill: spam filter with simple recipes
    14.17 Kill: (un)subscribe messages
    14.18 Time: Once a day cron-like job
    14.19 Time: Running a recipe at a given time
    14.20 Time: Triggering mail and using cron
    14.21 Decoding: Uudecode
    14.22 Decoding: MIME
    14.23 How to send commands in the message's body
    14.24 Matching two words on a line, but not one
    14.25 How to define personal XX macros?
    14.26 How to change subject by body match
    14.27 How to change Subject according to some other header
    14.28 How to call program with parameters

15.0 Miscellaneous recipes
    15.1 Matching valid Message-Id header
    15.2 Sending two files in a message
    15.3 Excessive quoting of message
    15.4 Sending message to pager in chunks
    15.5 Playing particular sound when message arrives
    15.6 Combining multiple Original-Cc and Original-To headers
    15.7 Forwarding sensitive messages in encrypted format

16.0 Procmail and PGP
    16.1 Decrypt pgp messages automatically
    16.2 Getkeys from key server
    16.3 Auto grab incoming pgp keys

17.0 Includerc usage
    17.1 Using: multiple rc files
    17.2 Using: call rc file conditionally
    17.3 Using: autoloading an rc file
    17.4 Making: naming of the rc file
    17.5 Making: Using name space when saving procmail variables
    17.6 Making: Public and private variables in rc file
    17.7 The rules of thumb for constructing general purpose rc file
    17.8 An includerc skeleton

18.0 Mailing list server

19.0 Common troubles
    19.1 Procmail modes: normal, delivery, and mail filter.
    19.2 Procmail as sendmail Mlocal mail filtering device
    19.3 Procmail doesn't pass 8bit characters
    19.4 My ISP isn't very interested in installing procmail
    19.5 My ISP has systemwide procmailrc; is this a good idea?
    19.6 Procmail changes mailbox and directory permissions
    19.7 Changing mbox permission during compilation to 660
    19.8 The .forward file must be real file
    19.9 Using .forward if procmail already is LDA
    19.10 Mail should be put in the mailqueue if write fails
    19.11 Qmail: how to make it work with procmail
    19.12 Qmail: Procmail looks file from /var/spool/mail only
    19.13 Qmail: patch to procmail 3.11pre7 to work with Maildirs
    19.14 AFS: How to use Procmail when HOME is in AFS cell
    19.15 Help, some idiot sent my address to 30 mailing lists
    19.16 Help, Procmail beeps and prints to my console
    19.17 Help, procmail dumps mail to console
    19.18 Help, corrupted From_ line in mailbox
    19.19 Directing user's mail to HOME instead of /var/spool/
    19.20 NFS mounting /var/mail is a good way to get bad performance
    19.21 I can't see the sendmail's response in LOGFILE
    19.22 Compiling procmail and choosing locking scheme
    19.23 Forwarding lot of mail causes heavy load
    19.24 What happens to mail if MDA Procmail fails
    19.25 Procmail reads entire 90Mb message into memory
    19.26 Procmail signaled out of memory in my verbose log
    19.27 Variables DEFAULT and ORGMAIL
    19.28 When DEFAULT cannot be mailed to
    19.29 Variable DROPPRIVS
    19.30 Variable HOME
    19.31 Variable HOST
    19.32 Variable LINEBUF
    19.33 Variable LOG and LOGFILE
    19.34 Variable TRAP
    19.35 Variable UMASK
    19.36 UMASK and permissions
    19.37 Performance difference between back tick and "|" recipe
    19.38 Procmail's temporary file names while writing file out
    19.39 Parameter $@
    19.40 Procmail variables are null terminated (detecting null string)
    19.41 FROM_DAEMON TO and TO_ and case-sensitiveness
    19.42 TO_ macro deciphered
    19.43 TO_ macro and RFC 822
    19.44 FROM_DAEMON deciphered

20.0 Technical matters
    20.1 List of exit codes
    20.2 List of precedence codes
    20.3 Sendmail and -t
    20.4 RFC 822 Reply-To and formail problem with multiple recipients
    20.5 Procmail and IMAP server
    20.6 Machine which processes mail
    20.7 Compiling procmail and MAILSPOOLHOME

21.0 Procmail software for Emacs
    21.1 What is Emacs
    21.2 Emacs procmail-mode and Procmail code checking (Lint)
    21.3 Why use procmail with Gnus
    21.4 Setting up Gnus for procmail - Basics
    21.5 Gnus for procmail - More about it
    21.6 Emacs and Gnus -- Fiddling with spool files
    21.7 Gnus article snippets

22.0 RFC, Request for comments
    22.1 RFCs and their jurisdiction (munged Addresses)
    22.2 Comments about addresses munging
    22.3 RFC and valid mail address characters
    22.4 RFC and login-name@fdqn
    22.5 RFCs and messages signature
    22.6 RFC and using MIME in Usenet newsgroups
    22.7 Some RFC Pointers

23.0 Introduction to E-mail Headers
    23.1 To find out more about mail
    23.2 Lecture by Alan Stebbens
    23.3 Applied to received messages
    23.4 Bcc lecture by Alan Stebbens
    23.5 Bcc lecture by Philip Guenther

24.0 Message headers
    24.1 What is correct From address syntax
    24.2 What's that X-UIDL header?
    24.3 What is that first From_ header?
    24.4 Message-Id header
    24.5 Received header
    24.6 Return-Path
    24.7 Errors-To
    24.8 X-Subscription-Info
    24.9 Reply-To header
    24.10 Mail-Copies-To header
    24.11 Mail-Followup-To and Reply-To-Personal headers
    24.12 Content-Length header and From_ specification
    24.13 Moral about CC copies in Usenet
1.0 Document id
1.1 General
Copyright 1997-2012 Jari Aalto
Homepage http://pm-doc.sourceforge.net
URL links last checked: 2010-12-05

License: This material may be distributed only subject to the terms and conditions set forth in GNU General Public License v2 or, at your option, any later version.

Procmail is a powerful mail handling tool, and a lot of space here has been devoted to discussing UBE (aka Spam) and its essence.

This is a Procmail Tips page: a collection of procmail recipes, instructions and howtos. You will also find many other interesting subjects that discuss Internet mail in general: mail headers, MIME and RFCs. Another part of this document is dedicated to Emacs and its mail handling capabilities. Emacs is a powerful tool that can be used for both mail and news reading; it is available on the Windows platform as well.

The tips have been compiled from the procmail discussion list, from comp.mail.misc and from the author's own experiences with procmail. This document does not intend to teach the basics of procmail; you should already be familiar with the procmail manual pages. If you're using the Windows operating system, procmail is available in the Cygwin <http://www.cygwin.com/> distribution. See also Nancy's and Era's procmail FAQ pages.

If you find errors or things to improve in this document, please send mail to this document's maintainer (see project page). If some URL is not alive any more, you may still be able to find it by using a WWW search such as Google.

There is never too much to learn about procmail, and the best source is the rc files that people have written. If you have some time, please place your well-commented .procmailrc on your home page.
Links checked. Minor updates. Links checked. License at footer fixed. Update formail -rt option to recommended -t. Links checked.
Date        Size (k)  Changes
2008-09-21  510       Links checked.
2008-03-10  510       Add Gmane URL. Links checked.
2007-10-02  519       New HTML/CSS layout. Links checked.
2006-02-15  519       Sanitized all email addresses.
2004-10-10  516       Spam related things removed.
2002-08-31  596       Removed old UBE pointers.
2002-08-13  596       Removed old UBE pointers.
2002-02-01  608       Spelling checked with Emacs ispell
2002-01-28  608       URL links checked and updated
2001-08-09  608       http://pm-doc.sourceforge.net opened.
1999-12-27  603       Netscape spam filters added
1999-10-01  602       Mark Seiden's patch applied. Now under CVS.
1999-04-26  599       document moved to www.procmail.org
1999-04-21  597       Links corrected
1999-03-29  597       Ricochet -- Perl script to fight UBE
1999-02-26  592       procmail's Y2K compliance
1999-02-23  590       RFC and using MIME in Usenet postings
1998-01-29  587       Added "Lua" language pointer
1998-01-07  579       Eli's procmail recipes in module section
1998-12-14  578       Philip took care of bugs/patches listing
1998-11-26  602       More Richard's comments integrated
1998-10-30  595       Richard's english correction patch
1998-10-21  591       UMASK, .forward if procmail already is LDA
1998-10-12  583       SmartList and other MLM software discussed
1998-10-06  575       PLUS addr. Convert HTML body to text
1998-08-29  565       Fetching fields with formail -x
1998-08-24  554       Procmail doesn't pass 8bit characters
1998-08-24  553       Flag c forking study, procmail wish list
1998-08-18  541       Small changes. MIME notes
1998-08-10  529       Guido.Van.Hoeck's 55k patch applied
1998-06-24  526       Added live urls to procmail archive
1998-06-23  521       All recipes checked by eye. Many fixes.
1998-06-19  516       Detecting mailing lists with pm-jalist.rc
1998-06-17  510       How to disable recipe quickly with
1998-04-03  493       Includerc rewritten, plus addressing
1998-04-02  488       ORing and supreme scoring added
1998-03-23  471       All recipes checked (by eye)
1998-03-10  469       Better ordering: ORing rules discussed
1998-01-30  429       "regexp" section rewrite.
1997-12-30  415       up till 1996-12 is now included
1997-12-09  343       up till archive 1996-07 now included
1997-11-25  260
1997-11-08  218       Era's correction suggestions.
1997-10-13  181       archive file 1995-10's tips included
1997-10-11  142
1997-10-01  127
1997-09-18  94
1997-09-16  76
1997-09-14  53
1997-09-13  46
The t2html converter project is at http://freecode.com/projects/perl-text2html

SENDING IMPROVEMENTS

If you have a spare moment and happen to spot spelling mistakes or misused verbs, please go ahead and send a patch to the maintainer of this page. The preferred way to send corrections to this document is as diff(1) output. Here's how to make corrections and send them forward. Please try to use the unified diff -u option. The source is available at the version control repository of http://sourceforge.net/projects/pm-doc

    cp pm-tips.txt pm-tips.txt.orig
    ... load the pm-tips.txt to a text editor / edit / save
    ... Generate the difference
    diff -bwu pm-tips.txt.orig pm-tips.txt > pm-tips.txt.patch
    ... Send content of pm-tips.txt.patch by mail to maintainer
If you do not know what a diff format is, then simply send your comments in email. Use "Linux: pmdoc" as subject to bypass spam filtering.
But in this document a strict style has been adopted, where literal strings are assigned with double quotes:
var = "value"
That's because the procmail code checker (Emacs package tinyprocmail.el) then won't warn about a missing dollar-sign, which might very well have been forgotten. Emacs package font-lock.el, a syntax highlighting assistant, also displays double quoted strings in color.
If you do this...
    var = value
    #
    # then you might have made a typo. It is in fact not clear what was intended:
    # Did you mean: literal assignment?
    # Did you mean: variable assignment?

Recipe flags are also not stuck together, because the visual distinction of :0 and flags is a valuable one. The reasoning for which flags are kept together and in which order is explained later in detail.
    #   Pure newline; typical usage if you want to write
    #   something directly to procmail's active logfile:
    #
    #       LOG = "$NL message $NL"

    NL = "
    "

    #   whitespace: space + tab
    #   Regexp: space + tab
    #   whitespace + linefeed: spc/tab/nl
    #   negation
    #
    #   shortname: like perl -- \s
    #   A digit -- Perl \d
    #   A word -- Perl \w
    #   A word -- Perl \W
    #   A word, only alphabetic chars
Writing recipes is now a little easier, and they may look clearer, at least to people who are accustomed to reading Perl regular expression short names:

    :0
    *$ Header-Name:$s+$d+$s+$d
SUPREME = 9876543210, is the highest score value that causes procmail to bail out. [david] Actually the maximum is 2147483647, but 9876543210 is easier to remember/type and will function just as well.

PMSRC = Procmail module source code directory. Location where *.rc files reside. Anywhere you want it to be. Usually $HOME/pm or $HOME/procmail/lib. Here you can keep the procmail files, log files and includerc scripts. Another commonly used synonym is PMDIR.

SPOOL = Directory where your procmail delivers the categorized messages. Like mailing lists:

If you read the procmail-delivered files directly, this directory is usually $HOME/Mail or $HOME/mail. If you use some other software that reads these files as mail spool files (like Emacs Gnus), then this directory is typically $HOME/Mail/spool or similar.

MYXLOOP = Used to prevent re-sending messages that have already been handled. Typically $LOGNAME@$HOST, but this can be any user chosen string. Make it unique to your address. In this document the definition is:

SENDMAIL = Program to deliver composed mail. Usually standard Unix sendmail(1), but it must have some switches with it. See man page for more. We use the following definition in scripts:

NICE = In a Unix environment you can lower the scheduling priority with nice(1). If you are conscious of how many external processes you launch for each piece of mail, it would be polite to lower the priority of such processes. You may see in this document that external processes are called with NICE enabled:
    :0 w
    | $NICE script.pl
IS functions; Functions to test file or directory attributes. E.g. IS_EXIST is defined as "test -e" and so on. The definitions of IS functions are system-dependent. E.g. On Irix the "-e" option is not recognized and the nearest equivalent is "test -r". All IS functions are defined in the pm-javar.rc module.
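To illustrate why such definitions are system-dependent, here is a minimal shell sketch that picks a file-existence test at run time. The name IS_EXIST comes from the text above; the probe against "/" and the fallback order are assumptions made for this example, and the real definitions in pm-javar.rc may well differ.

```shell
#!/bin/sh
# Hypothetical sketch: choose an "exists" test that the local test(1) supports.
# IS_EXIST is the name used in the text; the probing logic is an assumption.
if test -e / 2>/dev/null; then
    IS_EXIST="test -e"      # test(1) understands -e
else
    IS_EXIST="test -r"      # nearest equivalent named in the text
fi

# Usage: the variable expands to a command prefix.
if $IS_EXIST /; then
    echo "root directory exists"
fi
```

Keeping the choice in one variable means the calling code stays the same on every system; only the definition changes.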
$ cat file.name.this | wc -l
Instead he writes that the call should have been written like this, which saves the pipe (never mind that wc can read the file directly; this is an example).
$ wc -l < file.name.this
[Paul David Fardy <pdf A T morgan.ucs.mun.ca>] There is weight in the pipeline, but the true cost is in process startup. Try running wc 100 times on /etc/motd or on this message. My tests show the useless use of cat doubles the real and processing time (real, user, and system time are each roughly doubled):
    $ cat > /tmp/randal << 'EOF'
    COUNT=100
    i=1
    while :
    do
        wc < /etc/motd > /dev/null
        i=$(expr $i + 1)
        [ "$i" = "$COUNT" ] && break
    done
    EOF

    $ cat > /tmp/useless << 'EOF'
    COUNT=100
    i=1
    while :
    do
        cat /etc/motd | wc > /dev/null
        i=$(expr $i + 1)
        [ "$i" = "$COUNT" ] && break
    done
    EOF

    # NOTE: The timing values should not be read as absolute;
    # examine the relative differences.

    $ time sh /tmp/randal
    real 0m0.568s
    user 0m0.208s
    sys  0m0.348s

    $ time sh /tmp/useless
    real 0m0.825s
    user 0m0.348s
    sys  0m0.476s
This becomes important, for example, when you decide to filter all your mail with procmail, looking for virus signatures. I might well decide to look only at the first 3 or 4 kilobytes. It's not the size of messages (most are small anyway) but the number of messages that causes a problem. Do you want to double the processing cost of all our mail? I'm looking at a system-wide filter for all my users' mail. I'm considering Sendmail's mail filter versus procmail filtering. I'll likely be using a bit of both. And given that all of the filtering is really just getting in the way of legitimate traffic, it'd really piss me off if I naively doubled the cost.
    *** 1997-11-24 22:13 (pm.lint) 3.11pre7 tinypm.el 1.80
    cd /users/jaalto/junk/
    pm.lint:010: Warning, no right hand variable found. ([$`']
    pm.lint:055: Pedantic, flag orer style is not standard `hW:'
    pm.lint:060: Warning, message dropped to folder, you need lock.
    pm.lint:062: Warning, recipe with "|" may need `w' flag.
    pm.lint:073: Warning, Formail used but no `f' flag found.
Procmail module library. The idea of plug-in modules was originally coined by Alan Stebbens (<alan.stebbens A T software.com>, <alan.stebbens A T openwave.com>).

2.4.2 Terminology

subroutine/module = A piece of code that gets something in INPUT and responds with OUTPUT. Subroutine is not message specific.

recipe = A piece of code that is somewhat self contained: It reads something from the message or does something according to matches in message. Recipe may be message-specific. Recipe is more free-form and does not follow strict INPUT/OUTPUT methodology.

2.4.3 Foreword to using modules

In the module listing, some of the modules are recipes and some can be considered subroutines. Let's take the address exploder. First, visualise the following familiar programming language pseudo code:
Function may return multiple arguments and multiple arguments can be passed to it. Clear so far. The concept applies to procmail modules like this:
    RC_FUNCTION  = $PMSRC/pm-xxx.rc     # name the subroutine/module
    RC_FUNCTION2 = ...

    INPUT        = "value"              # Set the arg1 for module
    INCLUDERC    = $RC_FUNCTION         # Call Function( $arg1 )

    # Examine function's return value
This should be pretty clear too. You just have to look into the subroutine/module which you intend to use, to find out which arguments it wants you to set (INPUT) before calling it. The documentation also tells you what values are returned, e.g. one of them was ERROR.

If it were a recipe, the call would be almost the same, but instead of returning values, the recipe/module most likely does something to your message or writes something to the data files etc. A recipe is much higher level, because it may call multiple subroutines/modules.

The distinction between subroutine and recipe module type is not crystal clear, but I hope the above clarifies the Procmail module/subroutine/recipe concept a bit.

2.4.4 Header file modules

These are like #include .h files in C: they define common variables, but do not contain actual code.

pm-javar.rc
    Defines standard variables: SPC WSPC NSPC SPCL and perl styled \s \d \D \w \W and \a \A (alphabetic characters only)

headers.rc
    From Alan's procmail-lib. Define standard regexp and macros: address, from, to, cc, list_precedence

2.4.5 General modules

pm-jafrom.rc
    Derive FROM field without calling formail unnecessarily. If all else fails, use formail.

get-from.rc
    From Alan's procmail-lib. Get the "best" From address. Sets FROM and FRIENDLY, the latter being the "friendly" user name sans address.

pm-jaaddr.rc
    Subroutine to extract various mail components from INPUT. Like [email protected], net=com, account=foo...

pm-jastore.rc
    Subroutine for general mailbox delivery. Define MBOX as the folder where to drop the message and this subroutine will store it appropriately. Supports single mboxes, ".gz" mbox files, directory files and MH folders with rcvstore.

2.4.6 Spam modules

Read "Thoughts about increasing spam annoyance" at <http://pm-lib.sourceforge.net/README.html>, which explains these modules better in the context of "2.0 A lightweight UBE block system with pure procmail".

pm-jaube.rc
    Subroutine to investigate the message for known spam patterns like numeric address, invalid address, Pegasus bulk mail, advertising slogans etc. This is the generic Spam detection module. Needs only one external program: nslookup(1) to verify the sender's domain. The results of classification appear in returned variables that the caller can use for deciding what to do. Optional headers can be added to the message to announce the results.

pm-jaube-keywords.rc
    Subroutine to scrutinize the message against known spam keywords. This is the "bare bones" and very simplistic (but fast) way to check if a message is Spam. The results of classification appear in returned variables that the caller can use for deciding what to do.

pm-jaube-prg-runall.rc
    An interface module to call external statistical bayesian spam classifier programs. This subroutine will call other modules, like pm-jaube-prg-bogofilter.rc (for bogofilter), pm-jaube-prg-bsfilter.rc (for bsfilter) and many more that help fight spam. It is possible to activate the specific bayesian programs available on the current host.

2.4.7 Mime modules

pm-jamime.rc
    Subroutine to read MIME headers and put the mime version, boundary string and content-type information into variables.

pm-jamime-decode.rc
    Recipe to decode quoted-printable or base64 encoding in the body.

pm-jamime-kill.rc
    Recipe for attachment killing: wipes out the extra mime cruft leaving only the plain text. Applications for killing: ms-tnef attachment (MS Explorer 7k), HTML attachments (Netscape, MS Express), vcard (Netscape), PCX attachment (Lotus Notes).

pm-jamime-save.rc
    Recipe for saving a simple file attachment. When you receive ONE file attachment in a message, this recipe can save it in a separate directory. The content is also decoded (base64, qp) while saving.

2.4.8 Filtering message body or headers

pm-jadaemon.rc
    Handle DAEMON messages by changing the subject to reflect a) the error reason, b) to whom the message was originally sent, and c) what the original subject was. Store the DAEMON messages to a separate folder.

pm-jasubject.rc
    Standardize Subject "Re : FW: Sv: message" or any other derivative to the de facto "Re: message"

pm-janetmind.rc
    [obsolete] Reformat minder.netmind.com messages (no longer exists 2005). The default 4k message is shortened to a few important lines.

2.4.9 Mailing list modules

pm-jalist.rc
    Subroutine to extract the mailing list name from a message. Do you need to add a new recipe to your .procmailrc every time you subscribe to a new mailing list? If you do, take a look at this module, which examines the message and defines variable LIST to hold the mailing list name. You can use it directly to save the messages adaptively to the correct folders. No more hand work and manual storing of mailing list messages.

2.4.10 Miscellaneous modules

pm-jaempty.rc
    Check if the message body is empty (nothing relevant). Sets variable BODY_EMPTY to "yes" or "no" accordingly.

pm-janslookup.rc
    Run nslookup on a given address. If you compose the return address with "formail -r -x To:" you can verify that the domain is registered before sending a reply. Uses a cache for already looked up domains. This module is also used by pm-jaube.rc to verify the sender's domain.

guess-mua.rc
    Guess the Mail User Agent and set MUA: MH,PINE,MAIL
2.4.11 Low-level Date and time handling

For these, you get the date string from somewhere, then feed it to some of these subroutines:

pm-jatime.rc
    A low-level subroutine. Parse time "hh:mm:ss" from variable INPUT

pm-jadate1.rc
    A low-level subroutine. Parse date "Tue, 31 Dec 1997 19:32:57" from variable INPUT

pm-jadate2.rc
    A low-level subroutine. Parse ISO standard date "1997-11-01 19:32:57" from variable INPUT

pm-jadate3.rc
    A low-level subroutine. Parse date Tue Nov 25 19:32:57 from variable INPUT

pm-jadate4.rc
    Call shell command "date" once to construct RFC "Tue, 31 Dec 1997 19:32:57" and parse the YY MM HH and other values. You usually use this subroutine if you can't get the date anywhere else.

2.4.12 Higher-level Date and time handling

You use these recipes to get the date directly from the message:

pm-jadate.rc
    Higher-level recipe. Read date from message's headers: From_ Received, or call shell date if none succeeds.

date.rc
    Higher-level recipe. From Alan's procmail-lib: parse date or from headers ResentDate:, Date, and From

2.4.13 Forwarding and account modules

pm-japop3.rc
    Pop3 movemail implemented with procmail. You can send a "pop3" request to move your messages from account X to account Y. Each message is sent separately. This recipe listens to "pop3" requests.

pm-jafwd.rc
    Control forwarding remotely. You can change the forward address with a "control message" or turn forwarding on/off with a "control message"

pm-japing.rc
    Send a short reply when the subject contains the word "ping" to show that the account is up and the mail address is valid.

correct-addr.rc
    From Alan's procmail lib. To help forward mail from an OLD address to a NEW address, and do some mailing list mail management. This recipe file is intended to make it easy for users to forward their mail from their old address to a new address, and, at the same time, educate their correspondents about it by CC'ing them with the mail.
2.4.14 Vacation modules

pm-javac.rc
    A framework for your vacation replies. This recipe will handle the vacation cache and compose an initial reply, which you only need to fill in (like putting the vacation message in the body).

ackmail.rc
    From Alan's procmail lib. Procmail rc to acknowledge mail (with either a vacation message, or an acknowledgment)

2.4.15 Message-id based modules

pm-jadup.rc
    Handle duplicate messages by Message-Id. Store duplicate messages in a separate folder.

dupcheck.rc
    From Alan's procmail-lib. If the current mail has a "Message-Id:" header, run the mail through "formail -D", causing duplicate messages to be dropped. Can use MD5 hash in cache.

2.4.16 Cron modules

pm-jacron.rc
    A framework for your daily cron tasks. This recipe contains all the needed checks to ensure that your includerc is called whenever a day changes. (Day change is subject to messages you receive). Your own cron includerc is run once a day.

2.4.17 Backup modules

pm-jabup.rc
    Save messages to a backup directory and keep only N messages per day. Idea by John Gianni. Note: The implementation will always call a shell for each message you receive, so using this module is not recommended if you get many messages per day. Instead, use the cron module to clean the messages' backup directory only once a day, and not every time a message arrives.

2.4.18 Confirmation modules

pm-jacookie.rc
    Handle cookie (unique id) confirmations. Also known as Procmail authentication service (PAS). This simple procmail module will accept messages only from users who have returned a "cookie" key. You can use this to protect some services before access. Uses subroutine pm-jacookie1.rc, which generates the unique cookie; CRC 32 by default. NOTE: Please read page <http://pm-lib.sf.net/README.html> before you start thinking of using this module as a generic Challenge-Response module to reduce spam.
pm-jaube.rc - Procmail module library's UBE filter

After Daniel Smith posted his spam recipes to the procmail mailing list, the code was adopted and generalized to handle a lot more UBE. The module needs no special setup and can be installed via a simple INCLUDERC. All UBE detection happens using procmail rules with no external files needed. The module is available in the Procmail module library at <http://freecode.com/projects/procmail-lib>.

2.5.1

o   Catherine A. Hampton's "Spambouncer". The attached set of procmail recipes/filters, which I call The Spam Bouncer, are for users who are sick of spam (unsolicited junk mail) and want to filter it out of their mail as easily as possible. These recipes can be used as shared recipes for a whole system, or by an individual for their own mailbox only.

o   Junkfilter by Gregory Sutter. Junkfilter is a user-configurable procmail-based filter system for electronic mail. Recipes include checks for forged headers, key words, common spam domains, relay servers and many others.

o   Nonplussed Spambouncer. Procmail module for bouncing spam. Requires sendmail with plussed users.
The script pm-test.rc has the procmail recipe you're testing or improving. The test-mail.txt is any valid mail message containing the headers and body. You can make one with any text editor, e.g. vi, pico, nano, emacs or xemacs. Here's a simple test mail skeleton. Copy verbatim:
    From: [email protected]
    To: [email protected] (self test)
    X-info: I'm just testing

    BODY OF MESSAGE SEPARATED BY EMPTY LINE
    txt txt txt txt txt txt txt txt txt txt
Remember that you can define environment variables as well in the dry run call. Here's an example where procmail just executes the script and does nothing fancy.
Suppose the script prints something to log files, but you'd instead like to get it all dumped to screen. No problem, first find out your tty value by calling tty at shell prompt and pass that on the command line. Here the default LOGFILE is directed to take care of redirecting "LOG=" commands and statement:
3.2 Why the From field is not okay after dry run?
Why does it now say "From foo@bar Mon Sep 8 14:38:06 1997"?
Don't worry about this. It's a side-effect of running the message through formail after having generated any auto-reply: the auto-reply generated by "formail -r" doesn't have a "From " header (it's pointless for outgoing messages), so the second formail adds one, not knowing that it'll just be ignored by sendmail later (well, sendmail will extract the date from it, but that's ignorable). You only see it because you're saving to a folder instead of mailing it.
Since LOGFILE hasn't been defined, $PATH will be printed to the screen. One caution: if there are any variables in the definition of $PATH (such as $HOME), they'll be expanded in the output.
% procmail -v
Beware writing it 0: as it happens easily. Always put a zero after the colon that begins the recipe. In the first versions of procmail, you would put the number of conditions, with a default of 1. That was annoying, and the computer can do the counting more easily, so Stephen made it so that a count of 0 indicates that the conditions are all the lines beginning with a *. The default is one, unless one of the a, A, e, or E flags is given, in which case the default is zero. ALWAYS START A RECIPE WITH :0.
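As a minimal sketch of a recipe that starts the right way, consider the fragment below. The list address and the folder name are made up for illustration; only the leading `:0` is the point.

```procmail
# Starts with ":0", so every line beginning with "*" is a condition.
# The trailing colon asks for a local lock file on the folder.
:0:
* ^TO_procmail-list@example\.com
procmail-list
```

Writing `0:` or a bare `:` on the first line is the slip the paragraph above warns about.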
SHELL = /bin/sh
4.4.1 If system has no /bin/sh and you're forced to use csh/tcsh

[<kuhlmav A T elec.canterbury.ac.nz>] Csh and tcsh execute the .cshrc first, THEN, if and only if it is the login shell (not a sub shell), it executes the .login, which should contain basic important system settings like stty commands.

Likewise, bash and ksh users are taught to define and export PATH in .profile, so our per-shell startup files would not have clobbered the PATH set in .procmailrc the way your .cshrc did.

[philip] ...I have been told by other sysadmins that there are systems on which csh was hacked to source the .login before the .cshrc. For various reasons I suspect these to be systems based on older versions of BSD (say, 2.3 BSD). As for tcsh, the order in which the .login and .cshrc are sourced is a compile-time option which defaults to the .cshrc (or .tcshrc) before the .login. There may be some wackos out there who change the default in memory of the system(s) that they were raised on. I suggest electroshock as the proper treatment.

...done sys admin on Crays, Convexes, Suns, SGIs, Decs, PC running BSDI, Linux and FreeBSD, and I have never run into a system where the .cshrc is sourced AFTER the .login. If someone goes to the trouble to change the order, I would love to know a valid reason for it.
4.4.2 Procmail won't work well with SHELL set to csh derivate

[1998-08-17 PM-L <kuhlmav A T elec.canterbury.ac.nz> Volker Kuhlmann] ...The blame lies with procmail and its documentation. Obviously, procmail is programmed with the assumption that the login shell is a sh derivative. This assumption is a) not very nice, and b) not stated in the otherwise very good documentation. Of course a user can set SHELL to tcsh. If then procmail is too stupid to hack it, it ought to say so clearly, and the above-mentioned questions of people using tcsh will disappear from this list. One could also be nice and point out pitfall (3) mentioned above in the procmail docs.

It is customary to have terminal configuration in .login. If it is shifted to .cshrc it should be properly surrounded by if .. endif. Perhaps it is not customary to configure the terminal in bashrc (where else then? - only a rhetorical question), but that is no reason to blame it on tcsh. My .cshrc only setenvs the environment when it is a login shell (shell level 1). Obviously procmail runs a login shell. As I said earlier, there are good reasons for setting a full PATH independently of whether the shell is interactive or not.

So, when procmail executes programs with SHELL=tcsh, PATH is set to the tcsh defaults. That may or may not be desirable, depending on the individual case. No problem with that and avoidable (run tcsh with -f). Nice if it was in the procmail docs. But then, the PATH getting clobbered is not the point here (just a side-effect I didn't realize until 2 people pointed it out).
    PATH = \
    $HOME/bin:\
    /usr/local/gnu/bin:\
    /usr/contrib/bin:\
    /usr/local/bin:\
    /opt/local/bin:\
    /bin:\
    /usr/bin:\
    /usr/lib:\
    /usr/ucb:\
    /usr/sbin:\
    /vol/bin:\
    /vol/lib:\
    /vol/local/bin:\
    ${PATH}
    DIR  = /full/path/to/www/directory/     # Wait...
    FILE = $ARCHIVEDIR/file                 # Ouch !
    :0                      # comment ok
    * condition             # OUCH, ouch. This comment must not be here.
                            # Hm, Old procmail versions don't understand this
                            # Are you sure you want to put comments inside
                            # condition line?
    * condition
    {                       # comment ok
                            # comment ok
        :0                  # comment ok
        /dev/null           # comment ok
    }                       # comment ok
So, the place to watch is the condition line. Later procmail versions may understand those, but if you intend to share your recipe, play it safe and think about backward portability.
    :0
    * condition
    {}

    :0 E
    {do_something }

    # No space allowed here!
    # Wrong, at least _one_ empty space
    # Again mistake, must have surrounding spaces
:0: a :0 a:
Note that in delivering recipes where you manually write the content, you must use local lock file with > token, because procmail can't determine lock by itself. It can only determine the lock file from the >> token. However, putting a lock file on a recipe like this is, of course, utterly useless. So you might as well omit the locking entirely.
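The implicit versus explicit lock file naming discussed here can be sketched as three independent alternatives. The folder names and the `report-tool` program are hypothetical; the flags and the `$LOCKEXT` variable are standard procmail.

```procmail
# Mailbox delivery: the second colon is enough, procmail names
# the lock file after the folder.
:0:
* ^Subject:.*report
reports

# Pipe containing ">>": procmail derives the lock file name from
# the file that follows ">>".
:0 hw:
* ^Subject:.*report
| cat >> report-headers

# No ">>" on the action line: procmail cannot guess, so name the
# lock file yourself.
:0 w: report-tool$LOCKEXT
* ^Subject:.*report
| $HOME/bin/report-tool
```

Only the third form needs an explicit name; in the first two an explicit name would merely override the derived one.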
If the command line in the procmail rcfile contains ">", a name for the local lock file will be implicit, and the second colon alone is enough. If the command doesn't write to a file, or doesn't write to the same file as anything else (including a matching letter that makes procmail run the same command) that might run at the same time, the local lock file is unnecessary. Watch this too. A nesting block that does not launch a clone cannot take a local lock file on the recipe that starts the braces. A nesting block that does launch a clone can. (see the error)
    :0: file$LOCKEXT
    {
        # error: "procmail: Extraneous local lock file ignored"
        # - This lock file will be ignored
        # - If the recipes inside the braces try to use file.lck
        #   as a lock file, then you'll have a deadlock situation.

        :0 :
        /tmp/tmp.mbx
    }
Let me also explain why the w is so important. Notice that the two here are equivalent. The W here is implicit. NOTE: this is only true on the recipe that opens a nested block. On a recipe with a program, forward, or delivery action, W is different from w, which in turn is different from omitting both.
    :0 c: file$LOCKEXT
    {
        ...
    }
To quote the comment in source code, "try and protect the user from his blissful ignorance". The parent will always wait for the cloned child to exit when a lock file is involved. The only question is whether or not it should be logged. If you want failure of the cloned child to be logged, then you should use the w flag, ala:
A local lockfile can be used to lock a clone; the parent procmail will remove it when the clone exits (thus it serves as a global lock file for the clone). If the braced block does not launch a clone, asking for a local lock file generates an error.
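A sketch of a locked clone whose failure gets logged might look like this. The condition, the lock file name and the `archive-report` program are assumptions for illustration only.

```procmail
# c launches a clone for the block, the local lock file serializes
# the clones, and w makes a failing clone show up in the log.
:0 wc: archive$LOCKEXT
* ^Subject:.*report
{
    :0
    | $HOME/bin/archive-report
}
```

The parent continues with the recipes after the closing brace once the clone has exited and the lock file has been removed.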
    * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE
    {
        LOG = "dupecheck: discarded $MESSAGEID from $FROM $NL"

        :0                      # no lockfile !
        $DUPLICATE_MBOX
    }

    LOCKFILE                    # kill variable
because the local lock file named on the flag line will be created only if the conditions have matched and the action is attempted. One more note: watch carefully, that there is no : lock when delivering to DUPLICATE_MBOX because the outer global lock file already prevents all other procmail instances from executing this part of the recipe.
*$ ! ? BH VAR ?? test
That won't say much unless you see something to compare with. Here is one perfectly valid rule, not following the above style:
It might be better to line up things in multiline condition lines. The first place is reserved for dollar sign, the second for not operator and so on. The key here is to be able to comprehend the recipe easily. E.g. with the dollar put to the left, one can tell at a glance if variable expansion is happening on the whole line. The same with small formatting changes:
    :0
    *$
    *
    *
    *$
    *

The slots of a condition line, read from right to left:

    BH        What is matched: (H)eader portion, (B)ody or (HB) both.
    ??        The (??) associative operator is required.
    ! or ?    Not operator (!) or shell call (?)
    $         Variable expansion (important)
4.15 Avoid extra shell layer and check command for SHELLMETAS
[dan] It is very important to study your shell command calls and try to save the overhead of the extra layer of shell. It may be extra work once when you write your rcfile, but it saves effort on each piece of arriving mail. When procmail sees a character from SHELLMETAS, it runs
    # Default SHELLMETAS: &|<>~;?*[
    # Default $SHELLFLAGS: -c

    % $SHELL $SHELLFLAGS "command -opts args"
instead of
That is because procmail's ability to invoke other programs does not include filename globbing ([, *, ?), backgrounding (&), piping (|), succession (;), nor conditional succession (&&, ||). If it sees any of those characters (before expanding variables), it hands the job over to a shell. Sometimes those characters appear in arguments to a command without having their shell meta meaning and procmail really could invoke the command directly without the shell. You can see the distinction in a verbose log file: if procmail runs the command itself, it logs
Executing "command,-opts,args"
with a comma between each positional parameter, but if it calls a shell, the original spacing from the rcfile appears unchanged in the logfile:
So, if you know you won't be needing shell expansion, wrap your shell calls with this:
    savedMetas = $SHELLMETAS
    SHELLMETAS                  # Kill variable

    ..command that does not need shell expansion features..

    SHELLMETAS = $savedMetas
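A concrete instance of this wrapper, as a sketch: the header text below is made up, and it is chosen only because it contains `[`, which is in the default SHELLMETAS and would otherwise cause a shell to be started.

```procmail
savedMetas = $SHELLMETAS
SHELLMETAS              # unset: no character makes procmail call a shell

# procmail splits and runs this command line itself
:0 fhw
| formail -I "X-Note: checked [ok]"

SHELLMETAS = $savedMetas
```

With VERBOSE turned on, the log should show the comma-separated form described above, which confirms that no shell was involved.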
                                text      data       bss
    /usr/bin/awk               72727 +   51316 +   15317
    /usr/bin/sort             173225 +   18496 +  183076
    /usr/bin/grep             237248 +   16992 +   56252
    /usr/bin/sed              221591 +   16176 +   53816
    /usr/local/bin/gawk       502220 +   36044 +   65632
    /usr/contrib/bin/perl     633812 +   69612 +    2385
    /opt/local/bin/perl       160018 +    5264 +    7168    [perl 4.36]
The binary sizes above are not the typical cases: these are from another system
          4 Sep 28  /usr/local/bin/awk -> gawk
      32768 Nov 16  /usr/bin/grep
      49152 Nov 16  /usr/bin/sed
     114688 Oct 20  /usr/local/contrib/gnu/bin/grep
     155648 Nov 16  /usr/bin/awk
     155648 Nov 16  /usr/bin/nawk
     221184 Nov 16  /usr/bin/gawk
     311296 Jan 27  /usr/local/bin/gawk
     958464 Nov  2  /usr/local/contrib/bin/perl
    1196032 Sep 14  /usr/local/bin/perl
Hm. Can we draw some conclusion? Not anything definitive, but at least something:

o   While sed(1) and grep(1) may be bigger than awk(1) in some systems, this is an exception. They are usually much smaller.

o   It's more effective to use one awk process instead of many combined filtering commands. Complex commands that would require many processes to be chained together, like `grep -v | grep | sed', can usually be accomplished with one awk(1) call. Ask somewhere how to do it with awk(1) if you don't know the language; it's quite like perl(1).

o   Try to use standard awk(1). gawk(1) and nawk(1) are bigger and may not be found on all systems.

o   Avoid perl(1) at all costs; it's many times (6) bigger than awk(1). Perl is slow to start up, due to the intermediate compilation process at startup, and hogs oodles of memory.

o   Remember that if procmail is running on a dedicated mail host, that host probably doesn't even have any goodies installed, just the boring standard versions, which may not even be the same as what you see on the current host.

Here are some more programs. Don't even think of extracting fields with grep or awk, like "grep Subject", because formail is much smaller and more optimized for tasks like that. Better yet, many times you can do it all with procmail's regexp matches.
    37007 Sep  5 15:53 /usr/local/bin/formail     # 3.11pre7
    28672 Jun 10  1996 /usr/bin/tr
    20480 Jun 10  1996 /usr/bin/tail
    20480 Jun 10  1996 /usr/bin/cat
    20480 Sep 26  1996 /usr/bin/expr
    16384 Jun 10  1996 /usr/bin/head
    16384 Jun 10  1996 /usr/bin/cut
    16384 Jun 10  1996 /usr/bin/date
    16384 Jun 10  1996 /usr/bin/uniq
    16384 Jun 10  1996 /usr/bin/wc
    12288 Jun 10  1996 /usr/bin/echo
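To illustrate the advice about not calling grep for header fields, here is a sketch of two lighter ways to pick up a header value. The variable names SUBJECT and FROM are arbitrary choices for the example.

```procmail
# No external program at all: the text matched to the right
# of \/ is stored in MATCH.
:0
* ^Subject:\/.*
{ SUBJECT = $MATCH }

# If a program is needed, formail is the small one:
# -x extracts the field, -z trims the surrounding whitespace.
FROM = `formail -zxFrom:`
```

The first form costs no fork; the second forks once, for a small binary.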
    :0 fhw
    | $FORMAIL -r
When you port your .procmailrc to different environment which has different paths, you could use this recipe in addition to one just mentioned above:
    FORMAIL = ...as above

    :0
    * HOST ?? second-host
    {
        # In this host the paths are different. Reset.
        FORMAIL = "formail"
        GREP    = "grep"
        DATE    = "date"
    }
    SPOOL     = ...

    #   Backup storage
    #   - This could be directory too. In that case you could use
    #     cron job to expire old messages at regular intervals
    #   - For once a day expiration, see procmail module list
    #     and pm-jacron.rc

    BUP_SPOOL = $SPOOL/junk.bup.spool

    :0 c:
    $BUP_SPOOL
Naturally you can filter out mailing list messages from the backup, because losing one or two (hundred) of them may not be that serious. Maybe you could use two backup spools, one for mailing lists and the other for your non-list messages.
    :0 c:
    * ! mailing-list1|mailing-list2
    $BUP_SPOOL
If you have the date variables set up as described below, you could also create a backup folder per day:
    BUP_SPOOL = $SPOOL/junk.bup.$YYYY-$MM-$DD.spool
This makes it very easy to delete backups that are older than a given number of days, either manually or through a cron job.
...with "My apologies, the script had an error, it won't happen again" to all the valid angry mails that are now addressed to you.
Drop the UBE to a folder, manually select the messages that need actions and send message to postmasters in the Received chain explaining that their mail relay has been hijacked.
    :0 aAeE HBD fhb wWir c: LOCKFILE
       |    |   |   |    |
       |    |   |   |    (c)ontinue or (c)lone flag last.
       |    |   |   (w)ait and other flags
       |    |   (f)ilter flag and to filter what: (h)ead or (b)ody
       |    (H)eader and (B)ody match, possibly case sensitive (D)
       |    Note: Procmail 3.22 bug
       |    <http://mailman.rwth-aachen.de/pipermail/procmail/2002-February/008355.html>
       The `process' flags first. (A)nd or (E)lse recipe
:0Afhw:$MYLOCK$LOCKEXT
Or, as suggested, leave flags in their own slot for more distinctive separation. Note that procmail variable $LOCKEXT must be next to $MYLOCK, because it contains string ".lock".
:0 A fhw: $MYLOCK$LOCKEXT
    :0 B
    * body-check-here

    :0
    * B ?? body-check-here
    :0 fb
    | false

    :0 fbW
    | false
[stephen] No, not on all occasions. Procmail will not care about the exit code here. However, if procmail detects a write error, it will recover (because of the missing i flag). Procmail will only detect a write error in such a case if the mail is long enough and does not fit in the pipe buffer that's in the kernel (typically 10KB).
    :0 hwc: headc$LOCKEXT
    * !^FROM_MAILER
    | uncompress headc.Z; cat >> headc; compress headc
[david] Of course the f flag is enough to make procmail wait for the filter to finish, but the w means something more: to wait to learn the exit code of the filtering command. If sed fails with a syntax error and gives no output, without W or w procmail would happily accept the null output as the results of the filter and go on reading recipes for the now body-less message. On the other hand, with W or w procmail will respond to a non-zero exit code from sed by recovering the unfiltered text.
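As a small sketch of the f and w combination described here (the header names are hypothetical):

```procmail
# f: replace the header with the output of sed
# h: feed only the header to sed
# w: wait for the exit code; if sed fails, procmail
#    keeps the original, unfiltered header
:0 fhw
| sed -e 's/^X-Old-Header:/X-New-Header:/'
```

Dropping the w would leave the recipe working in the normal case, but a broken sed expression could then silently empty the header.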
    :0 h
    * condition
    /dev/null
[philip] Procmail knows that it shouldn't create a local lock on /dev/null and that it shouldn't kernel lock /dev/null, and it knows to write it "raw" (no "From " escaping or appended newline). This means that procmail simply opens /dev/null, does its write with one system call, and closes it. I'm not sure if adding the h flag makes a real difference on modern UNIX kernels. I suppose it depends on how optimized the write() data is and, in particular, whether a user-space to kernel-space copy is required, or whether it's delayed. If it's delayed then the code for handling /dev/null would presumably not do it, and the size of the write wouldn't actually matter.
[FAQ] The following will work some of the time, when the message is short enough, but that's a coincidence. With a longer message, though, Unix starts paying attention to what is happening, because it will have to buffer some of the data, and then when the buffered data is never read, an error occurs. The error is passed back to Procmail, and Procmail tries to be nice and give you back your original message as it was before this malicious program truncated it. Never mind that in this case you wanted to truncate the data. Anyway, the fix is easy: just add an i flag to the recipe (:0 fbwi instead of :0 fbw) to make Procmail ignore the error.

[dan] Here's why the i flag is needed (courtesy of Stephen): You told procmail to filter the entire body, so it attempts to write all of it out to the filter. Then procmail notices that not the entire body is being consumed. Procmail, being rather paranoid when it comes to delivery of mail, assumes something went wrong and considers this a failure of the filter.

    :0 fbwi
    | head -2
5.9 Flag r
[philip] Procmail automatically turns on the r (raw mode) flag for deliveries to /dev/null, so there's no need to do it yourself.
[david] You can use the r flag (for raw mode) on every recipe where you do not want a From_ line added. I'm assuming that there isn't one already there; the r flag keeps procmail from making sure that there are a From_ line at the top and a blank line at the bottom, but it will not make procmail remove them if they are already present. Also, be careful to use the -f option on all calls to formail so that formail won't add a From_ line. Someone who didn't need From_ lines (I forget who) found it annoying to put r onto every recipe and altered the source to prevent procmail from adding From_ lines at all, ever. I think a better idea would be a procmailrc Boolean to enable or disable them for all recipes without affecting other users. (Then perhaps we'd need a reverse r flag to undo raw mode for one recipe at a time?)
[david] Precisely: when you have braces, thinking "continue" instead of "copy" or "clone" can get you into trouble. Early versions of procmail, before braces and before cloning, called the c flag "continue" in their documentation; I think it is still called that in the source. When Stephen introduced braces (but not cloning at this point), it was of course implicit that an action line of "{" was non-delivering, and a c was extraneous. People put c's there because they wanted procmail to continue to the recipes inside the braces on a match, and procmail brushed it off with an "extraneous c-flag" warning. No harm done. When Stephen introduced cloning, though, I was rather upset that he was giving double duty to c instead of introducing something new like C for it, especially because people who absolutely wanted no clone but intended the recipes inside the braces to run in the same invocation of procmail as everything else were mistakenly putting c's on their braces to make sure procmail would "continue". People would (and did) get double deliveries. Roman Czyborra, though, said that if you consider c to stand for "copy", that covers both uses of c: provide a copy to a simple recipe or, if there are braces, to a clone procmail that will handle the recipes inside the braces. Stephen agreed and changed the documentation accordingly. Longtime users of procmail and people who read old docs may still think of it as "continue", but since the introduction of clones, that is not a good way to look at it. "Copy" is much safer.
... I run shell commands that need not be serialized, so instead of doing the standard way:

    :0 hic
    | command

I assume I can avoid the extra fork caused by the (c)lone flag altogether by using these. Any difference between these two?
If it's a simple mailbox deliver, pipe, or forward action then procmail does not fork a 'clone' (for pipe and forward actions procmail does have to fork, but only so it can execute the action). nbr.1 and nbr.2 take the same number of forks to execute. They also take the same effective number of writes (in case you're concerned about that). The latter also requires that procmail wait for the command to finish. nbr.3 is worse than the above two, as procmail has to not only wait for the command to complete but also save the output into the named variable.
    :0 $FLAGS
    { do-something }
[david] HB AaEe and D affect the conditions and thus are meaningful when the action is to open a brace. HB and D would be meaningless, of course, on any unconditional recipe, but they should not cause error messages. Generally, flags that affect actions are invalid there, and bhfi and r always are, but the others are partial exceptions: if you are using c to launch a clone, then w W and a local lock file can be meaningful. If there is no c, then w W and a local lock file are invalid at the opening of a braced block.
E = try this recipe if the conditions have failed on the most recent recipe at that nesting level that did not have an E, and on every recipe at that level since then that did have an E; essentially the opposite of A.

These mnemonics might help:

    A: if you did the recipe at the start of the chain, try this one (A)lso
    a: if the last action at that nesting level was (a)ccomplished
    e: if the last action at that nesting level (e)rred
    E: (E)lse because the conditions down the chain so far have not matched.
       Or "try this recipe unless the last tried recipe matched".
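A sketch of the two most common uses, E as a plain "else" and A as "also". The addresses, the folder and the PRIORITY variable are invented for the example.

```procmail
# if / else: the E recipe is tried only when the conditions of
# the recipe before it did not match
:0
* ^From:.*boss@example\.com
{ PRIORITY = high }

:0 E
{ PRIORITY = normal }

# A: tried only if the conditions of the recipe that starts
# the chain matched; it may add conditions of its own
:0 c:
* ^Subject:.*invoice
invoices

:0 A
* ^From:.*boss@example\.com
! accounting@example.com
```

The a and e variants work the same way but look at whether the previous action succeeded or failed, not only at whether its conditions matched.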
    :0 :
    /etc/hosts/foo

    :0 A
    * -1^0
    /dev/null       # no match

    :0 e # this is skipped because the last tried recipe didn't match
    { ...whatever }
How they interact with one another when used consecutively has not been fully tested to my knowledge. Consider this:
Is action3 done if action2 failed or if action1 failed (or perhaps in both situations)?

[philip] Action 3 is only done if action2 failed.

If the answer is action2, does this work to get action3 done if action1 failed? I think it does, but does it also run action3 if the conditions didn't match on the first recipe?

[philip] Yes, and yes.
If the conditions match, action1 will be executed. action3 will then execute if action1 failed, otherwise action2 will be executed [if action1 succeeded]. [david] I know what this structure does because I use it:
    :0
    * conditions
    non-delivering action1

    :0A
    action2

    :0E
    non-delivering action3

    :0A
    action 4
If the conditions match, action1 and action2 are performed and action4 is not (of course action3 is not either), even if action2 is non-delivering; if they fail, action3 and action4 are performed. The A on the fourth recipe refers back to the third and no farther. But I don't know about this:
    :0
    * conditions
    non-delivering action1

    :0A
    * more conditions
    action2

    :0E
    non-delivering action3

    :0A
    action 4
Now, suppose the conditions on the first recipe match but those on the second recipe do not match. Would the third recipe (and thus the fourth one) be attempted? I would expect so. [philip] Yes. The last tried recipe didn't match, therefore the E flag will be triggered. If that isn't what you want, you can prevent it this way:
    :0
    * conditions
    {
        :0
        non-delivering-action1

        :0
        * more-conditions
        action2
    }

    :0 E # ignores mismatch inside braces, looks only at same level
    non-delivering action3

    :0 A
    action4
    # if action2 is non-delivering or vulnerable to error that
    # would cause fall-through

    DID2                    # Kill variable

    :0
    * conditions
    non-delivering-action1

    :0 A
    action3

    :0
    * ! DID2 ?? (.)
    non-delivering-action3

    :0 A
    action4

    # if action2 is delivering and sure to succeed

    :0
    * conditions
    non-delivering-action1

    :0 A
    * more-conditions
    action2

    :0
    non-delivering-action3

    :0 A
    action4
[philip] For those who are interested, I'll note that there are only 3 combinations of the a, A, e, and E flags that aren't either illegal or redundant. They are Ae, aE, and AE. I've shown a use for Ae up above. Here's an example of AE:
:0
action3 will only be executed if condition1 matched but condition2 didn't match. Without the A flag, action3 would be executed if either of them failed. This can also be done with a instead of A with analogous results. Procmail's "flow-control" flags may not be particularly easy to describe in straight terms (and this can all be made more complicated by throwing in a more varied mix of delivering vs non-delivering recipes), but I've found that it usually does what I expect it to do, and when it doesn't or I'm in doubt or I want to be particularly clear, I can always fall-back to doing it explicitly via nesting blocks. Pick your poison...
People who are in favor of writing pure native regexps in the recipes:
]<[
]*("([^"\]|\\.)*"|[-!#-'*+/-9=?A-Z^-~]+)...
# "
They think:

I'm not planning on "maintaining" that code, as the syntax for XXX will not ever change, it's RFC or something. I somehow doubt that anyone else will change that regexp more than trivially.

If none of your other regexps use the categorical variables, and you're not changing the regexp, then what's the point? The variablized version will be slower, and will clutter the environment with subprocesses.

Whereas someone who immediately wants to abstract things says (this is from philip's great Message-Id matching recipe)
    dq         = '"'                        # (literal) double-quote
    bw         = "\\"                       # (literal) backwhack
    atom       = "[-!#-'*+/-9=?A-Z^-~]+"
    word       = "($atom|$dq([^$dq\]|$bw.)*$dq)"
    local_part = "$word($s\.$s$word)*"

    $s<$s$local_part...                     # ignore comment here
...abstraction: It makes code clearer when you break it to manageable parts, which possibly surfaces reusable parts. It also makes thing look simpler, and enables even novices to understand what's going on there. After we're not connected to the net anymore, others could possibly understand it too. So, naturally we can't agree with any of the previously mentioned arguments presented for keeping regexp "in pure native format".
Although you won't maintain it, it's an example for others. What you post first, people will save to their mailboxes and circulate elsewhere on the net: "Hey, I've saved this, try it". You can write cryptic regexps or break them into parts where the whole looks much simpler. Consider the novice's welfare :-) This has nothing to do with "It never changes in my lifetime".

The speed penalty imposed by additional variables is not something we can measure in practice. The CPU won't even hiccup. An extra formail call in your recipes is 10x as expensive as 100 variables. (I don't know how to measure that, but launching a shell and creating a process is a much more expensive task.)

Cluttering the env process? C'mon. That won't matter either. No outside process uses lowercase environment variable names, or else it must be a really special program. So-called "cluttering" of the environment space is also a non-issue. The CPU won't even get a hiccup for that.
If you put a $ after the \/ match token then procmail will include the matched newline if there's one there. Solution? Don't put a dollar sign there unless you really want a newline, use period that matches all but newline:
    :0
    * B ?? ^Search-string: \/.+
    Received: from unknown (HELO Desktop01) (208.11.179.72)
        by palm.bythehand.net with SMTP; 4 Dec 1997 23:29:09 -0000

    # note, match on continuation line
    :0
    * ^Received:.*bythehand\.
    {
        # Do something
    }
But using the space+tab is not very readable and it's a very error prone construct. Here is a suggestion to use variables to improve the readability:
    WSPC = " 	"          # whitespace = space + tab
    SPC  = "[$WSPC]"      # regexp whitespace, the short name SPC was chosen
                          # because you use this a lot in
                          # condition lines.
    NSPC = "[^$WSPC]"     # negation of whitespace
    :0
    *$ var ?? $NSPC
    {
        # match anything except space and tab
    }

    :0
    *$ ! var ?? ($SPC|$)
    {
        # match anything except space and tab and newline
    }
WSPCL ' #
= "
"'
*$ var ?? [$WSPCL]
SPCNL = "($SPC|$)"
If you absolutely need a range of characters, see if you have echo command in your system to define variables like this:
Therefore, a leading '!' must either be backslashed, enclosed in either parens or brackets (I suspect that parens would be more efficient), or prefaced with an empty pair of parens. I would recommend writing the condition with one of these:
[elijah] If you are inverting a character class, "first" means just after the (^). So the character class that contains everything but ] ^ and - must look like this:
[^]^-]
[david] What if I want a literal $ inside brackets? A $ inside brackets, unless it begins a variable name and the "$" modifier is on, always means a literal dollar sign. It cannot mean a newline if it appears inside brackets. A good way to keep it exempt from "$" interpretation is to put it last inside the brackets, because procmail knows that "$]" cannot possibly be a reference to a variable. (If you also need to include a literal hyphen and can't put the hyphen first, you'll need to escape the dollar sign with a backslash and put the hyphen last; well, you could alternatively escape the hyphen, I guess.) General guideline:

    ($) always matches a newline, with or without "$" interpretation;
    [$] always matches a dollar sign, with or without "$" interpretation;
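The guideline can be sketched with two conditions. The Subject texts and the KIND variable are made up for the example; neither condition uses the "$" modifier.

```procmail
# ($) is a newline: the Subject line ends in "done"
:0
* ^Subject:.*done($)
{ KIND = finished }

# [$] is a literal dollar sign: the Subject contains
# digits followed by "$", as in "25$"
:0
* ^Subject:.*[0-9]+[$]
{ KIND = price }
```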
    * rest of string .*
    * rest of string[ ]
    * (rest of string )
    * rest of string ()
    * rest of string( )
[philip] From my looking at the source, the last two should be equal in efficiency, and except for a trace difference in regcomp time, should match at the same speed as a solitary trailing blank. The character class version [ ] will be slower. Of course, I suspect that neither you nor your sysadmin will ever notice the difference in speed, and given that 99% of all systems are I/O bound and not CPU bound, the system is incredibly unlikely to notice either. I can't complain though, as I also go to various extremes to seek out every last bit of possible performance. Ah well. The first one would be slower yet, though perhaps no slower than the bracket form.
* ! B ?? \<word\>
[david] You have fallen into the leading backslash problem. If the first character of a regexp is a backslash, procmail takes it as "end of leading whitespace" and strips it. What you coded means "a less-than sign, then the word, then any non-word character." (It also prevents the less-than sign from being taken as a size operator.) Unless the non-word character immediately to the left of the word was a less-than sign, that regexp would fail (and thus the condition would pass). Try this:
* ! B ?? ()\<word\>
* ! B ?? \\<word\>
but in a casual reading it would look like "literal backslash, less-than sign, the word, word boundary character," so we on the list generally recommend the empty parentheses. Do note that the difference in meaning of \< and \> in procmail (where they must match a non-word character) from their meaning in perl and egrep (where they match the zero-width transition into and out of a word respectively) does not come into play here. Because procmail's \< and \> can match newlines (both real and putative), it rarely is a factor. It's a problem only when a single character has to serve both as the ending boundary of one word and also as the opening boundary of another. Well, it's also a problem when you have one as the last character to the right of \/, but that's easily solved.
Procmail does this when it reads in the head, not when it goes to search the body, so that cost can't be avoided. Let me repeat: searching the body is no slower than searching the header, apart from the minimal impact of the difference in their sizes.
    ^Subject:.*\<humor\>
    ^Subject:(.*\<)?humor\>
*, ?, and + in the absence of \/ are stingy rather than greedy, and that generally won't matter, but in the presence of \/ they are stingy to the left of \/ and greedy to the right of \/, while in most applications the leftmost wildcard on a line is the greediest and greed decreases from left to right.
[philip]...and this won't quite work. For a subject with a space after the tab, the '*' on the left hand side will be matched minimally (zero times), and then the stuff on the right hand side will be matched maximally, but starting at the space still, which will match nothing. This is a case where procmail's minimal matching can cause massive confusion and frustration. The solution is usually the following:
it'll work, because then the left hand side will have to match all the way up to the first digit (but not the digit itself). If you follow the rule in caps then you'll almost always be able to ignore procmail's weirdness in this area.
The right side was as greedy as it could be; the problem is that we seem to expect greed on the left as well. MATCH is set to null, contrary to our expectation. It is not a bug but rather a frequently misunderstood effect of the way extraction is advertised to operate.
Remember that only the right side is greedy; the left side is stingy, and left-side stinginess takes precedence over right-side greed.
Extraction is implemented this way: the entire expression, left and right, is pinned to the shortest possible match; then the division mark is placed and the right side is repinned to the longest possible match starting at the division. The tricky part is to remember that the division is marked during the stingy stage. If the expression is
^Subject:.*Keywords.*\/[0-9]*
<newline>Subject:<space>Keywords<space>9999<newline>
<newline>Subject:<space>Keywords
because ".*" and "[0-9]*" both match to null. Then the division mark is placed on the space after "Keywords" and procmail looks for the longest possible match to [0-9]* starting with that space. That, again, is null, so MATCH is set to null. We see that it works as expected if regexp is changed to this:
^Subject:.*Keywords.*\/[0-9]+
That is a whole other ball of wax. Now the shortest match to the entirety is
<newline>Subject:<space>Keywords<space>9
and the division mark is placed at the 9. Then procmail refigures the longest match to the right side starting at the division mark and sets MATCH=9999. However here
^Subject:.*Keywords\/.*[0-9]*
the second ".*" would have reached not just up to the digits but through them to the end of the line. MATCH would contain the whole rest of the line, all of it matched by ".*", plus a null match for "[0-9]*". [for curious reader] Given line
the second, which differs only by inserting the extraction marker, would not match and would not set $MATCH:
# matches ok # won't !
because the left side would be matched to "<newline>Subject: Keywords" and the immediately following text, " 9999", did not match the right side. It would actually make the condition fail and keep the recipe from executing. It took a lot of circuitous coding to allow for not knowing in advance exactly how many spaces there would be before the digits.
Call it counterintuitive, but it's not a bug. General advice: always make sure that the right side cannot match null or that the last element of the left side cannot match null. Or in other words: force the right-hand side of the \/ to match at least one character.
[david] \/ with nothing to the left of it means "one foreslash". To start a condition with the extraction operator, use ()\/ or \\/; the latter looks counterintuitively like "literal backslash and literal foreslash" (as it would mean if it appeared farther along in the regexp), so most of us prefer the former.
# ok, \/ in the middle # Wrong, when \/ is at the beginning # Now ok, () at the beginning
    # Body consists entirely of HTML code
    # something which'll match any message which has
    # "<HTML>" in the body

    :0 :
    *$ B ?? $s*<HTML>
    HTML.mbox
The condition test is applied to the entire body. If you want to limit it to match only against the beginning of the body, you have to say so using the ^^ token, as you discovered. A simple line anchor (^ or $) just says that there must be a newline (or the beginning or end of the area being searched) at that particular point in the text being matched. Notice the leading anchors below.
    # trap spam where the *very* first line of the body
    # started with <HTML>

    :0 :
    *$ B ?? ^^$s*<HTML>
    HTML.mbox
What, exactly, does "Anchor the expression at the very start of the search area..." mean, i.e. the ^^?
[dan] Technically, an opening ^^ anchors to the putative newline that procmail sees before the first character of the search area (and a closing ^^ anchors to the putative newline that procmail sees after the end of the search area). When the search area is B, that is a point equivalent to the second of the two adjacent newlines that enclose the empty line that marks the end of the head. The reason I'm bringing that up is this: if there are multiple empty or blank lines between the head and the body, ^^ will mark the start of the second of those lines, not the start of the first line of the body that contains some text. So if you want to test whether <pattern> is the first printing text in the body, even if it is not necessarily flush left on the very first line, you might need a condition like the following, where there is space/pipe/tab/pipe/dollar.
*$ B ?? ^^$SPCNL*<pattern>
* condition1 * condition2
condition1|condition2
Likewise, two exit code tests can often be ORed like this
* ? command1 || command2
But there are many situations where two tests cannot be ORed by combining them into one condition: a regexp search of one area ORed with a regexp search of a different area; a positive regexp search [i.e., for a match to its pattern] ORed with a negative regexp search [i.e., for the absence of any match to its pattern]; an exit code condition ORed with a regexp search condition; an exit code condition seeking success ORed with an exit code condition seeking failure; a size test ORed with anything else (even another size test).
How can I make OR conditions that all use the SAME action? I want to be able to test for a number of variants on certain requests, all in one block.
    CASE = ""

    :0
    * case 1 tests
    { CASE = 1 }

    :0 E
    * case 2 tests
    { CASE = 2 }

    :0
    * ! CASE ?? ^^^^
    {
        # real work, perhaps with explicit tests on CASE
    }
Case study: Finding text from header and body

[david] In addition to the standard ways of coding OR, here's a special one for searching the subject and the body for a given word in either:
* HB ?? ^^(.+$)*(Subject:(.*[^a-z0-9])?|$(.*\<)*)remove\>
If the string doesn't have to be preceded by a word border, it gets a little simpler:
* HB ?? ^^(.+$)*(Subject:.*|$(.|$))*string
We can now write the previous case study (HB ORing traditionally) with scores. I was tempted to write it like this, until [david] told me the following.
[david] That will work, but it isn't the best way to do ORing, because if a match is found to the first condition procmail still takes the trouble to test the second one. Better, use the supremum score on each condition:
    SUPREME = 9876543210

    *$ $SUPREME^0 first_condition_to_be_ORed
    *$ $SUPREME^0 second_condition_to_be_ORed
    * ... etc. ...
    *$ $SUPREME^0 last_condition_to_be_ORed
Upon reaching the supreme score, procmail will skip all remaining weighted conditions on the recipe, deeming them matched. Since all conditions on this recipe are weighted, once procmail finds one matched condition it will skip the rest and execute the action.
a or b
is the same as
not (not a and not b)
or mathematically:
a || b <=> !( !a && !b )
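The equivalence can be checked outside procmail with a small sh sketch. The functions a and b below are hypothetical stand-ins for two tests (they are not part of any recipe in this document); the loop tries all four combinations and prints the exit status of both forms, where 0 means "true" in sh.

```shell
#!/bin/sh
# Hypothetical stand-ins for two tests: each succeeds when its
# argument is the word "yes".
a() { [ "$1" = yes ]; }
b() { [ "$2" = yes ]; }

# Direct OR
direct()   { a "$@" || b "$@"; }

# De Morgan form: not (not a and not b)
demorgan() { ! { ! a "$@" && ! b "$@"; }; }

for x in yes no; do
    for y in yes no; do
        if direct "$x" "$y";   then r1=0; else r1=1; fi
        if demorgan "$x" "$y"; then r2=0; else r2=1; fi
        echo "a=$x b=$y direct=$r1 demorgan=$r2"
    done
done
```

Both columns agree on every line; only the "no no" combination yields status 1.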
7.0 Variables
7.1 Setting and unsetting variables
You have already set variables with the "=" syntax. Variable names are case sensitive: var is different from VAR
# directory # literal
VAR = 1 VAR = $FOO # another. VAR = "$VAR at" # combined with previous value
# kill variable. # same, but with old style # Variable is said to be "null" now
And you can put multiple assignments on the same line, although this is not recommended:

    VAR=1 VAR=2 VAR=3
Examine the following, which are all equivalent. The back ticks will not require a shell in the absence of any SHELLMETAS so neither of these will spawn a shell
    :0
    * condition
    { VAR = `cat file` }

    # case3: oldish, and procmail specific and errors have
    # been reported if you use this construct. Note: There
    # must be no space in "VAR=|"
VAR = ${VAR:-"yes"}
VAR = ${VAR+"yes"}
VAR = ${VAR:-`date`}
No, procmail is smart enough to skip calling date if VAR already had value. It doesn't evaluate the whole line. Below you see what each initialising operator does. Study it carefully
= = = = = = = =
# Note these: VAR = "val" VAR = ${VAR:+"value3"} VAR = "val" VAR = ${VAR+"value4"}
VAR VAR = ${VAR:-"value1"} VAR VAR = ${VAR-"value2"} VAR VAR = ${VAR:+"value3"} VAR VAR = ${VAR+"value4"}
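Since these operators are borrowed from sh, their effect can be explored in a plain shell. The sketch below is only an sh illustration, on the assumption that procmail follows the same rules; procmail implements the expansion itself, so a dry run of the real recipe is still the final check. The helper name show is made up for this example.

```shell
#!/bin/sh
# show LABEL: print what each of the four operators yields for the
# current state of VAR ("d" is the alternative value in every case).
show() {
    echo "$1: :-=[${VAR:-d}] -=[${VAR-d}] :+=[${VAR:+d}] +=[${VAR+d}]"
}

unset VAR;  show unset   # unset: :-=[d] -=[d] :+=[] +=[]
VAR="";     show null    # null: :-=[d] -=[] :+=[] +=[d]
VAR="val";  show set     # set: :-=[val] -=[val] :+=[d] +=[d]
```

The forms with a colon treat a null value like an unset one; the forms without a colon only ask whether the variable is set at all.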
And if you want to choose from several initial values, you might use the recipe below instead of the standard var = ${var:-"value"}.
    :0
    * VAR ?? ^^^^
    {
        # no value (or was empty), set default value here based on
        # some guesses

        VAR = "base-default"

        :0
        * condition
        { VAR = "another-default" }

        ...more conditions..
    }
You could also use an equivalent, but less readable, condition line in the previous recipe:
*$ ${VAR:+!}
* !
Where "!" is the procmail "false" operation. One more way to do the same is to require that at least one character be present. You could also use the regexp (.), which requires at least one character to be present, but you might not like it matching pure spaces.
* ! VAR ?? [a-z]
* ! TEST_FLAG ?? yes
* TEST_FLAG ?? no
Using literal strings like "yes" and "no" might show more clearly what is going on than a traditional "!" negation of a test. Note that the following fails if the variable is unset or null.
* variable ?? (.)
*$ variable ?? $NSPC
Or
* variable ?? (.|$)
to require that variable contain at least one character. But neither is a way to check whether a variable is set or not, because each treats a null variable the same as an unset one. This is the best:
*$ ! ${VAR+!}
[<gsutter A T pobox.com>] Here is yet another way to test if variable is set and if it isn't, sets it to a default value.
[era] Note that this is slightly inexact; Procmail will backslash-escape according to Procmail's needs, not sed's. For example, Procmail doesn't think braces are magic (although that would be nice to have in Procmail as well) whereas many modern variants of sed do.
# Erm, this is ok, but many procmail recipe writers want to # take extra precautions and include the regexps in parentheses. # So, maybe (yabba|dabba|doo) would be more safe REGEXP * *$ = "yabba|dabba|doo" # Hey, you need the "*$ Subject..." # surely you meant '* REGEXP ?? hello'
# won't extrapolate $VAR; you get literal # extrapolates to: hey 'you'
Don't let these many quotes disturb you, just count the beginning and ending quotes. Superfluous here, but you may need some similar construct somewhere else.
Procmail translates ! into | "$SENDMAIL" "$SENDMAILFLAGS" as the procmailrc(5) man page warns us. By the rules of sh quoting, that means that shell sees only the first switch
% sendmail -oQ/var/mqueue.incoming
My suggestion: since you need a soft space inside $SENDMAILFLAGS, use the quotes when you define $SENDMAILFLAGS but do this instead of using the ! operator for forwarding:
[Walter Haidinger <walter.haidinger A T gmx.net>] Here's yet another approach: deliver messages from procmail directly to mailboxes in all those users' homes. No sendmail involved, much lower loads.
[philip] Assuming that "someuser" is an actual user in the password file (I haven't been following this thread, so maybe that isn't true here), then the following is probably better:
Walter Haidinger comments on this recipe: I'm happy to announce that this works really well. No harm is done to the system-load anymore. What a relief!
That lets procmail's very tricky "screenmailbox()" routine take care of bogus mailboxes in a secure fashion.
Is that as safe as forwarding? Does another sendmail delivering to /var/spool/mail/someuser use the same locking mechanism and notice that mailbox is already locked? I don't want to risk a corrupt mailbox.
[philip] Sendmail only delivers directly to files through aliases that say things like:
whatever: /some/local/file
Under normal circumstances, sendmail calls the local mailer to actually store mail in a file, and since that's procmail (right?), there shouldn't be a problem. Also, sendmail 8 does kernel-level locking when it delivers directly.
There is also another way. If your script can access environment variables (almost all programs can), then you do not need to pass the variables on the command line. Above, the SUBJECT is already in the environment and in Perl you can get it with:
$SUBJECT = $ENV{SUBJECT};
Next, do you know what is the difference between these two recipes?
You guessed it. The first one quotes the entire command and does not do the right thing; the latter is correct, depending on the content of the argN variables. Anyway, play safe and always add quotes.
Sometimes you need trickier quoting to get single quotes around the arg. Pay attention to this, because this may be the reason why your grep command doesn't seem to succeed as you expect.
    var = `command`         # capture STDOUT
But if a program modifies the body and exports some status information, it is trickier. We assume here that the script is controlled by you and that you have added an --export-status option, which causes the program to print information to a separate file.
    LOCKFILE  = $HOME/.run$LOCKEXT
    valueFile = $HOME/tmp/values

    #   modify body, and export status values to external file:
    #   one value in every line
    #
    #       VALUE1
    #       VALUE2
    #       VALUE3

    :0 fb
    | $NICE script.pl --export-status $valueFile

    values = `cat $valueFile`

    #   Derive values from each line

    :0                                  # line 1
    *$ values ?? ^^\/[^$NL]+
    { var1 = $MATCH }

    :0                                  # line 2
    *$ values ?? ^^.*$\/[^$NL]+
    { var2 = $MATCH }

    :0                                  # line 3
    *$ values ?? ^^.*$.*$\/[^$NL]+
    { var3 = $MATCH }
[richard] Alternatively write valueFile from your rc or external program with lines like
    PARAM1="value for param 1"
    PARAM2="value for param 2"
    PARAM3="value for param 3"
INCLUDERC $valueFile
Now there is no need to worry about synchronizing the read with the lines, or about adding new parameters, since each is labeled in valueFile.
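As a rough sketch of the writing side, an external sh script could produce such a labeled file with printf. The file location and the values are placeholders, and the final ". file" step is only an sh stand-in for INCLUDERC so that the result can be inspected without procmail. Values that themselves contain double quotes, dollar signs or back ticks would need extra escaping, which this sketch does not attempt.

```shell
#!/bin/sh
# Hypothetical location; in the recipe above this would be $valueFile.
valueFile="${TMPDIR:-/tmp}/values.$$"

# Write one labeled assignment per line.
{
    printf 'PARAM1="%s"\n' "value for param 1"
    printf 'PARAM2="%s"\n' "value for param 2"
    printf 'PARAM3="%s"\n' "value for param 3"
} > "$valueFile"

cat "$valueFile"

# Stand-in for INCLUDERC: sh can read the same NAME="value" syntax.
. "$valueFile"
echo "$PARAM2"

rm -f "$valueFile"
```

Because every line carries its own name, the reader no longer depends on the order of the lines.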
    :0
    *$ $VAR ^0
    *$ $N   ^0
    { }                 # procmail no-op

    VAR = $=
[idea by era] it's getting slightly cumbersome if it's between MIN and MAX:
    :0
    *$  $SCORE ^0
    *$ -$MIN   ^0
    {
Assigning "SCORE=4" Score: 4 4 "" Score: -1 3 "" Assigning "dummy" Score: -4 -4 "" Score: 5 1 "" Assigning "suitable"
# now 34567890
but deleting 2 characters from the end is nearly impossible without forking an outside process. The cheapest might be expr because it doesn't need a shell to pipe echo to it (as sed would and I believe perl would):
    #   by resetting the shellmetas, this will only call `expr'.
    #   If we wouldn't have fiddled with shellmetas, this would
    #   have called two processes: sh + expr

    saved      = $SHELLMETAS
    SHELLMETAS
    VAR        = `expr "$VAR" : '\(.*\)..'`     # now 12345678
    SHELLMETAS = $saved
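The expr pattern can be tried on its own in a shell before it goes into a recipe. This is just the sh side of the idea, using the ten-digit sample value from the text:

```shell
#!/bin/sh
VAR=1234567890

# \(.*\)..  captures everything except the final two characters;
# expr prints the captured part.
VAR=$(expr "$VAR" : '\(.*\)..')

echo "$VAR"     # 12345678
```

Keep in mind that expr exits with a non-zero status when the captured part is empty or evaluates to 0, so the exit code is not a reliable success indicator for short or all-zero strings.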
    #   semicolon to force invoking a shell, actually first
    #   question mark will force a shell already.

    saved = $SHELL
    SHELL = /bin/sh
    VAR   = `echo ${VAR%??} ;`
    SHELL = $saved
Now, if you know that the last two characters will be "90", that's different. Of course, this totally screws up if the third-to-last character is a 9.
# now 12345678
[jari] Comments: If a shell must be used, then awk is a good tool for simple string manipulation. Its startup time is faster than perl's, whose overhead is due to internal compilation. awk also consumes fewer resources overall than perl. Following will only work if VAR is a string of continuous block of
    saved      = $SHELLMETAS
    SHELLMETAS

    VAR = ` awk 'BEGIN{ v = ARGV[1];                            \
                        print substr(v,1,length(v)-2); exit     \
                      }' "$VAR"                                 \
          `
    SHELLMETAS = $saved
This version requires some file, any file, so that we get awk started. In the previous code all the work was done in the BEGIN block and no file was ever opened.
    saved      = $SHELLMETAS
    SHELLMETAS

    VAR = ` awk '{print substr(v,1,length(v)-2); exit }'        \
                v="$VAR" /etc/passwd                            \
          `
    SHELLMETAS = $saved
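Both awk variants can be compared side by side in a shell. In this sketch the second variant reads one empty line from standard input (the "-" operand) instead of /etc/passwd; that substitution is an assumption made for the example, chosen only so that the test does not depend on any particular file existing.

```shell
#!/bin/sh
VAR=1234567890

# Variant 1: all work happens in BEGIN; the value arrives as ARGV[1]
# and no input is ever opened because of the early exit.
a=$(awk 'BEGIN { v = ARGV[1]; print substr(v, 1, length(v) - 2); exit }' "$VAR")

# Variant 2: the value arrives as a var=value operand; the main rule
# needs one input record to fire, supplied here on stdin.
b=$(echo | awk '{ print substr(v, 1, length(v) - 2); exit }' v="$VAR" -)

echo "$a $b"    # 12345678 12345678
```

Note that awk interprets backslash escapes in a var=value operand, so the first variant is the safer one for values that may contain backslashes.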
[dan] comments awk: expr is sure to be a smaller binary than awk for procmail to fork, and it needs much less command-line code to do this job. Note also that one still has to diddle with SHELLMETAS to avoid a shell, because the awk code contains brackets; thus it doesn't replace all.

There is also a way to remove words from the end of a string by procmail means if the strings are separated by the same separator. Let's use the word this-mailing-list-request which we would like to shorten to this-mailing-list. [david] presented the recipe 1998-06-16 in PM-L.
    VAR = "this-mailing-list"

    #   1) if there is match at the end ending to these words
    #   2) Get everything up till last match and store it to MATCH
    #   3) Read MATCH, but exclude last dash "-"
VAR RC_APPEND
# Get the longest match that does not end in the TAIL character :0 *$ VAR ?? ()\/.*[^$TAIL] { HEAD = $MATCH # now "Testing 012301230" # # if the last two or more characters in VAR are identical, they all get chopped, oops
:0 * -1^0 * 1^1 VAR ?? (.) * -1^1 HEAD ?? (.) { dummy = "tooshort" INCLUDERC = $RC_APPEND } } } result = $HEAD # "Testing 01230123011"
    # ........................................ pm-myappend.rc
    # LENGTH(HEAD) plus 1 SHOULD equal LENGTH(VAR). That is
    # not the case when the last 2 (or more) ending
    # characters are identical. in that case, call appendrc
    # recursively to stick back an appropriate number of
    # TAIL characters.

    :0
    * -1^0
    *  1^1 VAR  ?? (.)
    * -1^1 HEAD ?? (.)
    {
        HEAD      = "$HEAD$TAIL"
        INCLUDERC = $RC_APPEND
    }
    ...

    :0              # Stop if previous cases filed the message
    *$ $FILED
    { HOST = "_done_" }
# kill variable
# Or
${LASTFOLDER+!}!
[david] An unescaped dollar sign later in the line represents a newline, so what you have there is searching for the following:

    1. An expression that matches the expansion of the ^TO token (which is anchored to the start of a line by its definition), followed by
    2. A newline, followed at the start of the next line by
    3. "foo@bar" [the backslash escapes the f, which didn't need escaping], followed by
    4. any character that is not a newline (the period is unescaped), and finally
    5. "com".

Try this instead:
*$ ^TO()$\foo@example\.com
#todo: the dollar seems exactly the same in the above two #todo Examples: are you sure that this is correct?
In fact, to avoid matches to things like [email protected], you might want to do it this way:
*$ ^TO()$\foo@example\.com\>
Your procmail runs /etc/procmailrc when it starts, please check that. It may define some common variables already for all users.
    # Code by [era]
    #
    COMMAND='while read url; do
        case "$url" in
            *://*) lynx -traversal -realm -crawl -number_links "$url" |
                   $SENDMAIL $LOGNAME ;;
        esac
    done'

    # Notice the trailing semicolon after `eval' !

    :0 bw
    * ^Subject: xxxxx
    | eval "$COMMAND" ;
If you want to run the code inside the nested block, then look carefully, there are double quotes around the command in back ticks. If you leave double quotes out, then each word in SH_CMD would be interpreted separately:
    SH_CMD = '$echo "$VAR" >> $HOME/test.tmp'

    :0
    * condition
    {
        # condition satisfied; run the given shell command
        # and do something more.

        dummy = `"$SH_CMD"`

        ..rest of the code..
    }
    MESSAGE='Thank you so much for your message.
    Unfortunately, the volume of mail I receive .... (blah blah blah).
    If your matter is urgent, try calling +358-50-524-0965.
    '

    :0 hw
    * ! ^X-Loop: moo$
    | ($FORMAIL -r -A "$MYXLOOP"; echo "$MESSAGE") | $SENDMAIL
    HEADER = `$FORMAIL -X ""`   # The space after the X is vital.
    HEADER = `sed /^$/q`        # also writable as HEADER=`sed /./!q`

    :0 h
    HEADER=|cat -
will save the entire header into one variable. It has to be smaller than LINEBUF, though. This way might work as well, and will require no outside processes if it does:
But if you don't know the word or string beforehand, then this is the generalized way: [idea by era and david]
Multiple echo commands that spread over many lines can be converted to a single echo command if the \n escape is supported. You usually see these in auto responders.
    echo ".........";   \
    echo ".........";   \
    echo ".........";

    -->

    echo ".........\n"  \
         ".........\n"  \
         ".........\n";
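Where it is uncertain whether the local echo understands \n, printf is a possible substitute: it prints its format once per argument, so one call can emit several lines. The function name reply and the message text below are invented for this sketch.

```shell
#!/bin/sh
# reply: print a three-line canned answer with a single printf call.
# '%s\n' is reused for each argument, giving one output line apiece.
reply() {
    printf '%s\n' \
        "Thank you for your message." \
        "I am away and will answer later." \
        "Regards"
}

reply
```

In a recipe, the printf call would take the place of the chained echo commands inside the action line's pipeline.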
You can avoid multiple and possibly expensive FROM_DAEMON tests by caching the result at the top of your .procmailrc. You can now use variable $from_daemon like the big brother FROM_DAEMON. The same idea can be applied to the FROM_MAILER regexp. If you have pmjavar.rc, it already defines variables $from_daemon and $from_mailer exactly like here:
Count the back ticks and you know how many shell calls procmail has to launch. See if you can minimize them and use some procmail code instead.

^TO and other macros are expensive; see if you can use a simple Header:.*\<match-it\> instead. Well, it's not clear if this gives you much speed advantage.

Don't call "$FORMAIL -xHeader:" every time you need a header value; consider if it suffices to use the match operator \/.

You can minimize the calls to only one formail if you add many headers along the way: See formail usage tips in this document.

Searching body is expensive, simply because it contains more text. There isn't much to do about this, because you use B anyway when you need it.

See if you can move some tasks to your .cron file. procmailrc is not meant for those purposes. Instead of calculating daily values every time in procmail, let cron do that at 04:00 or 21:00. Don't run cron at midnight if you can, because everybody else is running their crons at the same time. If "logical" date change time can be used (when you arrive to work, when you leave the work), use it in cron jobs.

[philip] Setting LINEBUF permanently to a big value slows procmail down.

Remove all calls to perl and use programs that are nicer to the system (If you just call command line perl, there is probably an equivalent alternative with awk tr sed cut).

Examine each shell command and see if you do need SHELLMETAS. If you can set SHELLMETAS to empty, this saves calling "sh" for each invocation of the external command.
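Counting the back ticks can be done with a short sh helper. The function name count_backtick_pairs and the sample file are made up for this sketch; the count is only a rough upper bound, since it also includes back ticks inside comments and says nothing about whether a given command needs a shell.

```shell
#!/bin/sh
# count_backtick_pairs FILE: print the number of `...` pairs in FILE.
count_backtick_pairs() {
    # Delete every character except the back tick, then count what is left.
    n=$(tr -cd '`' < "$1" | wc -c | tr -d ' ')
    echo $((n / 2))
}

# Small sample rc file, for demonstration only.
sample="${TMPDIR:-/tmp}/rc-sample.$$"
cat > "$sample" <<'EOF'
FROM    = `formail -zxFrom:`
SUBJECT = `formail -zxSubject:`
EOF

count_backtick_pairs "$sample"      # 2
rm -f "$sample"
```

Run against a real file as, for instance, count_backtick_pairs "$HOME/.procmailrc".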
% ls /usr/local/lib/procmail-3.11pre7/examples/
Or if you're really anxious to get on your own, try this. The directory /opt/local is for HP-UX 10 machines and the forward contains example how to define your .forward for procmail.
If the find succeeded and found the file, then you know where the procmail files installation directory is.
# -m merges all error messages into a single line % mailstat -km procmail.log
[philip] Under the most likely configuration of sendmail in this situation, it is impossible to have procmail invoked by sendmail on the shell machine: sendmail is probably set to just forward all mail to the designated mail delivery machine. There are other options: you could temporarily store the mail in your account, then have a cron job on the shell machine that reprocesses the message. That would probably be more efficient than having each message trigger an rsh to the shell machine. If you actually get enough spam that it's pushing against your quota, then the rsh is too expensive; use a cron job that invokes something like:
    cd your-maildir && lockfile spam.lock && test -s spam && { cat spam >> /tmp/spam.box && rm -f spam spam.lock || \
        rm -f spam.lock; }
WARNING: the above assumes the following:

    1. everything in your-maildir/spam is spam and belongs in /tmp/spam.box
    2. no further filtering of the messages is necessary: they just need to be moved (it actually treats everything in the your-maildir/spam as a single message and uses procmail as a reliable copy command, thus the DEFAULT assignment and the use of /dev/null as an empty procmailrc)
    3. /tmp/spam.box is not a directory

If the latter two of those conditions aren't true OR IF THEY MIGHT CHANGE then you should use formail -s to break the message apart and invoke procmail on each one separately.

[era] Many sites cross-mount directories for various reasons. /tmp is always local but /var/tmp might be cross-mounted between the login host and the mail host; another one to try is /scratch and if all else fails, ask your admin to set up an NFS share for this purpose.
The skipping of whitespace at the beginning of the message is of course not necessary. You should probably set LINEBUF reasonably high if you grab many lines, say 30: 80*30 = 2400 bytes; probably setting it to 8192 or 16384 is a good idea, depending how much you want to match. The above gets ugly quickly, so
\ \
I started out with spam.rc from "ariel" which got me into the habit of
although I note that spam.rc did have one recipe using the echo method. What are the reasons for choosing each method over the other?
Here is a comparison table. Choose the one you think is best for you.

Echos don't have dependency on an external file: everything is contained in the .procmailrc file. Echos keep all the relevant stuff in one file. Cats make you maintain multiple files. That's the main reason I lean toward echos; you may have accounts on several machines. It is easier to be able to copy just one generic .procmailrc between them without having to copy a bunch of messages also. Mostly, though, there's no real difference between the two methods.

Echo is easier to use with variables.

Echo starts many processes, cat only starts one, but this is not always true: in most current Bourne shell implementations, echo is a built-in. This holds true with tcsh too.

The main problem I see with the use of cat is "what happens when you forget the file or destroy it?". I suggest to, at least, test that the file is readable before catting it.

[richard] An argument against echo is that it is not well standardized, and different versions may exist on the same machine. Some recognize -n, some don't; some recognize embedded metacharacters, some don't. This is an argument in favor of print. Print, however, is not a built-in on all systems. The comment on built-ins is pertinent to situations when a shell is spawned. When procmail handles the call directly, it will always look for a stand-alone executable. I guess echo may be better, as long as we are aware of any differences in behavior between built-in and stand-alone versions.
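The suggestion to test that the file is readable before catting it could look roughly like this in sh. The function name send_text, the message file name and the fallback wording are all placeholders for this sketch; printf is used for the fallback so that the result does not depend on which echo variant is installed.

```shell
#!/bin/sh
# send_text FILE FALLBACK: print FILE if it is readable,
# otherwise print the one-line fallback text.
send_text() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        printf '%s\n' "$2"
    fi
}

# Hypothetical message file; the fallback is used when it is missing.
send_text "$HOME/.vacation.msg" "I am away; your mail has been received."
```

This keeps the convenience of a separate message file while avoiding an empty reply if the file has been lost.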
    # case 1: print to BiffLog

    dummy = `echo "message: $FROM $SUBJECT" >> $biff`
[david] Problems: you get no locking on the destination file, and unless you put it inside braces you have to run it on every message unconditionally. (Also procmail tries to feed the whole message to a command that won't read it, but the remedies for that don't help very much.)
    # case 2: We consume delivering recipe and therefore have to use
    # `c' flag.

    :0 whic:
    | echo "message: $FROM $SUBJECT" >> $biff
Here it locks the destination file and you can add conditions to it, so it's probably the best. If the head or the body is less than one bufferful, you can limit the unnecessarily written data with h or b, but I think that in most OSes a partial buffer and a full one are the same amount of effort.
    # case 3: We use side effect of "?" here. Cool, but this
    # doesn't do $biff file locking thus message order may
    # not be what you expect.

    :0
    * condition
    * ? echo message: $FROM $SUBJECT >> $biff
    { }     # procmail no-op
We have conditions possible, but there is no locking on the destination file. I'd go with method #2 or a variation thereof:
    :0 hic:         # we don't necessarily need `w'
    * condition
    | echo message: $FROM $SUBJECT >> $biff

    :0 hi:          # Or you could use this
    * condition
    dummy=| echo message: $FROM $SUBJECT >> $biff
[jari] Now, when [david] has explained how various ways differ from each other, I present the recipe where I used the case 3. When I was dropping a message to a folder, I wanted to send a message to my biff log too. The idea is that the drop-conditions have already matched and then we run extra command by using side effect of "?" token. As far as the recipe is concerned, the "?" is a no-op. The pedantic way would have been to add the LOCKFILE around to the recipe, but imagine 50 similar recipes like this...and you understand why the LOCKFILE was left out. It's only necessary if you worry about sequential writing to the biff file.
* ? misbehaving-shell-script || true
The more complex case is a script that can return either success or failure but you don't care which; if the drop conditions passed, you want to run the action line. echo can also fail if the process lacks permission or opportunity to write to stdout. A more reliable choice is true(1); its purpose in life is to do nothing but exit with status 0. The command : is a shell built-in which always returns true status. It is not exactly more readable than true(1), but "|| :" will save the invocation of true (unless true is built into $SHELL), although procmail will still run a shell. On the other hand, as long as the command itself has no characters from SHELLMETAS, a weight of 1^1 and no "|| anything" will avoid the shell process as well. However, there is yet a better way to make sure that a failure by the script doesn't make procmail abort the recipe:
Regardless of the exit status of the script, the condition will score 1 and not interfere with procmail's decision about the action line of the recipe. Weighted exit code conditions behave like this (see the procmailsc(5) man page):
* w^x ? command
* w^x ! ? command
* w^x pattern_that_appears_in_the_search_area_$?_times
= = = = =
"/bin/egrep" "/bin/sed" "/usr/bin/tr" $HOME/procmail/spam-regexp.lst `$TR '\n' '|' < $kwdfile | $SED -e "s/[ \t]+|/|/ ; s/|+$//" `
It is a little easier to check the sender's address against a whitelist, because it is possible to use "word" based checking in contrast to the regular expression checking above. Supposing that file contains known email addresses listed one per line, the recipe would be:
    file         = $HOME/procmail/spam.keywords
    searchFields = "-xSender: -xFrom -xFrom: -xReturn-Path: -xReply-To:"

    :0 w
    *$ ? $FORMAIL $searchFields | $EGREP --quiet --ignore-case --file='$file'
    {
        # This sender is known
    }
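The grep half of such a check can be tried separately in a shell. The sketch below uses the short POSIX options rather than the long GNU ones, and adds -F so that the listed addresses are treated as fixed strings instead of regular expressions; that choice, the function name is_known and the sample addresses are assumptions of this example, not part of the recipe above.

```shell
#!/bin/sh
# is_known TEXT FILE: succeed if TEXT contains any address listed in FILE.
#   -F  patterns are fixed strings     -i  ignore case
#   -q  no output, exit status only    -f  read patterns from FILE
is_known() {
    printf '%s\n' "$1" | grep -F -i -q -f "$2"
}

# Sample whitelist, one address per line.
list="${TMPDIR:-/tmp}/white.$$"
printf '%s\n' 'friend@example.org' 'boss@example.com' > "$list"

if is_known 'Friend <FRIEND@example.org>' "$list"; then
    echo known
else
    echo unknown
fi

rm -f "$list"
```

One pitfall worth knowing: an empty line in the pattern file is an empty pattern, which matches every input line, so the list should be kept free of blank lines.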
A word of caution: white list or black list based sender matching does not work 100%. Spammers hijack large numbers of other people's email addresses, which they ruthlessly use to identify the message's sender. It is no surprise to receive Unsolicited Bulk Email from a friend: he is not the real sender, but his address has drifted into a spammer's email database.

8.11 Using dates efficiently
Note: See module list, where you will find date and time parsing modules. You can also parse the date from the first Received or From_ header if it is the same each time in your system. That would be orders of magnitude faster and decreases your system load if you receive a lot of mail.
Calling date in your procmail script many times is not a good idea. Use the MATCH as much as possible to be efficient in procmail, like below where we call date only once. If you are not in the same time zone as your server, and you want an accurate report of the date, you might amend the invocation to the following:
    # By [richard] add %H:%M%S if you want these as well

    :0
    * date ?? ^^()\/....
    { YYYY = $MATCH }

    :0
    * date ?? ^^..\/..
    { YY = $MATCH }

    :0
    * date ?? ^^.....\/..
    { MM = $MATCH }

    :0
    * date ?? ()\/..^^
    { DD = $MATCH }

    TODAY = "$YYYY-$MM-$DD"     # ISO std date: like 1997-12-01
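The same "call date once, then slice the result" idea can be expressed in sh with parameter expansion, which needs no further processes after the single date call. The sketch assumes the date string has the form YYYY-MM-DD (as produced by date +%Y-%m-%d); the function name split_date is made up for the example.

```shell
#!/bin/sh
# split_date YYYY-MM-DD: set YYYY, YY, MM and DD from one date string.
split_date() {
    YYYY=${1%%-*}       # everything before the first dash
    rest=${1#*-}        # everything after the first dash: MM-DD
    MM=${rest%%-*}      # month
    DD=${rest#*-}       # day
    YY=${YYYY#??}       # year without the century digits
}

split_date "$(date +%Y-%m-%d)"      # the only external call
TODAY="$YYYY-$MM-$DD"
echo "$TODAY"
```

For the sample date 1997-12-01 this gives YYYY=1997, YY=97, MM=12 and DD=01.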
    # this requires that HH and MM have been setup before,
    # see pm-jadate.rc

    NOW     = "$HH:$MM"                 # the time only
    TODAY   = "$YY-$MM-$DD $NOW"        # ISO 8601: date and time
    NULL    = $SPOOL/junk.null.spool    # /dev/null is dangerous
    BIFF    = $PMSRC/pm-biff.log

    # If you prefer a log per day (easy for cleanup):
    # BIFF = $PMSRC/pm-biff.log.$YYYY$MM$DD

    # .............................................. headers ...
    # DON'T USE THESE: they call shell
    #
    # FROM    = `$FORMAIL -zxFrom:`
    # SUBJECT = `$FORMAIL -zxSubject:`

    :0                          # Use procmail match feature
    * ^From:\/.*
    { FROM = "$MATCH" }

    :0                          # Use procmail match feature
    * ^Subject:\/.*
    { SUBJECT = "$MATCH" }

    # ............................................. incoming ...
    # record log of incoming mail

    :0 hwic:
    | echo "$TODAY $FROM $SUBJECT" >> $BIFF

    # ......................................... null recipe ...
    # Now, this is how you add the "message" what happened
    # to that mail. See "?" shell call in the recipe

    :0 :
    * From:.*(remove|delete|free|friend@)
    * ? echo " [null-AddrReject]" >> $BIFF
    $NULL
This will compress each message as it comes in (and since most are TEXT, it does a fine job - MIME, OTOH is one of the best ways to mailbomb someone since it doesn't compress well - but the indirect bombing via mailing lists doesn't do this), reducing the disk space required, usually dramatically. Done in conjunction with something like the following at the end of your .procmailrc, you could have a header file you could quickly rummage through looking for valid messages to add to a procmail recipe, then run:
(note that if the recipe delivers into the mail.mbox.gz file on any condition, then you should look to MOVE the file before running this process, and use the moved version. In fact, this would be a good idea anyway, as newly delivered mail may appear in the end of the gzip file while you're doing this and since your ultimate goal is to be able to eliminate junk, you'll want to know that after you've processed a gzipped mail file, you can delete it without accidentally whacking new mail).
    :0
    * LASTFOLDER ?? ^^^^
    {
        # Save the message in case we need to retrieve it.
        :0 c:
        |gzip -9fc >> $MAILDIR/mail.mbox.gz

        # copy headers for easy browsing - including being able to
        # identify lists you're being subscribed to.
        :0 h:
        header.log
    }
    #   instead of leading dot file, you may prefer
    #   stopFile = $HOME/procmailrc.stop which shows up in default ls.
    #   On the other hand you can do ls ~/.procmail* to see both...

    stopFile = $HOME/.procmailrc.stop

    :0
    *$ $IS_EXIST $stopFile
    {
        EXITCODE = $EX_TEMPFAIL     # Means: retry later; requeue
        HOST     = "_stopped_by_external_request_"
    }
Then, when you are testing your procmailrc and disaster happens, you can simply do the following to disable your procmailrc filtering.
% touch $HOME/.procmailrc.stop
[richard] This is also a candidate recipe for including in an INCLUDERC. Combining the two ideas, we have a file procmailrc.stop which contains the recipe and is included near the top of .procmailrc. When you don't want it, mv it to procmailrc.go. Procmail complains about missing INCLUDERCs, but it does not complain about them if they exist and are empty. That is another reason not to use dotted file names, and to use cp instead of mv.
10.0 Scoring
10.1 Using scores by an example
First make all the needed matches and let the SCORE value be set. Examine the score after the final value has been calculated. The condition lines say:

- Start with some threshold: -250.
- Read the subject into MATCH.
- Add 50 for each match of !. Notice the "^1": if it read "^0", only one 50 would be added for "!!!!"; now that counts as 4 x 50 = 200. See procmailsc(5) for the "^N" syntax.
- Any dollar sign is likely spam.
- Find uninteresting subject words.
- And a negative count for replies. Usually spam doesn't seem to have Re: in the subject field (but don't rely on this, spammers have started to use "re:").
- Runs of letters such as !!! found in the body are usually an indication of spam. Add 100 when one is found (with "^0" it is counted only once).
    # Idea by 26 Sep 97 Stephane Bortzmeyer <bortzmeyer A T pasteur.fr>

    :0
    *  -250^0
    *         ^Subject:\/.+$
    *    50^1 MATCH ?? [!]
    *    50^1 MATCH ?? [$]
    *   100^1 MATCH ?? ()\<(free|sex|opportunity|money|great)\>
    *  -250^0 ^Subject: *(Fwd|Fw|re):
    *   100^0 B ?? ()!!!
    { }                             # official procmail no-op

    SCORE = $=                      # Score has been calculated
procmail scores it this way: ! was found 4 times (200/weight 50), "free|sex..." regexp matched 4 times (400/weight 100).
                        condition   Total sum
                        score       so far
    ---------------------------------------------------------------
    procmail: Score:    -250        -250 ""
    procmail: Score:     200         -50 "[!]"
    procmail: Score:       0         -50 "[$]"
    procmail: Score:     400         350 "^Subject:.*\<free|sex|... >"
    procmail: Score:       0         350 "^Subject: *(Fwd|Fw|re):"
    procmail: Score:       0         350 ! ""
    procmail: Assigning "SCORE=350"
* 100^1 ^Subject:.*\<(free|sex|opportunity|money|great)\>
That condition says to score 100 for every subject line that contains any of those five words ... not to score 100 for every one of those words in the subject, but 100 for every subject line that contains any of those words. So it will never score more than 100 unless there are multiple subject lines. You see, it offers five alternative regexps:
Offhand, I think regexp below would score 400: 100 for "Subject.*free" and 100 for "sex" etc. Of course, the score might be higher if other lines in the header included the strings "sex", "opportunity", "money", or "great<word border>", but appearances of "<word border>free" outside the subject wouldn't be counted.
And this one would score 400 too. How? MATCH would contain the whole subject and there would be non-overlapping matches to " great ", " opportunity ", and " free ". If we got rid of either or both of the word-border marks, it would score 500.
    Subject: Great opportunity for free sex; no money required!!!!

    * 100^1 MATCH ?? ()\<(free|sex|money|opportunity|great)\>
    VERBOSE = "yes"

    :0
    *  1^1 foo
    * -2^2 bar
    { }
    a = $=

    :0
    *  1^1 foo
    * -2^2 bar
    {
        :0 f
        | echo Whee: fun ; cat
    }
    b = $=

    :0
    *  1^1 foo
    * -2^2 bar
    { whee = "fun" }
    c = $=

    :0 h
    /dev/null
    procmail: [20175] Fri Sep 26 10:25:23 1997
    procmail: Score:       3       3 "foo"
    procmail: Score:      -6      -3 "bar"
    procmail: Assigning "a=-3"
    procmail: Score:       3       3 "foo"
    procmail: Score:      -6      -3 "bar"
    procmail: Assigning "b=0"
    procmail: Score:       3       3 "foo"
    procmail: Score:      -6      -3 "bar"
    procmail: Assigning "c=-3"
    procmail: Assigning "LASTFOLDER=/dev/null"
    procmail: Opening "/dev/null"
    From foo Fooof
      Folder: /dev/null                                  46
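The logged numbers follow from the weighting rule in procmailsc(5): for a condition written w^x, the first match adds w, the second adds w*x, the third w*x*x, and so on. As a worked check, the scores 3 and -6 imply that the test message contained "foo" three times and "bar" twice (this is inferred from the log, the test message itself is not shown):

```latex
\[
  \mathrm{score}(w, x, n) \;=\; \sum_{k=1}^{n} w\,x^{k-1}
\]
\[
  \underbrace{1 + 1 + 1}_{w=1,\;x=1,\;n=3} = 3, \qquad
  \underbrace{(-2) + (-4)}_{w=-2,\;x=2,\;n=2} = -6, \qquad
  3 + (-6) = -3
\]
```

The running total of -3 is what ends up in $= for the first and third recipe.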
    :0
    * 10^0
    {
        dummy = "Score for condition xxxx was: $= $NL"

        :0
        {
            dummy = "Next recipe, Score no longer available: $= $NL"
        }
    }

    #
    #   Won't work. $= is getting set back to 0 outside of the delivering recipe.
Here is an interesting anomaly which [richard] discovered. It is presented here only as a curiosity. DO NOT USE IT IN YOUR RECIPES. (This is not "clean programming", but a hack.)

[david] If you want to save the score for later use (even if it is zero or negative):
# procmail no-op
If other recipes that clobber the references for the A flag intervene, this will work:
    :0
    * 10^0
    { }                             # procmail no-op
    :0
    *  1^1 .
    *  1^1 ^.*$
    * -1^0
    { }
    lines = $=

    :0 fhw
    * ! ^Lines:
    | $FORMAIL -a "Lines: $lines"
The reason we used it at all was that size conditions worked only on the entire text regardless of H or B or HB flags at the top of the recipe. Nowadays we can do this and get the accurate figure in one condition:
    # leave `B ??' out to measure the entire message
    :0
    * 1^1 B ?? > 1
    { }
    size = $=
    :0
    * -1^1 B ?? > -1
    { }
    size = $=
gives the same result, and as long as the search area is non-empty, so do these, which are even sillier:
[Karr] This recipe counts bytes in the message. You could use it as a Content-Length replacement, but prefer the next recipe. The first score counts every character, and the second score sums up every line (that is: newlines are added).
Bogofilter adds headers to the message that contain the probability score of the message being spam, in the range 0.0 - 1.0:
If the filter runs at the MTA, the thresholds that determine the word "No" in the header cannot necessarily be configured. Testing the resulting score directly, to catch messages in the range 0.2 - 0.9 as "Unsure", can be done with scoring. If the spamicity value was 0.92, the first score would return 1.90 - 0.92 = 0.98, which is lower than 1, the score OK value.
    :0
    * ^X-Bogosity:.*spamicity=\/0\.[0-9][0-9][0-9]
    {
        # check for maximum
        :0
        * $ -$MATCH^0
        * 1.90^0
        {
            # check for minimum
            :0:
            * $ $MATCH^0
            * 0.8^0
            {
                # Value is between A .. B
            }
        }
    }
[stephen] All you need to do is:

a) Make sure that procmail is started without the -Y flag.
b) Either, in your sendmail.cf, insert:
H?l?Content-Length: 0000000000
Or (slightly less efficient), insert the following recipe in your /etc/procmailrc file and Procmail will take care of any necessary magic.
    :0
    *$ B ?? < $NBR
    :0                              # Note: this counts LINES
    *  -1^1 B ?? .
    *  -1^1 B ?? ^.*$
    *$ $NBR^0
    {
        ...whatever when fewer lines
    }
[richard] Here is a recipe that needs no recursion. MAX_RECIP is set to 9, but you may prefer some other value. This counts each comma. It is allowed in addresses. Some folks sum Resent-xx or nonResent-xx headers. I sum all.
    :0
    *  1^1 ^(resent|apparently-)?(to|b?cc):\/.*
    *  1^1 MATCH ?? ,
    *$ -$MAX_RECIP^0
    {
        :0
        *$ $=^0
        *$ $MAX_RECIP^0
        {
            RESULT = "Count of commas is $="
        }
    }
That's not good. DON'T DO THAT. You just created an expensive shell subprocess where procmail calls formail and feeds the full message to it. We can do the same with minimum effort:
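As a rough sketch of the cheaper approach, the address can be picked up with procmail's own \/ match operator. The character classes below are an illustrative simplification and not a full RFC 822 address parser, and FROM is only a variable name chosen for this example:

```procmail
# Sketch only: take the first token in From: that looks like user@host.
# No external program is started; procmail's regexp engine does the work.
:0
* ^From:.*\/[-a-z0-9_.+]+@[-a-z0-9_.]+
{ FROM = "$MATCH" }
```

The part left of \/ matches as little as possible and the part right of it as much as possible, so MATCH starts at the first address-like token and runs to its end.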
No shell subprocess called. This is much faster and consumes fewer resources, although it may need more typing. Use it and your sysadm is happy with your well behaving procmail recipes that don't load the CPU unnecessarily. The equivalent with formail might be more secure, because it contains a full RFC-compliant parser. The traditional way of deriving the address with formail is:
But you can still make this more efficient. Here is one example where you actually want to use "old" =| style variable assignment, make sure there are no extra spaces:
    #   formail -r writes the reply address into a To: field
    :0 hw
    FROM=|$FORMAIL -rzxTo:
That way only the header gets fed into formail, whereas the previous back tick fed the whole message. Another benefit is, that you can then check the return code of formail with a or A recipe after this one.
11.2.1 For procmail versions prior to 3.14

[FAQ] -r breaks RFC822, so always use -rt if you don't know what this means. Perhaps you should always use it anyway.

[david] There is a formail -r rank bar graph in the source code of 3.11pre4. It might be easier to follow as a top-to-bottom listing (and again, Tom Zeltwanger appears to be using one of the older versions where From_ was mistakenly over promoted). These are the rankings in version 3.11pre4:
    formail -r:             formail -rt:
    ----------------------------------------------
    Resent-Reply-To:        Resent-Reply-To:
    Resent-Sender:          Resent-From:
    Resent-From:            Resent-Sender:
    Return-Receipt-To:      Reply-To:
    Errors-To:              From:
    Reply-To:               Sender:
    Sender:                 Return-Receipt-To:
    From_                   Errors-To:
    Return-Path:            Return-Path:
    Path:                   From_
    From:                   Path:
[Stephane Bortzmeyer <bortzmeyer A T pasteur.fr>] Always use -rt and never -r. Because such precedence (Sender over From) is an important violation of RFC 822. There is one canonical order, described in the RFC and nothing else should be used, like fuzzy ranking or, worse, reordering. This is a serious problem with formail. The proper order is:
And, how would you deal with resent mail?? Ie: Resent-Reply-To, Resent-From, and Resent-Sender?
It treats Resent-X as X (" Whenever the string Resent- begins a field name, the field has the same semantics as a field whose name does not have the prefix. "). So you have to choose an order between them, the RFC does not specify it. [david] I think that the idea is that -r is intended to determine the origination address, not the place to reply; -rt is for determining the place to send replies. For addressing a response, yes, -rt will invert the header in a way more in line with the rules; for figuring out the origination point,
formail -r -zxTo:
And here's an additional problem: formail -rD always uses the -r precedences; you can't make it use the -rt precedences and the -D cache checking function at the same time.

4.4.4. AUTOMATIC USE OF FROM / SENDER / REPLY-TO (RFC 822 excerpt)

For systems which automatically generate address lists for replies to messages, the following recommendations are made:

- The Sender field mailbox should be sent notices of any problems in transport or delivery of the original messages. If there is no Sender field, then the From field mailbox should be used.
- The Sender field mailbox should NEVER be used automatically, in a recipient's reply message.
- If the Reply-To field exists, then the reply should go to the addresses indicated in that field and not to the address(es) indicated in the From field.
- If there is a "From" field, but no Reply-To field, the reply should be sent to the address(es) indicated in the From field.

Sometimes, a recipient may actually wish to communicate with the person that initiated the message transfer. In such cases, it is reasonable to use the Sender address.

This recommendation is intended only for automated use of originator-fields and is not intended to suggest that replies may not also be sent to other recipients of messages. It is up to the respective mail-handling programs to decide what additional facilities will be provided.
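The Reply-To/From recommendation can also be expressed directly in a recipe, without calling formail. The sketch below only illustrates the precedence: REPLYADDR is a hypothetical variable name, and the captured value is the raw field body, which may still carry leading whitespace and a display name.

```procmail
# Prefer Reply-To: for an automatic reply ...
:0
* ^Reply-To:\/.+
{ REPLYADDR = "$MATCH" }

# ... and fall back to From: only when there was no Reply-To:.
# Sender: is deliberately never consulted here.
:0 E
* ^From:\/.+
{ REPLYADDR = "$MATCH" }
```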
In this chain I was sending a message from one of my accounts to another address; the virtual-address delivers the mail to the right local domain. There is only one problem with this picture. When a response is generated from Local-address with formail -r, the generated address pointed back to virtual-address, which of course pointed back to Local-address. A loop was ready; you could not get the reply routed to the original address: account.

What was happening here was that the mail server that handled the virtual-address didn't forward the message, but instead resent the message. In this process a set of new headers was generated:
    Resent-From: <virtual-address>
    X-From-Line: <account>
    Received: from <the virtual-address mailserver>
    Resent-Message-Id: <199710151903.WAA28670@virtual-address>
    Resent-Date: <date>
    Resent-To: <local-address>
    Received: ...<account domain>
    Message-Id: <199710151904.WAA05050@account-domain>
    From: <account-domain>
And now when the formail -r command was used, it picked up Resent-From as the destination where the message should be returned. Surprising, but according to procmail, 100% correct: Resent-From has higher priority than From. The Resent-* headers are considered informative, and should never be used when automatically generating a response. The problem here is the middleman: it should not resend a message, but rather forward it. So I put this into my .procmailrc to handle the broken middleman at our site.
[edward] adds to this that: As you know, formail -r is for composing a response to the address from which an e-mail was sent. Let's say you are on vacation and have set up a procmail recipe to auto respond to all e-mail you receive. Furthermore, let's say Joe sends me an e-mail and I re-send it to you. If you wanted to respond to the sender of the e-mail that you received, would you e-mail me or Joe? You had better e-mail me, because I was the one who sent it to you. Joe may not even know you. Imagine if you did send your response to Joe. It would probably cause him considerable confusion as to why you are sending him e-mail informing him that you are on vacation.

formail -r uses a heuristic algorithm to determine who it should respond to, based on the presence of various headers and their contents. If you look at the formail.c source code, you'll see a graphical representation of this algorithm. Resent-Reply-To has the highest relative importance/reliability of all header fields. Next are Resent-From and Resent-Sender, followed by Reply-To, From, Sender, et al.
    #todo:

    :0
    * ^^\/(.+$)+$
    { header="$MATCH" }

    :0 fhw
    | $FORMAIL -r; ... now generate reply ...
We don't actually filter anything here. It's just a trick to reprint headers and add some text after them: text appears at the beginning of body.
    :0
    * condition
    {
        :0 fhb
        | $FORMAIL -rk -p '>'               \
            -I "From: [email protected]"    \
            -I "$MYXLOOP"

        :0 fhw
        | cat -; echo "added message at the start of body"
    }
    #   Strip header to bare minimum
    #   If this is MIME multipart, then skip recipe

    :0 fhw
    * ! multipart
    | $FORMAIL -k               \
        -X Date:                \
        -X Subject:             \
        -X Message-Id:          \
        -X From                 \
        -X To:                  \
        -X Cc:                  \
        -X Reply-To:            \
        -X Mime-Version:        \
        -X Content-type:

    :0 :
    mail.default.mbox
[david] comments on the final recipe:

You should keep the Reply-To header if there is one. If the sender wanted replies directed to a different address than that in the From header, you are losing that information and, when you respond, writing to the wrong place.

You ought to keep To and Cc so that you can tell when you read your mail who else was sent it. If your mail user agent has a group-reply or reply-all function, keeping To and Cc will allow that feature to continue working. This way you are cheating yourself out of it.

'-X From' is enough to keep both the From_ line and the From header. You don't need to specify -X From: again after it. (To keep From_ without From: you need to say -X "From " or something similar, with a quoted space.) All mail is going to have a line (usually two) beginning 'From'.

Another slightly different approach is to kill the headers that take up most of the space. If you're not interested in tracking down the original sender of a possible UBE message, then you can remove the Received headers. You may want to fill out the condition line to simplify only your work or campus messages, and let other messages retain their full headers.
:0 * |
The problem here is that there will be a newline in the middle, which causes the header to be shortened (procmail determines the new header/body boundary after having processed each filter). Use the following instead.
[david] If $HOME/newHeaders ends in a blank line, you don't need the "; echo". Under some circumstances procmail puts back the blank separating line if it gets lost, but I'm not sure exactly what those are, and you have a SHELLMETAS character in there already (the first semicolon), so a shell is forked anyway. But this is my favorite way (it assumes that formail -r will never generate a continuation line for From:); if you use it, make sure that the newHeaders file does NOT contain a trailing blank line:
    :0 fhw
    * whatever
    #   If it is more than one mail, send to formail for splitting,
    #   then send back to procmail for sorting again.

    :0
    *  B ?? ^From [-_+.@a-z0-9]+ (Sun|Mon|Tue|Wed|Thu|Fri|Sat)
    *  B ?? ^From:
    *  B ?? ^TO
    *$ ! H ?? ^$MYXLOOP
    | $FORMAIL -A "$MYXLOOP" -m4s procmail
    % setenv FILENO 0000
    % formail -kXDate: -XFrom: -XTo: -XSubject: -XIn-Reply-To: -XX-Mailer \
        +1ds                                                               \
        procmail -p DEFAULT=`pwd`/'$FILENO.txt' /dev/null                  \
        < inputfile
11.17 Mailbox: run series of commands for each mail (split mailbox)
...Maybe the heat has melted my brain, but I can't seem to get formail to perform a series of commands on each mail that it has split from a folder. Here's an example of a simple debugging attempt: I've tried parentheses, putting the commands into a shell function, and other flailings too numerous to remember, all to naught.
    % formail -s addr=`formail -XFrom: | formail -r | formail -zx To`;\
      echo "$addr" >>output
It appears that formail doesn't use the shell when executing the command specified when splitting. No SHELLMETAS here. Given that, the secret is to fire up the shell explicitly yourself to do the piping:
Note that you only need two formails in the pipe, not three, as the -r flag works correctly when combined with other flags.
...To me, a large mailbox would consist of about 10,000 messages per month (that's about what I get). That would mean that my mailbox would contain 60,000 messages in 6 months. I sure as heck wouldn't want to skim through it all or even try to load it up in an MUA.
[1998-08-27 Bennett Todd <bet A T mordor.net>] I also deal with monster volumes of mail. I've switched over entirely to Maildir in all my mail handling; the only place I still see mboxes is in the save folders of my netnews reading (using slrn) and whenever I want to process them I either convert them into Maildir (e.g. for archival) or simply split them into multiple messages. Splitting into multiple messages turns out to be preposterously easy; using GNU csplit:
[richard] The csplit invocation shown here will catch occurrences of ^From embedded in the message body if your MUA hasn't escaped them with a >. Some MUAs use content-length headers and don't escape ^From. Procmail supports this. Be cautious if you choose to use this simple split.
That will create an empty xx0000 which I delete, and leave the messages in files named xx0001, xx0002, etc. If you have more than 9999 messages in a folder then go -n6, or -n9, or whatever. Once they're split it's really easy to use shell tools to bundle messages into batches, file them into categories, etc.

If you are archiving all mail traffic forever (which I do) then another dandy tool to add to the mix is glimpse (http://glimpse.cs.arizona.edu/). It takes a while to build the index, but that's a fine job to run out of cron at night. Once the index is built it's a pleasingly quick way to root through big archives of messages.
And the file never exceeds 12288 bytes by very much. Though formail indeed exceeds this size by as much as the length of one message-ID, the file size should never grow significantly beyond that, even if used indefinitely. The file is in binary format, each entry terminated by a single null byte, and an occasional (significant placeholder) double null.

[philip] The format of the cache is initially as follows:
entry\0entry\0entry\0\0
When the file size grows to equal-to or greater-than the size specified on the command line, formail starts over at the beginning, using a double-null to mark where it stopped. However, entries after the double-null, except for the partially overwritten one, are still valid and checked, so that the file is then in the format:
entry\0entry\0entry\0\0partial-entry\0entry\0entry\0\0
New entries will be written after the first double-null, so that it implements a circular cache. Check out lines 319-322 of formail.c
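For context, this cache is what formail -D maintains. The duplicate-suppression recipe from the procmailex(5) manual page shows typical use; the 8192 byte size and the msgid.cache and msgid.lock names are the manual's example values, not requirements:

```procmail
# Drop a message whose Message-ID is already in the cache.
# W: wait for formail and act on its exit code without logging a
#    failure; h: feed only the header.
# The local lockfile keeps two procmail instances from updating the
# cache at the same time.
:0 Wh: msgid.lock
| formail -D 8192 msgid.cache
```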
[david] This is strictly untested; I don't know where in the body the Message-ID's appear, but if they're at the top of the body, this might help:
    :0 hW                   # Message-Id: in the head,
    *$ ^Message-Id:.*$NSPC
    | $FORMAIL -D $cache_size $cache_name

    :0 E bW                 # If not but there's one in the body, try body.
    *$ B ?? ^Message-Id:.*$NSPC
    | $FORMAIL -D $cache_size $cache_name
You might want to copy a Message-Id from the body to the head in any case (if there's none already in the head) just to have it in the right place, so we could do that first and then formail -D will work normally. This form will run formail twice if the Message-Id header is in the body instead of the head, but it will look for Message-Id on any line of the body, not just at the top:
    :0
    | $FORMAIL -r
Hm, we have three processes called here; can we minimize the calls? Yes, this is an idea from [philip] and [david]. Notice that there is only ONE process needed.
${hdr1+"$hdr1"} ${hdr2+"$hdr2"}
And if you want to stack all headers into only one variable, it is a bit of extra work. Below we use short variable names only because of the line space: the calls fit on one line.

    field = all (f)ields stacked to one string.
    nl    = continuation newline terminator of previous field

The recipe says: if field has a previous value, set nl to the newline separator; later concatenate the previous contents of field with the possible newline and the new header field.
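A minimal sketch of that idea follows. It is not the original recipe: the choice of Subject and From as the fields to stack, and the inline definition of NL, are assumptions made for the illustration.

```procmail
NL = "
"
# Unset f so that we start from a clean state
f

:0
* ^Subject:\/.*
{
    # nl becomes a newline only if f already holds a field
    nl = ${f+"$NL"}
    f  = "${f}${nl}Subject:$MATCH"
}

:0
* ^From:\/.*
{
    nl = ${f+"$NL"}
    f  = "${f}${nl}From:$MATCH"
}
```

Each block decides for itself whether a separator is needed, so the blocks can be reordered or removed independently.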
# kill variable
The above recipe was the most general one, each recipe determined by itself if the f existed previously or not. But if you know that f is already set, you can write simpler recipe:
:0 {
    formail -A "X-1: 1" | formail -a "X-1: 2"   -->  X-1: 1
    formail -A "X-1: 1" | formail -I "X-1: 2"   -->  X-1: 2
should have done it, right? Nope; formail -s took it all as one message, even with -m1. When I edited in blank lines, the command worked. My first reaction was that the -e option wasn't working as advertised and that the blank lines were necessary after all. Then I realized the real problem: there was no interruption in the succession of valid header lines in the input for anything that could look like a body. I could have put something other than blank lines between each pair of header fields and then -e would have done its job, but as long as every additional line looked like a valid RFC822 header field, even if its name was the same as one that had appeared earlier, formail -s assumed that it was still the same message's head.
Once the mailing list name has been grabbed, you can easily "map" or convert the name to any suitable folder name before saving it:
    LIST            LIST name       Description of mailing list
    (as grabbed)    you want
    --------------------------------------------------------------
    jde             java.jde        Java Development Env
    java            java.prog       Java programming
    FLAMENCO        flamenco        Flamenco music
    tango-l         tango           Argentine Tango dancing
    tm-en-help      tm-en           Emacs TM mime package mailing list
    w3-beta         w3              Emacs WWW mailing list
You then convert the grabbed LIST to the new folder name with a conversion table:
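One possible shape for such a table, offered as a sketch and not as the original code: the TABLE variable and its ",grabbed=folder" layout are hypothetical, and the lookup relies on the $\VAR regexp-quoting form being available in your procmail version.

```procmail
# Hypothetical table layout: ",grabbed=folder" entries in one string.
TABLE = ",jde=java.jde,java=java.prog,flamenco=flamenco,tango-l=tango,tm-en-help=tm-en,w3-beta=w3,"

# Look up the grabbed name; $\LIST inserts it with regexp characters
# quoted. On success MATCH holds the text between "=" and the next comma.
:0
*$ TABLE ?? ,$\LIST=\/[^,]+
{ LIST = "$MATCH" }
```

Because procmail matches without regard to case, the grabbed FLAMENCO finds the flamenco entry. A name that is not in the table leaves LIST unchanged.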
And to detect all mailing lists, you only need one recipe, like below:
When you receive message from any of these mailing lists to your login account, the list.procmail is already in variable $1 and the recipe to sink all mailing lists to their individual folders is very simple:
Note: $1 contains a value only _IF_ procmail is invoked with option -m or -a (with an argument). Be sure procmail is invoked with that option, either from the LDA or from ~/.forward. $1 is a pseudo variable and it can't be used in a condition line, so we copy the value to ARG.
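A minimal sketch of that copy-then-test pattern is shown below. The "list." prefix test and the use of the argument as the folder name are assumptions for the illustration, not the original recipe:

```procmail
# $1 holds the argument given with -a or -m (for example "list.procmail").
ARG = $1

# Hypothetical filing rule: accept only plain "list.<name>" values so the
# argument can be used as a folder name without surprises.
:0 :
* ARG ?? ^^list\.[-a-z0-9_]+^^
$ARG
```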
Well, this is the definition of the procmail mailer, not the local mailer. Furthermore, there's more to plus-addressing support than the definition of the local mailer. Ruleset 0 or 5 needs to be set up to move everything after the + into the 'host' variable ($h). Unless you have a strong understanding of sendmail rule sets and rewriting rules, you should not attempt to add plus-addressing to your sendmail.cf, but instead just install the latest version of sendmail and use the m4 sendmail.cf generation tools with a .mc file that contains:
FEATURE(local_procmail, `/usr/local/bin/procmail')
...Ok, I corrected it. Well, here's what that looks like. I did look into the part about Ruleset 5 while trying it on originally. But all I could do was make sure that the plus-addressing section was there.

    Mlocal, P=/usr/bin/procmail, \
            F=lsDFMAw5:/|@qSPfhn9, S=10/30, R/40, T=DNS/RFC822/X-Unix,
            A=procmail -Y -a $h -d $u
    Mprog,  P=/bin/sh, F=lsDFMoqeu9, S=10/30, R/40, D=$z:/,
            T=X-Unix,
            A=sh -c $u
Recall from [rfc1036] that the preferred Usenet mail address formats are the following:
    From: [email protected]
    From: [email protected] (First Surname)
    From: First Surname <[email protected]>
I invented this idea after reading Eli's excellent FAQ about mail addressing. Please read it (especially section 19.) before you continue in order to understand what I'm going to present. I have an account which does not support plus addressing and I was kinda jealous of everyone that could use this neat sendmail addressing scheme. Plus addressing makes it so much easier to deal with mailing list messages.

But as it turns out, we can simulate plus addressing to some extent with a pure RFC compliant address. We exploit the RFC comment syntax, where a comment is any text inside parentheses. According to Eli's paper, comments should be preserved during transit. They may not appear in the exact place where originally put, but that shouldn't be a problem. So, we send out a message with the following From or Reply-To line:
Now, when someone replies to you, the MUA usually copies that address as is and you can read in the receiving end the PLUS information and drop the mail to the appropriate folder: mail.procmail.

[About subscribing to mailing lists with RFC comment-plus address] It's very unfortunate that when you subscribe to lists, the comment is not preserved when you're added to the list database. Only the address part is preserved. I even put the comment inside angles to fool the program into picking up everything between the angles.
first.surname(+list.procmail)@example.com
But I had no luck. They have too good RFC parsers, which throw away and clean comments like this. Eg. procmail based mailing lists, the famous Smartlist, use formail to derive the return address and formail does not preserve comments. The above gets truncated to
Also many mailing lists send out messages as Bcc, so your address is not even available in headers anywhere, neither is this nice RFC comment. Ah well, but this RFC comment trick works very well in private communication, virtually all MUAs copy whole contents of a From or Reply-To header to To header, preserving comments and you get the benefit of plus addressing. Here is procmail code to demonstrate reading the PLUS information from RFC comment-plus field:
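As a self-contained sketch of the idea (it does not use the module, and PLUS is a hypothetical variable name), the text after "(+" can be captured straight from the To or Cc header. The character class is kept narrow on purpose, because the captured value ends up as a folder name:

```procmail
# Sketch: pull "list.procmail" out of an address written as
#   first.surname(+list.procmail)@example.com
:0
* ^(To|Cc):.*\(\+\/[-a-z0-9_]+(\.[-a-z0-9_]+)*
{
    PLUS = "$MATCH"

    # File the message under the name found in the comment
    :0 :
    $PLUS
}
```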
= $MATCH = $RC_EMAIL
    #   If COMMENT_PLUS was defined, module found "+" address
    #   which contained, say, "mail.procmail".
    #   Save it to folder.
Pretty simple. And you can put anything inside an RFC comment and do whatever you want with these plus addresses.

NOTE: there are no guarantees that the RFC comment is preserved every time. Well, the standard RFC822 says it must be passed untouched, but I'd say it is kept in about 90% of the cases where mail is delivered from one server to another.

Example: if you discuss in Usenet groups, you could use address
Now, I collect specific high-volume mailing lists (like Debian) into their own spool files like above, and let other recipes catch all other mailing lists (like procmail and fvwm) into a single spool file with later rules:
# Majordomo lists
# SmartList lists
So Debian mailing list mail goes to Debian, procmail and fvwm mail goes to mail lists, and mail addressed to me yet CC'ed to a list goes to my main spool file.
The following code will save the message to folders list.foo, list.bar, list.procmail when the name is in the TO address.
There's one tricky thing to note: if someone sends a message to both me and the list (say, responding to a message I sent to the list), then the copy that got to me through the list will end up in my procmail folder, while the copy that went directly won't. I like this behavior, but some people, possibly yourself, may prefer it if both messages end up re-filed. If so, your best bet is to combine the above with matching against the To: and Cc: headers via the ^TO_ token:
(If you have a version of procmail before 3.11pre4, then you'll need to use "^TOprocmail" instead of "^TO_procmail".)

If you're subscribed to many mailing lists, here is one general recipe. Notice: you don't want to include \< in the recipe like ^TO_\<\/$LISTS, because the ^TO_ token contains something similar to \< but better, so that the \< can only cause problems. A trailing \> is not a bad idea, though; because it's not a zero-width assertion but rather an actual character class, you have to strip it from the match.
    LISTS = "(foo-list|bar-list)"

    #   1) to get the match
    #   2) rematch sans the trailing \>
    #   3) Note: preserves capitalization of the string
    VAR  = "MOO"
    what = "(moo|bar|baz)"

    #   Search what from VAR
    :0
    *$ VAR ?? ()\/$what
    {
        #   Now; what was it that really matched, there were several
        #   choices: moo,bar,bar
        #   Beware: $MATCH must not contain regexp characters

        :0
        *$ what ?? ()\/$MATCH
        { }                         # no-op
    }
Most likely the sender is using Exchange (or Windows Messaging or Outlook97) and sent the messages in Rich Text Format. It puts the RTF message in an attachment called WINMAIL.DAT (application/ms-tnef). But this attachment is useless unless the recipient is also using Exchange. The sender can turn off the RTF option for messages to you. For more information, see: "XCLN: Sending Messages In Rich-Text Format" at http://support.microsoft.com/kb/136204
Some more examples can be found in section: 'Explaining ^^ and ^'
    TXT_NO_HTML = $HOME/noHTML.txt

    :0
    *  ! H  ?? ...
    *$ ! H  ?? ...
    *    HB ?? ...
    *    HB ?? ...
    {
        LOG = "$NL --TRASH: multi-part HTML $NL"

        :0
        | ( $FORMAIL -rk                            \
                -A "X-Mailer: Procmail Autoreply"   \
                -A "$XLOOP" ;                       \
            cat $TXT_NO_HTML                        \
          ) | $SENDMAIL
    }
    :0
    * B ?? ()<HTML>
    * B ?? ()</HTML>
    {
        :0
        * conversion ?? lynx
        {
            #   In new lynx version you can read from stdin. If
            #   /dev/stdin doesn't exist try /dev/fd/0
            #
            #   lynx -dump -force_HTML -nolist -restrictions=all \
            #       /dev/stdin
            #
            #   Without a global lock on this, you have a chance
            #   that two procmail instances will try to write to
            #   msg.dump

            file     = "$HOME/tmp/msg.dump"
            LOCKFILE = $file$LOCKEXT

            :0 fbw
            | cat > $file && lynx -dump $file

            LOCKFILE
        }

        :0 E fbw
        | perl -0777 -pe 's/<[^>]*>//g'
    }
HEADERS

--mime-boundary
plain text

--mime-boundary
Some idiotic HTML (or other type) copy of the text

--mime-boundary
Good news. There's a procmail module that addresses this problem. The module can kill any mime attachment and the predefined sets include typical cases:

o   Microsoft Explorer has a bad habit of including a 7k application/ms-tnef attachment at the end of the message.
o   Lotus Notes sends a similar extra attachment.
o   Microsoft Express sends a copy of the message in HTML format in the attachment.
o   Netscape's Mozilla sends a copy of the message in HTML. See example. It also sends annoying vcards.

The module is called pm-jamime-kill.rc and is included in Jari's pm-code.zip. (Note: Procmail module list)
GetFile: ~user/.login
We make the script safe here by forcing http://$MATCH and not simply using "$MATCH"
:0
*$ ^Subject:$s+GetPage:()\/.*
*$ ! ^$MYXLOOP
| ($FORMAIL -r                                      \
     -I "Precedence: junk"                          \
     -I "Subject: Requested page: $MATCH"           \
     -I "$MYXLOOP" ;                                \
   lynx -dump "http://$MATCH"                       \
  )                                                 \
  | $SENDMAIL
[era] If all you need is to create a suitable MIME package, there are various MIME command-line utilities such as metasend (which is for interactive use, and so doesn't work very well with Procmail) and mpack you can try. If your needs are simple, you could even read up a bit on the MIME spec and generate the necessary headers and separators yourself (echo Content-Type: multipart/mixed etc etc etc). Conversely, if your needs are complex, get the Perl MIME package from CPAN and cook up your own tool. The MIME FAQ (especially part 6) is a good place to look for info. http://www.faqs.org/faqs/by-newsgroup/comp/comp.mail.mime.html
:0
* condition
dir-folder/.
[manual] When delivering to directories (or to MH folders) you don't need to use lockfiles to prevent several concurrently running procmail programs from messing up.
On a save to a directory, how does procmail determine what to put after $MSGPREFIX to complete the name of the file?
[philip] It's the inode number of the file encoded in base-64 with the set of characters A-Za-z0-9-_, in reverse order. So, for example, the inode numbered 59699 would be encoded as follows:
59699 = 51 + 64 * ( 36 + 64 * 14 )

A=0, B=1, ..., N=13, O=14, ..., a=26, ..., k=36, ..., z=51, 0=52, ...

--> zkO
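The arithmetic above can be replayed with a small shell function. This is only a sketch of the encoding as [philip] describes it (64-character alphabet A-Za-z0-9-_, least significant digit first); `encode_inode` is a hypothetical helper name, and the real procmail source may differ in details not covered by the description.

```shell
#!/bin/sh
# Hypothetical helper: encode a number in base 64 with the alphabet
# A-Za-z0-9-_, emitting the least significant digit first, as in the
# worked example for inode 59699.
encode_inode() {
    n=$1
    alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'
    out=''
    while :; do
        idx=$(( n % 64 ))
        # cut counts characters from 1, the alphabet from 0
        ch=$(printf '%s' "$alphabet" | cut -c "$(( idx + 1 ))")
        out="$out$ch"
        n=$(( n / 64 ))
        [ "$n" -gt 0 ] || break
    done
    printf '%s\n' "$out"
}
```

Calling `encode_inode 59699` walks through the digits 51, 36 and 14 in that order, which are the letters z, k and O.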
:0 fbw
| echo "This is a line of text _before_ the body"; \
  cat

:0 fbw
| cat - ; \
  echo "This is a line of text _after_ the body"

:0 fbw                  # prepend text before the body
| cat msg.txt -

:0 fbw                  # append text after the body
| cat - msg.txt

:0 fbwi                 # replace the body
| cat msg.txt
# SysV's cat has a different meaning for -s and cannot do this

:0 fbw
* B ?? $$$
| cat -s
Note that cat -s has slightly different results from the others: if there are any empty lines at the top of the body, cat -s will keep one. The echo and sed suggestion will remove all empty lines from the top and, like cat -s, keep one at the bottom.
{ XM = $MATCH }

:0
* ()\/^Message-Id: +\/.*
{ MID = $MATCH }

:0
* ()\/^Date: +\/.*
{ DATE = $MATCH }

:0
* ()\/^To: +\/.*
{ TT = $MATCH }

:0
* ()\/^CC: +\/.*
{ CC = $MATCH }

:0
* ()\/^Subject: +\/.*
{ SUBJ = $MATCH }

:0 fh w
| $FORMAIL                              \
    ${XM:+-I"X-Mailer: $XM"}            \
    ${TT:+-I"To: $TT"}                  \
    ${FROM:+-I"From: $FROM"}            \
    ${RT:+-I"Reply-to: $RT"}            \
    ${CC:+-I"Cc: $CC"}                  \
    ${MID:+-I"Message-Id: $MID"}        \
    ${DATE:+-I"Date: $DATE"}            \
    ${SUBJ:+-I"Subject: $SUBJ"}
:0
* ! B ?? ...
| (echo "From: [email protected]" ;              \
   $FORMAIL -r -A"Precedence: junk"              \
      -A"X-Loop: [email protected]" ;            \
   echo "Your blank message was received.\n"     \
        "Did you mean to say something?\n"       \
        "\n"                                     \
        "-- \n"                                  \
        "My Signature\n"                         \
:0
* ^Subject: ping$
{
    :0 fh
    | $FORMAIL -r

    #   Remember, Don't send back anything that would be vital to
    #   attacker. It doesn't matter if the `uptime` or other
    #   scripts fail, the reply is sent anyway.

    :0 c
    | ( cat -;                                      \
        echo `uptime`;                              \
        echo "$HOST User count: " `who | wc -l`;    \
      ) | $SENDMAIL

    #   Record this ping request (or sink to $DEFAULT)

    :0 :
    $PING_SPOOL
}
Some people like to raise a flag in .procmailrc instead of creating a file. If you like the variable approach better, here is the equivalent implementation of the above
[philip] and [era] Since vacation only sends replies and never sends the original messages, here is one way to do two things with your .forward file. Substitute "abc" with your login name.
OFFSITE = "[email protected]"

#   Forward urgent mail to me at my off site address; afterward,
#   continue processing it as normal
#   The procmail pattern match may be case-insensitive, in which
#   case this rule could be simplified...

#   Use "vacation" to tell other people I'm not here
#   To enable, un-comment the next two lines;
#   to disable, comment them out
#   The -a Identifies another name that can legitimately
#   appear in the To: line of the mail header instead of
#   your login name
Subject: I'm out of town for a while
From: eric (via the vacation program)

I'm out of town until <return-date>. Your mail regarding "$SUBJECT" will be read when I return, or possibly at some unknown time before then if I get a chance to check for mail.

If your message must be seen by me before I return, you can send it with the word "URGENT" in the subject header. Such mail will be automatically forwarded to me so that I see it sooner.

--Eric
#   look for the file to tell us whether or not to forward mail
#   if the file exists, forward the mail

ELSWHERE = "[email protected]"
FILE     = "$HOME/.forwardmail"

:0 c
*$ ? $IS_EXIST $FILE
! $ELSWHERE

#   if a message arrives from the other account with the
#   Subject 'forward-off' then remove the file,
#   effectively turning off forwarding

:0 hwic
*$ ^From:.*$ELSWHERE
*  ^Subject: forward-off
| $NICE mv -f $FILE $FILE.off

#   if a message arrives from the other account with the
#   Subject 'forward-on' then restore the file,
#   effectively turning forwarding on
#   By Jim Hribnak <hribnak A T nucleus.com>
#   [email protected] goes to [email protected]
#   [email protected] goes to [email protected]

:0
* ^TO_()[email protected]\>
{ FORWARDTO = "$FORWARDTO [email protected]" }

:0
* ^TO_()[email protected]\>
{ FORWARDTO = "$FORWARDTO [email protected]" }

:0 fhw
*  FORWARDTO ?? @
*$ ! ^$MYXLOOP
| $FORMAIL -A "$MYXLOOP"

    :0 a
    ! $FORWARDTO
[david] sed could do both at once, but the problem is that sed never knows when it is N lines from the end if N>0; it knows the last line when it reads it, but when it is looking at the next-to-last line it doesn't know that there is only one more line to come. It does, however, know how many lines of input it has already read. So I have three suggestions. First, if you know that the header is X lines long [let's say 5 for this example] and that the first line of the footer contains some string or pattern that will not occur in the significant part of the post:
If you recognize the end by the last line that you want to keep instead of the first line that you want to delete, omit the n option and the p instruction:
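The first two suggestions can be sketched as plain sed filters. This is a reconstruction from the description only, not [david]'s original code: the 5-line header comes from his example, while the function names and the patterns FOOTER-START and LAST-KEPT are made-up placeholders.

```shell
#!/bin/sh
# Sketch 1: delete the first 5 lines, then print until the first footer
# line is seen. With -n nothing is printed unless "p" runs, so "q"
# quits without printing the footer line itself.
trim_by_first_footer_line() {
    sed -n -e '1,5d' -e '/FOOTER-START/q' -e 'p'
}

# Sketch 2: the pattern marks the last line to KEEP. Without -n sed
# prints every line by default, and "q" prints the current line before
# quitting, so no "p" instruction is needed.
trim_by_last_kept_line() {
    sed -e '1,5d' -e '/LAST-KEPT/q'
}
```

Inside a recipe, either sed command would presumably sit after the pipe of a body-filtering recipe (flags fbw).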
Finally, if the only reliable way to spot the footer is by reaching so many lines from the end (because any search pattern might occur in the real text as well), we can score as you've been doing to get the number of the last significant line. Let's say the footer is three lines long; because ^.*$ always counts one line too many (long story), we subtract four instead of three:
newsgroup. A kill file usually contains one single entry per line to match the message content and this can be easily done with procmail. Remember however that for every message procmail forks a process, so before you apply the kill file rules to the messages, be sure your recipes are in this order: the kill file rules are applied only to unknown messages
SINK MAILING-LISTS
SINK ANNOUNCEMENTS
SINK WORK MESSAGES
OTHER DELIVERIES
apply kill file rules and UBE recipes to the rest
Recipe will drop the message (i.e. consider it 'delivered') if one of its headers matches a pattern in kill file.
The reason why there is an explicit lock file is that you must be able to update the kill file while your procmail is running. An example edit script is presented below.
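#!/bin/sh
# program: killfile.sh
#
file=$HOME/.killfile
lock=$file.lock

# Edit a copy, so procmail never sees a half-written kill file
cp "$file" "$file.tmp"
emacs -q "$file.tmp"    # or use whatever you prefer: vi, pico

# Install the edited copy while holding the lock
lockfile "$lock"
mv "$file.tmp" "$file"
rm -f "$lock"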
LOCKFILE = $MID_CACHE_LOCK

#   IF   the message has a message-id header
#   AND  formail -D is successful (exit status=0)
#   THEN

:0
* ^Message-Id:
* ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE
{
    LOG = "dupecheck: discarded message, $SUBJECT $NL"

    #   Store duplicates, notice no lock!

    :0
    duplicate.mbox
}

LOCKFILE                # Release lock by killing variable
And here is a slightly simpler recipe, a modified version from the [manual]. Procmail notices formail's success and considers the message delivered, but because of the c flag it does not stop processing the rcfile, which lets the message fall into the safety copy inbox.
There was a pretty heavy thread around September 1997 about duplicate detection, where some promising stuff was posted. One item you should definitely have in your collection is Eli's hashd [in Procmail mailing list 1997-09]
#   1. mail to my domain
#   2. NOT addressed to me directly
#   3. NOT coming from mailing lists I'm subscribed to.
[Gordon Matzigkeit <gord A T m-tech.ab.ca>] I have just discovered an effective rule for separating SPAM from the rest of my e-mail. Just substitute your username for gord in the line below
This only works because I handle all mailing list addresses above that point in my .procmailrc (i.e. all traffic that arrives from mailing lists that I am subscribed to goes into other folders). Most SPAMmers seem to do it nowadays by sending mail via mailing lists, rather than creating huge To lists of users.

Many times a sysadm installs a list of known addresses that send spam and then checks the incoming mail against the "black list". Keep in mind that some fgrep implementations have a problem with the -w word switch. Note that the above recipe scans the FULL HEADER, so use it with some caution, i.e., be careful what you add to your list of spam domains.
#   by [philip]; egrep would do here too, if it is posix
#   compliant, it may have -f switch that makes it behave
#   like fgrep.
#
#   Note: option -F would make [ef]grep to search fixed
#   string instead of regexps.
BLOCK_FILE = $HOME/Mail/DeniedNames.lst
UBE_MBOX   = $HOME/Mail/junk-ube.mbox

#   To filter out the Subject lines, so that mails sent with the
#   subject "Have you received a message from blah-blah@spam"
#   don't get filtered. [era] suggested we use formail
#
#   Edsel Adap <edsel.adap A T Canada.Sun.COM> agrees there is
#   a likely bug in Solaris 2.5.1 "/usr/bin/fgrep -i" and
#   suggested the use of /usr/xpg4/bin/fgrep instead.
#
#   <edsel.adap A T canada.sun.com>
#   Sun Microsystems Developer Support
#
#   Files in /usr/xpg4 are available via the SUNWxcu4 package,
#   which is part of the user, developer, all, or Xall Solaris
#   clusters. Solaris 2.4 doesn't have /usr/xpg4/bin/fgrep :-(,
#   you must use `tr A-Z a-z' before piping the message to
#   fgrep.
[Adam Shostack <adam A T bwh.harvard.edu>] The following do help, although they're often too broad. (I use a .safe rule to cover those cases) The < 1000 is a useful heuristic. It's rare that unsubscribe messages are long.
[Rodger Anderson <rodger A T hpbs2245.boi.hp.com>] I've been working on a recipe to filter out those pesky s*bscribe and uns*bscribe messages from mailing lists, and I'm posting what I have so far. As an aside, it also filters out very short messages, which I've found are usually some sort of message meant for the list owner/request address. I give heavy weight to Subjects starting with (un)?s*bscribe, and also pretty heavy weight to Subjects containing either of those words. I then give heavy weight to the body of messages starting with those words, and a lighter weight to lines starting with them. Then multiple occurrences get some weight too, up to a point. Then I count the words in the message against all that.
:0
*    1^0
*   30^0   H ?? ^Subject: +(un)?subscribe\>
*   20^0   H ?? ^Subject:.*\<(un)?subscribe\>
*$  20^0   B ?? ^^$SPCNL*(un)?subscribe\>
*$  10^0   B ?? ^$SPC*(un)?subscribe\>
*    8^.4  B ?? \\<(un)?subscribe\>
*  -.4^1   B ?? \\<$a+\>
junk.misc.mbox
[Adam Shostack <adam A T bwh.harvard.edu>] How about looking for sub & unsub, as well as a perennial misspelling 'unsuscribe me'? I also find filtering on add, leave and help to be useful. This may well be the only word on the line. I think it has to do with broken list management packages.
| :0
| * 1^0
| * 30^0 H ?? ^Subject: +(un)?subscribe\>

* 20^0 H ?? ^Subject: +(un)?sub?(scribe)?\>

(The B is often missing, as is the word fragment 'scribe')

| * 20^0 H ?? ^Subject:.*\<(un)?subscribe\>

* 20^0 H ?? ^Subject: +(add|leave|help)$
# fewer points if more words
* 15^0 H ?? ^Subject: +(add|leave|help)
[david 1998-10-20] You want to match on messages where the first non-blank thing in the body is "unsubscribe" at the end of a line, where there are five lines or fewer in the body?
^.*$ always counts one line too many, so a five-line body will be counted as six; that's why we need a prejudice of 7. But if the first non-blank text in the body is "unsubscribe" alone on a line, is a line count really necessary? True posts that include the word will have it in the middle of a sentence, such as the preceding one. What you'll find by specifying a line limit is that unsubscribe requests with long signatures or attachments at the bottom of a previous message will get through.
YYMMDD_FILE = $HOME/.yymmdd
YYMMDD      = $YY-$MM-$DD

#   Contains single line of procmail code
#   YYMMDD_PREV = ..

INCLUDERC = $YYMMDD_FILE

#   If different date, then enter this block
#   The echo updates stamp in file.

:0
*$ ! YYMMDD ?? ^^$YYMMDD_PREV^^
*  ? echo "YYMMDD_PREV = $YYMMDD" > $YYMMDD_FILE
{
    ...do the cron jobs..
}
[david] How do your From_ lines look? If they're the traditional kind that sendmail and smail add, they include the local time on your system at receipt. So include a check that the hour is between 07 and 22 inclusive, like this:
I included the minutes and the colon that separates the minutes from the seconds so that the expression for testing the 07-22 range can match only on the hour.
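The hour test can be tried outside procmail with grep. The function below is only an illustration of the idea, not [david]'s original condition: it uses grep -E syntax rather than procmail's regexp dialect, assumes a traditional From_ line of the form "From sender Www Mmm dd hh:mm:ss yyyy", and `from_line_is_daytime` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the From_ line passed as
# $1 carries a receipt hour between 07 and 22 inclusive. Requiring
# ":mm:ss " right after the hour keeps the hour alternation from
# matching the minutes or seconds fields by accident.
from_line_is_daytime() {
    printf '%s\n' "$1" |
        grep -E -q '^From .* (0[7-9]|1[0-9]|2[0-2]):[0-5][0-9]:[0-5][0-9] '
}
```

For example, `from_line_is_daytime "From joe@example.com Tue Oct 20 23:07:15 1998"` fails even though the minutes field reads 07.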
And if your cron doesn't know the HOME variable (that'd be an exception)
0 23 * * *
0 7 * * *
the script will run_my_program only if both the subject matches and the file test succeeds. The file test will succeed only between 11pm and 7am.

In all honesty, if your system gives usable From_ lines, I like the following suggestion better. I use it all the time to turn blocks of procmail code on and off at given times or dates, and it works like a charm. It uses many fewer processes and is less likely to get the status wrong if for any reason one of the cron jobs fails to run or doesn't do its job. This pages only during the daytime:
:0
* ^Content-Type: *text/plain
{
    :0 fbw
    * ^Content-Transfer-Encoding: *quoted-printable
    | mimencode -u -q

        :0 A fhw
        | $FORMAIL -I "Content-Transfer-Encoding: 8bit"

    :0 fbw
    * ^Content-Transfer-Encoding: *base64
    | mimencode -u -b

        :0 A fhw
        | $FORMAIL -I "Content-Transfer-Encoding: 8bit"
}
#   1995-10-18 Tim Pickett <tbp A T cs.monash.edu.au>
#   Decode MIME quoted-printable Content-Transfer-Encoding
#
#   Conditions
#
#   Mail has a MIME-Version header with a number in it.
#   Header saying "Content-Transfer-Encoding: quoted-printable"
#   exists

:0
*$ ^MIME-Version:$s*$d*(\.$d*)
*$ ^Content-Transfer-Encoding:$s*quoted-printable
{
    :0 fhw              # Remove header
    | $FORMAIL -I"Content-Transfer-Encoding:"

    :0 fbw              # Decode the body.
    | mmencode -u -q
}
one
two
but save mail in error-folder if the line has only the first string like: one (string two is missing)
[philip] I presume these lines would be located in the body of the message, and that by "space between one and two" you mean "whitespace between one and two". If those assumptions are wrong then you'll need to tweak the following recipes:
#   The 'B' tells procmail to look in the body instead of the header.
#   The second colon tells procmail to lock the mailbox with a local
#   lock file -- if mailbox is a directory then you don't need it.
#   The brackets in the condition contain a space and a tab.
Now, the above will match even if "one" or "two" is part of another word (at the end in the case of "one" and at the beginning in the case of "two"). If you don't want that then you'll need to change the recipes to read:
:0 :
*$ $to_()[email protected]
address-matched.mbx
From: [email protected]
Subject: Fault: NNNN in program block YYY        << changed

Fault: NNNN in program block YYY
:0 fhw
*  ^Subject: NOK case report
*$ B ?? ^$s*\/Fault: [0-9a-f]+ in program block.*
| $FORMAIL -I "Subject: $MATCH"
# By [alan]
# Examine headers, create a subject tag if we recognize a list

TAG = ""

:0
*$ ${HEADERS}[email protected]
{ TAG = "info" }

:0 E
*$ ${HEADERS}[email protected]
{ TAG = "check" }

# ...and so on...

# now, if TAG is set, insert it into the subject

MATCH                   # kill this
Or you could use the command line arguments, add following line to your .forward. (alias file syntax)
The stdout of myprogram will be captured and stored in the variable RESULT. Also consider what should happen if there are spaces or tabs in the value of $FOUND. Perhaps it would be better off enclosed in quotes.
You do
But sometimes you don't have control over the files; then you can do this to make sure there is a blank line. Notice that only two processes are used compared to the first choice.
$: the last line

!: everywhere except the range (in this case, everywhere except the last line)

b: branch to a label. No label: branch to the end (and, since -n is not in effect, print the pattern space)

Now remember that everywhere except the last line, we've skipped ahead, so the rest of the code will be executed only for the last line of the input.

/./: on lines that contain a character (but we get here only for the last line, so on the last line if it contains a character)

G: append a newline and the contents of the hold space to the pattern space (the hold space is empty, so basically, if the last line was already empty, do nothing, but if the last line was not empty, append a newline and thus add a blank line after it).

r file: After finishing with this run through the sed instructions, read the named file and copy it to the output.

This side of sed comes out only after sed has had a few drinks...
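Putting the explained instructions back together gives roughly the following sed call. It is a reconstruction from the explanation above, so treat it as a sketch; `append_with_blank_line` and its file argument are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: copy stdin to stdout and append the file named
# in $1, making sure a blank line separates the two.
#   $!b    every line but the last: branch to the end (prints the line)
#   /./G   last line, if non-empty: append a newline plus the (empty)
#          hold space, which yields one blank line after it
#   r FILE last line: queue FILE to be copied to the output afterwards
append_with_blank_line() {
    sed -e '$!b' -e '/./G' -e "r $1"
}
```

If the input already ends in an empty line, the G step is skipped and the file follows directly, so only one blank line separates the two parts in either case.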
[era] I would definitely tolerate 75% quotes. And in the end, you will of course always have to face the kinds of people who would rather change their quoting style to evade such constraints than quote less. An idealized quote parser should perhaps realize that a non-blank prefix that recurs on a lot of lines is probably a customized quote string.

This will preserve the correspondent's original subject (with a Re: added if it didn't already have one) and thus the template text should indicate the nature of the problem.

I'm not sure what would be appropriate to generate behavior more like I suggest below, any takers? Perhaps no score at all for empty lines, neutralize .signatures (hope sender obeys "-- " convention) and add 10^0.5 for each quoted line and dish out -15^0.3 for non-quoted? (I haven't really explored this; it could be completely up the creek.) [Also, perhaps long runs of quoted material should be penalized harder than "quoted snippet, reply text, quoted snippet, reply text" alternations?]
COPY_ADDRESS = "[email protected]"

:0
* ^Sender: <mailing list tag>
{
    #   - quoted lines
    #   - non-blank, non-quoted lines
    #   - completely blank lines

    :0
    *$  10^1 B ?? ^$s*>
    *$ -15^1 B ?? ^$s*[^>$WSPC]
    *$ -15^1 B ?? ^$s*$
    {
        #   You don't need to repeat the original condition here
        #   You also don't really need to extract SENDER
        #   Generate a reply with appropriate headers and the
        #   body quoted

        :0 fhw
        | $FORMAIL -rk -A "Bcc: $COPY_ADDRESS"

        #   Now "replace" the body with template text + body (In
        #   other words, add the template before the quoted body)

        :0 fbw
        | cat $HOME/template.txt -

        #   Now send it off to recipients mentioned in generated
        #   header

        :0
        ! -t
    }

    #   Wasn't excessively quoted; save it

    :0 :
    $SOME_MBOX
}
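To get a feel for the weights before trusting them with real mail, the same arithmetic can be approximated with awk on a saved message body. This is a rough stand-in, not procmail's scoring engine: it simply adds 10 per quoted line and subtracts 15 per blank or non-quoted line, and `quote_score` is a hypothetical helper name. Procmail's own line counting can differ slightly at the edges of the body.

```shell
#!/bin/sh
# Hypothetical helper: read a message body on stdin and print a score
# using the same weights as the recipe: +10 per quoted line, -15 per
# blank line, -15 per non-blank, non-quoted line.
quote_score() {
    awk '
        /^[ \t]*>/ { score += 10; next }    # quoted line
        /^[ \t]*$/ { score -= 15; next }    # completely blank line
                   { score -= 15 }          # non-blank, non-quoted line
        END        { print score + 0 }
    '
}
```

A positive result corresponds to the case where the recipe would treat the message as excessively quoted; with these weights a message needs more than 60% quoted lines to get there.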
[era] This stuff about forwarding to pagers is a recurring topic on this list. I've tried to find a good summary of all the issues but there always seems to be some tiny twist to what people would like to have implemented. As a general comment for future generations, the Procmail part is usually trivial
and the problem reduces to writing a good program (shell script or otherwise) for formatting the text precisely the way you want it, and spitting it out in suitable chunks. Here's something to split up the body of the message into smaller chunks and do a shell script on each chunk. The -s option to fold says to only wrap lines on whitespace if possible
#   Create a duplicate of the message to forward to the pager.
#   This will be reformatted and have most headers stripped off.

:0 c
{
    #   Construct header with only From: and Subject: retained

    HEADER = `$FORMAIL -XFrom: -XSubject:`

    #   Reformat body as 200-character lines and send each as a
    #   separate message with the preconstructed minimal header

    :0 bw
    | tr '\012' ' ' | fold -s -w 200 | while read line; do echo -e "$HEADER\n\n$line" | \
      $SENDMAIL [email protected] ; done
}
If your version of echo doesn't understand \n to mean newline (and/or the -e option to enable this escape processing), you need to tweak this. (You might need to anyway this is mostly untested. In my limited testing, I found the messages would arrive in more or less random order. Inserting pauses in the script should help to some extent, but could lead to other problems and is not an ideal solution anyhow.) I don't know if the header trimming is required; some pager gateways appear to count the headers as part of the message, while others don't. Again, for future generations, details like this are relevant to include when you ask about how to do this.
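One way to sidestep the echo portability problem is to build each chunk with printf, which handles \n in its format string regardless of which echo is installed. The two functions below are a sketch of the splitting and formatting steps only (nothing is mailed); `chunk_body` and `format_page` are hypothetical names and the default width of 200 is taken from the recipe above.

```shell
#!/bin/sh
# Hypothetical helper: join the body on stdin into one long line and
# cut it into chunks of at most $1 characters (default 200), breaking
# at whitespace where possible.
chunk_body() {
    tr '\012' ' ' | fold -s -w "${1:-200}"
}

# Hypothetical helper: print one minimal message made of the header
# text in $1, an empty separator line, and the chunk in $2.
format_page() {
    printf '%s\n\n%s\n' "$1" "$2"
}
```

In the recipe, the loop body would then become something like `format_page "$HEADER" "$line" | $SENDMAIL ...` in place of the echo -e call.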
Strange. The command works from the shell if I su to user mail. Anyway, I got it to work by fully specifying the audio server (which is my workstation, where I receive mail)
AU   = /usr/X11R6/bin/auplay
TUNE = /usr/lib/exmh/drip.au
ORIG_TO
ORIG_CC

#   The -c option to formail takes care of headers continued onto
#   indented lines; the pipe to tr takes care of multiple
#   Original-To: headers by linking their contents with commas.

:0
* ^Original-To:.*[^ ]
{ ORIG_TO = `$FORMAIL -zcxOriginal-To: | tr \\12 ,` }

    #   Drop trailing comma from tr:

    :0 A
    * ORIG_TO ?? ,^^
    * ORIG_TO ?? ^^\/.*[^,]
    { ORIG_TO = $MATCH }

#   Likewise for Original-Cc: lines:

:0
* ^Original-Cc:.*[^ ]
{ ORIG_CC = `$FORMAIL -zcxOriginal-Cc: | tr \\12 ,` }

    :0 A
    * ORIG_CC ?? ,^^
    * ORIG_CC ?? ^^\/.*[^,]
    { ORIG_CC = $MATCH }

#   Now, let's install the changes if needed:
#   with -A instead of -I or -i it should not clobber
#   existing To: or Cc: information.
#   -A : Append a custom header field onto the header in any case.

:0
* ORIG_TO ?? ^^^^
* ORIG_CC ?? ^^^^
{ }

    :0 E fhw
    | $FORMAIL                              \
        ${ORIG_TO:+-A "To: $ORIG_TO"}       \
        ${ORIG_CC:+-A "Cc: $ORIG_CC"}
#   by [alan]
#   See if addressed *directly* to me, and ..
#   ..has not already been forwarded

KEY           = "TheMagic"
FORWARD_EMAIL = "[email protected]"

:0
*$ ^To:.*$LOGNAME(@|[^0-9a-z]|$)
*$ ! ^$MYXLOOP
{
    #   now let's encrypt the body using mimencode

    :0 fbw
    | echo "MIME-Version: 1.0" ;                    \
      echo "Content-Type: application/crypt" ;      \
      echo "Content-transfer-encoding: base64" ;    \
      echo "" ;                                     \
      crypt $KEY | mimencode -b

    #   Now let's prepare the headers for forwarding the mail,
    #   and mark it so we don't loop

    :0 fhw
    | $FORMAIL

    :0
    ! $FORWARD_EMAIL
}
:0 fbw
* B ?? PGP ENCRYPTED MESSAGE
| pgp -z "your pass phrase" -f +batch 2>&1
:0
* From [email protected]
{
    :0 h
    * >10000
    /dev/null

    :0 h
    * ^Subject:.*no keys match
    /dev/null

    :0 E
    | pgp +batchmode -fka
}

#   auto key retrieval
#   I have an elm alias, pgp, points to a key server
#   The log file gets unset briefly to keep the elm lines out of
#   my log file.

:0 W
* B ?? -----BEGIN PGP
* ! H ?? ^FROM_DAEMON
{
    KEYID = `/usr3/adam/bin/sender_unknown`
}

LOGFILE=

#todo: We should get rid of the 'elm' dependency here.
#todo: correct this sometime... [jari]

:0 ahc
* ! ^X-Loop: Adams autokey retrieval
| $FORMAIL -a"X-Loop: Adams autokey retrieval" | \
  elm -s"mget $KEYID" pgp
#!/bin/sh
#
# Script: sender_unknown
#
# unknown returns a keyid, exits 1 if the key is known. $output
# is to get the exit status. Otherwise, this would be a one
# liner.

OUTPUT=`pgp -f +VERBOSE=0 +batchmode -o /dev/null`
echo $OUTPUT | egrep -s 'not found in file'
EV=$?

if [ $EV -eq 0 ]; then
    echo $OUTPUT | awk '{print $6}'
fi

exit $EV

# end of sender_unknown
Yes, control is returned to the original file from which the includerc was called. And no, mail does not get delivered to $DEFAULT just because the includerc ends: processing continues until there are no more statements in the top level. An includerc is nothing more than a slice of the top-level rc file.
somewhere that I don't know beforehand (or I have just forgot to tweak my .procmailrc)
ME    = "([email protected])"
LISTS = "(procmail|list-a|list-b)"

:0                      # Idea by Bill Moseley
*$ ! ^TO_()$ME
*$ ! $LISTS
{
    #   Could be UBE or I might be on a unknown distribution list.

    INCLUDERC = $PMSRC/pm-ubecheck.rc
}
[dan] That would work; common practice, however, is to put recipes for filing mail from lists (and, per Bill's preferences, anything mentioning procmail in the head gets treated the same as mail from this list) first; then the only remaining condition to consider there would be unexpected blind carbons: * ! ^TO_moseley. This method is good if you get much more spam than legitimate mail (including mail from list subscriptions as legitimate) and you want procmail to deal with spam right away. I belong to several very active mailing lists, so I actually receive more pieces of legitimate mail than pieces of spam. One way to get the best of both worlds is this:
*$ ! ()\/(^TO_$LOGNAME|procmail|list-(ABC|123|XYZ))
because then, if the regexp matches (and thus the negated condition fails and you don't detour into $PMSRC/checkspam.rc), MATCH is already set to the name of the mailing list, and you can do further tests by just examining MATCH (or a variable you copy it into) instead of repeating a complete head search. [I prefer to use the variable $LOGNAME rather than hard-coding my name because then others can use the code, and I can use it unchanged on sites where my logname is different, and if my logname is changed my procmailrc will keep up with it.]

For example (I've separated the conditions into two lines so that, per Bill's preferences, a mention of procmail in the head will get the message into the Procmail List folder, even if a match to $^TO_$LOGNAME is also present and appears sooner):
:0
*  ! ()\/(procmail|list-(ABC|123|XYZ))
*$ ! ^TO_$LOGNAME
{ INCLUDERC=$PMSRC/pm-ubecheck.rc }

#   The next recipe has an `E' flag, so it will be examined only
#   if the preceding one didn't match; thus if $MATCH was set
#   inside pm-ubecheck.rc, it won't hurt anything here, and a
#   value for $MATCH set in pm-ubecheck.rc won't be mistaken for
#   a list name:

:0 E:                   # MATCH is non-null only if it matched a list name
* MATCH ?? (.)
$MATCH

#   Remaining recipes will be read only for two types of mail:
#   those that met $^TO_$LOGNAME but not any expected list name,
#   and those that went through pm-ubecheck.rc but came out
#   undelivered.
:0 * ! { }
It says that "If variable WSPC does not contain a space, then load the module". If the module has already been loaded by some other rc file, WSPC would exist. If it does not exist yet, then the module is loaded. This is a classic example of conditionally loading functions or variables into the current module:
Justin Lloyd <jlloyd A T harris.com> suggests a general way of caching the included rc files. Use a top-level script that records every module that was included. The module is loaded only if it is not yet included:
pm-xximport.rc
:0
*$ ! INCLUDE_CACHE ?? ()\<$RC\>
{
    #   Module was not there yet, add it to the list

    INCLUDE_CACHE = "$INCLUDE_CACHE$RC$NL"
    INCLUDERC     = $RC
}
This is a different approach than the previous one. Instead of checking features, the presence of a module is checked. These are two sides of the same coin and can be used for the same thing. You can pick either solution but here are some thoughts: Adding an extra top-level INCLUDE_CACHE is extra work. Procmail must open a separate top-level rc file every time with the call
RC="pm-xxscript.rc"
INCLUDERC=pm-xximport.rc
If the feature already existed, you would still have to open the pm-xximport.rc file on every call to find that out. E.g. here pm-xximport.rc is called 3 times no matter whether 1, 2, 3 were already present
With previous simple feature test, procmail can evaluate the condition in place without the need of opening separate file:
Note however, that both suggestions accomplish the same thing; only the implementation is different. If the typical count of included RC files per module were big enough, I'd use Justin's way. Usually it's only a few, say one or two, whose purpose is to define variables or get date information.
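The caching idea itself is not tied to procmail. As an analogy only, here is the same "load once" bookkeeping in shell: a newline-separated cache variable and a function that reports whether a name still needs loading. `include_once` is a hypothetical helper, and it matches whole cache lines where the recipe matches whole words.

```shell
#!/bin/sh
# Newline-separated list of module names that were already included.
INCLUDE_CACHE=''

# Hypothetical helper: return 0 (and remember the name) the first time
# a module name is seen, return 1 on every later call for that name.
# Call it directly, not inside $(...), or the cache update is lost.
include_once() {
    if printf '%s' "$INCLUDE_CACHE" | grep -F -x -q -- "$1"; then
        return 1            # already in the cache, skip loading
    fi
    INCLUDE_CACHE="$INCLUDE_CACHE$1
"
    return 0
}
```

A caller would write something like `include_once pm-xxscript.rc && . ./pm-xxscript.rc`, which mirrors setting RC and then including pm-xximport.rc.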
% ls rc*
but if you would like to print all procmail related files and back them up with one command, the starting prefix is better:
A name format could be pm-xxSCRIPT-NAME.rc for an rc file, where xx is the initials of first name and surname, like (J)ohn (D)oe. These scripts are product versions that can be distributed. There are also usually private scripts that handle other things, like mailing lists, work messages and so on. They would have a prefix my.
pm-jdscript.rc pm-myscript.rc
When downloading someone else's script it would be good if its name were unique according to the person who made it:
pm-ajscript.rc
                    call arguments

pm-xxscript1.rc --> +------------+
                    |  black     | --> it may call
                    |  box       |     other subroutines
                    |            | <-- pm-xxscript2.rc
                <-- +------------+

                    output values
Procmail does not have local variables, so you must put the variables to global name space. Let's see an example where subroutine uses MAILDIR for chdir purposes.
MAILDIR_xxscript1 = $MAILDIR        # save
...
MAILDIR = new location
...
...at the end of subroutine
MAILDIR = $MAILDIR_xxscript1        # restore
Here the original value is saved when the subroutine starts and restored when the subroutine exits. The global namespace (xxscript1) used is unique and is guaranteed not to clash with anyone else's. If pm-xxscript2.rc had also used MAILDIR, the saved value would have been in
PROCMAILVAR_xxscript2
and the two wouldn't mix up each other's MAILDIR. The general name for a saved variable is therefore:
PROCMAILVAR_scriptname
This follows the simple "onion" or "stack" model, where variable's value is saved before changing it and restored on exit point.
restore-x-1
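The onion model is easy to try out in shell, which by default has the same single global namespace. The functions below are an analogy, not procmail code: two nested "subroutines" each save MAILDIR under a name carrying their own suffix and restore it on exit. The function names, the /tmp paths and the variable seen_after_nested are all made up for the demonstration.

```shell
#!/bin/sh
# Inner subroutine: saves and restores MAILDIR under its own name.
xxscript2() {
    MAILDIR_xxscript2=$MAILDIR      # save
    MAILDIR=/tmp/xxscript2          # work somewhere else
    MAILDIR=$MAILDIR_xxscript2      # restore
}

# Outer subroutine: same pattern, and it calls the inner one while
# its own change is in effect.
xxscript1() {
    MAILDIR_xxscript1=$MAILDIR      # save
    MAILDIR=/tmp/xxscript1
    xxscript2                       # nested call must not disturb us
    seen_after_nested=$MAILDIR      # recorded only for demonstration
    MAILDIR=$MAILDIR_xxscript1      # restore
}
```

Because each level stores the old value under its own suffix, the inner call hands back /tmp/xxscript1 to the outer one, and the outer one hands back whatever the caller had.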
[script pm-xxscript.rc]

# ........................... public
XX_SCRIPT_FLAG = ${XX_SCRIPT_FLAG:-"default"}
XX_SCRIPT_VAR  = ${XX_SCRIPT_VAR:-"default"}

# ........................... private
charset = "a-z1-2"
regexp  = "something-that-matches"
Whether you need to stick prefix xx_script to the private variables depends on whether you call another includerc which may happen to use same names as you:
# watch this
# call another subroutine
# holy cow, it used same variable
In this case it would be wise a) not to define charset at the top of the file but to move the definition to just before the recipe where it is used or b) make the name unique, with xxScriptCharset.
Always include version number or last modification date somewhere. Prefer some version control tool, like RCS, VCS, ClearCase, whatever you have at hand. Use a variable name like dummy in appropriate places to tell what's happening in the code. Remember that the VERBOSE setting isn't much help if you can't tell by looking at the LOG where on earth the code is executing.
dummy = "start of pm-xxScript.rc"
...
dummy = "Now testing if we have control message XXX"

:0
* condition
{
    dummy = "Now testing if the command is YYY"

    :0
    * condition
    ...
}
...
dummy = "end of pm-xxScript.rc"
If you need the value of some common headers, don't just call formail like this, because the value may already be available prior to your includerc. For example, the user may already have needed the Subject value and stored it in a variable
[in pm-xxScript.rc] XX_SCRIPT_SUBJECT = `$FORMAIL -xSubject:' [User may have already read the content to SUBJECT] SUBJECT = `$FORMAIL -xSubject:' INCLUDERC = $PMSRC/pm-xxScript.rc Your pm-xxScript.rc launches an unnecessary formail call. Instead, use the existing SUBJECT. [user] :0 * ^Subject:\/.* { SUBJECT = $MATCH } ... XX_SCRIPT_SUBJECT = $SUBJECT # Note this!
INCLUDERC
= $PMSRC/pm-xxScript.rc ]
User should initialize the variable XX_SCRIPT_SUBJECT if he already has read the subject.
Add an X-Loop header, and test against it, if you are sending an automated reply. The X-Loop header prevents replying to a message that has already been answered.
    :0
    *  condition
    *  ! ^FROM_DAEMON
    *$ ! ^$MYXLOOP
    {
        # Ok, now we're clear to send an automated reply
    }
# pm-xxscript.rc -- one line description string here
#
#   File id
#
#       Copyright (C) 1997-98 Foo Bar
#
#       This code is free software in terms of GNU GPL v2 or later
#
#   Description
#
#       This subroutine Parses <what> from variable INPUT
#
#   Required settings
#
#       PMSRC must point to source directory of procmail code.
#       This subroutine will include
#
#       o   pm-xxScriptA.rc
#       o   pm-xxScriptB.rc
#
#   Call arguments (variables to set before calling)
#
#       o   INPUT, the string from where to parse...
#       o   VAR1, description, default is ...
#       o   VAR2, description, default is ...
#
#   Returned values
#
#       ERROR will have value "yes" if couldn't parse INPUT
#       OUTPUT will have result after successful parse
#
#   Example usage
#
#       :0
#       * condition\/.*
#       {
#           INPUT     = $MATCH
#           INCLUDERC = $PMSRC/pm-xxscript.rc
#           # OUTPUT has the result
#       }
#
#   Change Log: (none)
# ..................................................... &init ...

dummy = "init: pm-xxscript.rc start"

# Read the standard variable definitions if they are not yet
# defined: that is, "if the WSPC variable does not contain a space,
# as it should, then global variables haven't been read yet"

:0
* !
{ }

# .................................................... &input ...
# - User configurable variables with reasonable defaults
# - But parameters like "INPUT" that must be set beforehand
#   are not mentioned here.

VAR1 = ${VAR1:-"default1"}
VAR2 = ${VAR2:-"default2"}

# .................................................... &do-it ...

dummy = "subroutine: pm-xxscript.rc parses now that and that"

<the code>

dummy = "subroutine: pm-xxscript.rc end."

# end of pm-xxscript.rc
# by Lars Hecking <lhecking A T nmrc.ucc.ie>
#
MAJORDOM = "majordomo-(users|docs|workers)"

:0 w
*$ ^(Sender|To|Cc):.*\/$MAJORDOM
*$ MAJORDOM ?? ()\/$\MATCH
| $APPNMAIL $LISTS/$MATCH
Here is another, by Brock Rozen <brozen A T torah.org> with ideas from [dan]
# get the date in RFC822 format for insertion into some messages;
# the "Resent-Date:" field is copied from the "Date:" field on
# some systems. RFC1123 says "All mail software SHOULD use 4-digit
# years in dates..."

LIST_NAME = "myList"
LIST_ADDR = "$LIST_NAME [email protected]"
LIST_DATE = `date '+%a, %d %h %Y %H:%M:%S %Z'`
LIST_ERR  = "$EMAIL"                    # my admin address

:0 fhw
*$ !^X-List: $LIST_NAME
*$ ^TO()$LIST_NAME
| $FORMAIL -A "X-List: $LIST_NAME"       \
           -I "Resent-To: $LIST_ADDR "   \
           -i "Resent-Date: $LIST_DATE"  \
           -I "Errors-To: $LIST_ERR"     \
           -A "Precedence: bulk"         \
           -A "X-Loop: $COMSAT"

    :0 a
    ! -oi `cat /var/tmp/src/power-users.list`
[philip] Delivery mode is invoked using the -d flag. All arguments after the -d are user names. It is usually used by the MTA to deliver mail to users, and indeed, procmail will return failure if it is given an invalid user name. In delivery mode, procmail reads /etc/procmailrc before the user's .procmailrc. Note: Procmail will work in delivery mode only if it is setuid root, if it is invoked with the ruid of the recipient named in -d, or, under certain OSes where the build routines have determined that it is safe, if the euid is that of the recipient and the egid is the recipient's login group.

Mailfilter mode is invoked using the -m flag. It accepts only one rcfile as an argument; other arguments are either variable assignments or arguments that are made available to the rcfile itself as $1, $2, etc. If the specified rcfile is located under /etc/procmailrcs/ then procmail will take on the uid of the owner of that file. Otherwise, it will run as the user who invoked it. /etc/procmailrc, which procmail -d reads, is ignored. In mail filter mode, procmail unsets ORGMAIL and DEFAULT to suppress normal delivery: reaching the end of the rcfile results in the mail bouncing. If the rcfile sets either of them then procmail will attempt delivery to that mailbox if it falls off the end of the rcfile; however, the mailbox will have to be writable by the uid/user that procmail is running as. Note: Only one rcfile can be named on the command line, but names of other rcfiles can be passed in the positional parameters to be used later in INCLUDERC assignments.

Normal mode is invoked by not using the -m or -d flags. It accepts any number of rcfiles and variable assignments as arguments. Procmail runs as the invoking user in this mode. /etc/procmailrc is ignored.

So, to answer your questions: if procmail reaches the end of the specified rcfile, it bounces the mail (/etc/procmailrc is ignored). It is entirely up to the rcfile to determine whether the address is valid and where to put the message if it is.
R$+ < @ $=a . > $*      $#procmail $@ /etc/mail/procmailrc $: $1 < @ procmail > $3
R$+ <@ procmail > $*    $1 < @ example.com .> $2
So this sends anything of the form [email protected] through procmail and rewrites it as foo@procmail. The procmail script reinjects it; the reinjected message bypasses the call to procmail and is then rewritten back to [email protected].
The problem was that mail run through the local mailer procmail arrived all right (local mail), while mail from external senders (which dropped into my most used mailbox, where I use a procmail filter) did not. This made me think it was procmail, but because these mails came from outside, it was sendmail that was to blame.
[Jon Lewis <jlewis A T inorganic5.chem.ufl.edu>] Wouldn't you need write access to either /etc/aliases or /etc/procmailrc to setup mailing lists? Tell the ISP that procmail will greatly improve mail delivery and enable all users to filter out junkmail without ever seeing it. If they still refuse, find a better ISP.
> -rw-rw----
1 foo
After:
> -rw-------
1 foo
When the UMASK environment variable is more restrictive than the mode of the mailbox, procmail changes the mode of the mailbox. The default value of UMASK is 077. If you want to preserve group access to your mailbox, I think you can set UMASK to 007 in the rcfile:
UMASK = 007
Further note: the above UMASK suggestion in .procmailrc does not work. See comment by Gjermund Sørseth <gjermund A T nextel.no>
However the permissions on DEFAULT are handled before procmail even opens the .procmailrc, so changing the umask there will have no effect on the mailspool. [Scott J. Kramer <sjk A T lux.com>] it's documented in the MISCELLANEOUS of the procmail(1) man page:
If /var/mail/$LOGNAME already is a valid mailbox, but has got too loose permissions on it, procmail will correct this. To prevent procmail from doing this make sure the u+x bit is set.
when it chmod's the file to 600. As you've discovered, this is inconsistent with the SYSV (Solaris 2 anyway) default mailbox protection of 660, gid=6 (mail). I think that's an OS-dependent bug, with the `chmod u+x ...' as the workaround.
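As a rough illustration of the two mailbox modes discussed above (600 versus 660), the following POSIX shell sketch creates an empty file under umask 077 and under umask 007 and prints the resulting permission string. The helper name mode_under_umask and the temporary file are made up for this demonstration; it shows ordinary umask behaviour in a shell only, not what procmail does internally to the mail spool.

```shell
#!/bin/sh
# mode_under_umask is a hypothetical helper, not part of procmail.
# It prints the permission string a newly created file gets under
# the umask given as $1.
mode_under_umask() {
    (
        # The subshell keeps the umask change local to this call.
        umask "$1"
        dir=$(mktemp -d) || exit 1
        : > "$dir/mbox"                  # create an empty stand-in mailbox
        ls -l "$dir/mbox" | cut -c1-10   # keep only the mode column
        rm -rf "$dir"
    )
}

mode_under_umask 077    # prints -rw-------
mode_under_umask 007    # prints -rw-rw----
```

The workaround quoted from the man page would then be a one-liner such as `chmod u+x /var/mail/$LOGNAME`, assuming that is where your system keeps the spool file.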
[alan] I used to be the manager of the system support in the College of Engineering, at the University of California, Santa Barbara. We supported about 1500 users from two HP 9000 G30's, using one of them as the centralized mailer. Mail was available via NFS exported /usr/spool/mail to over 200 workstations, of all kinds: SGI, HP, Sun, etc. We replaced /bin/mail with procmail as the local mailer (Mlocal) because procmail correctly avoided NFS-locking problems, and it supported user-configurable mail filtering, without compromising system security. In over two years subsequent to the change, we had no loss of mail due to procmail being used as the local mailer. If you wish further comment from the current system managers, send mail to "postmaster A T eci.ucsb.edu". To answer your specific questions: * you can configure the permissions directly, by changing one of the following defines in config.h:
#define UPDATE_MASK     S_IXOTH    /* bit set on mailboxes when mail arrived */
#define OVERRIDE_MASK   (S_IXUSR|S_ISUID|S_ISGID|S_ISVTX)   /* if found set */
                /* the permissions on the mailbox will be left untouched */
#define INIT_UMASK      (S_IRWXG|S_IRWXO)                   /* == 077 */
#define GROUPW_UMASK    (INIT_UMASK&~S_IRWXG)               /* == 007 */
We did not find it necessary, however: We did disable all locking except dot-locking, since the kernel locks were the source of the NFS-locking problems. There have continued to be occasional locking problems, but these are "victim"-induced problems caused by using non-supported and discouraged mailers, such as "mailtool" from older Suns. These locking problems have nothing to do with mail delivery, but from the mail client using kernel-advisory locks, and then orphaning them or, leaving them locked all day long. An alternative to having users use .forward files, is to create a file of users who would like to use procmail as their local delivery agent, and use this file to initialize a class variable. Write a special rule in sendmail.cf which delivers mail using Mprocmail instead of Mlocal when the destination user is in the special procmail user class. This allows users who want procmail-direct delivery in spite of management worrying. I set this up to test procmail delivery on our system before changing Mlocal to use procmail. We placed some "volunteer" users in the procmail class file, and they never had any problems (I was one of them).
...I tried to make a softlink to ~/.forward, but then my procmail wouldn't run. When I made a real ~/.forward file, then it worked again. My question is why would procmail treat a link to a file any differently than the actual file itself?
ln -s ~/.procmail/forward ~/.forward
[Werner Reisberger <wr A T tribe.ping.de>] That's not a problem with procmail, this is an MTA issue. For security reasons sendmail will not deliver mail to files which are symlinks.

[david] procmail has restrictions on what permissions it will tolerate on an rcfile. For example (I'm just guessing here) it can tell whether it can read the target file but it cannot tell who might be able to write to it. This prevents a major security hole. You can make a hard link to the file, since a hard link is completely indistinguishable from the original file. But note: a file hard-linked to two or more names is very distinguishable from a file with only one (hard) link, and procmail, for example, will not deliver to a plain folder that has two or more hard links. You can also put the real file at ~/.forward and let ~/.procmail/forward be a symlink to it.

[<mikk0022 A T maroon.tc.umn.edu>] I suppose the reasoning behind procmail's folder policy is that procmail locks the file by name, not inode. Hence it cannot guarantee mutual exclusion for access to a file which has multiple names. My understanding of the .forward policy is that a symlink need not share the permissions of its target. Therefore somebody's .forward symlink could have proper permissions, while its target could be writable by others. This would allow anybody with write permission to execute any program (potentially) from the user's forward file. Two hard links share the same permissions, so this argument doesn't hold.
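To see the difference between the two kinds of link that this thread turns on, here is a small POSIX shell sketch. It uses a throwaway directory; link_count is a hypothetical helper that simply reads the link-count column of `ls -ld`. It illustrates only the filesystem facts mentioned above (a hard link raises the link count of the file, a symlink is a separate object), not the actual checks inside sendmail or procmail.

```shell
#!/bin/sh
# link_count is a hypothetical helper: print the hard-link count of a
# file, taken from the second column of "ls -ld".
link_count() {
    ls -ld "$1" | awk '{ print $2 }'
}

demo=$(mktemp -d) || exit 1

: > "$demo/forward"                     # the "real" file
link_count "$demo/forward"              # prints 1

ln "$demo/forward" "$demo/hardlink"     # second name for the same file
link_count "$demo/forward"              # prints 2

ln -s "$demo/forward" "$demo/symlink"   # separate object pointing at a name
link_count "$demo/forward"              # still prints 2

rm -rf "$demo"
```

A link count above one is the property that lets a program tell that a folder has more than one name, which is the case david describes procmail refusing to deliver to.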
[david] Elie sent the answer to me with a carbon to the list, but since reading my personal copy my inbox got trashed. As of this writing the list copy hasn't reached me, but the rest of that sentence (as I recall from reading it before it got hosed) was to the effect that procmail is then never invoked at all on your incoming mail; a .forward takes precedence over the LDA. That scenario never occurred to me. Thank you for explaining.
[Philip] Scratch the bit about /etc/procmailrcs/$LOGNAME. You're mixing up procmail -d with procmail -m.
Ah, got it ... after rereading the man page. The part about /etc/procmailrcs really can apply only when procmail is setuid root, so again it's something I've no experience with and never quite followed or retained. So no file in /etc/procmailrcs is ever used implicitly, but /etc/procmailrc can be.
[Philip] $HOME/.forward is handled by sendmail. If you have a forward, then sendmail rewrites attempts to deliver to you into attempts to deliver to the addresses listed in the .forward file.
Or in other words, the .forward takes precedence over the LDA. Thank you both.
[1998-06-24 PM-L phil] The -t flag causes procmail to return EX_TEMPFAIL where it normally would have returned EX_CANTCREAT. If you've made procmail the local delivery agent then you should add -t to the A= define, before the -d flag.
/var/qmail/boot/proc+df, to /var/qmail/rc. [1998-11-10 PM-L Greg Boes <gboes A T ashfordtech.com>] From the qmail FAQ (4.4 How do I use procmail with qmail?) Put
| preline procmail
into ~/.qmail. You'll have to use a full path for procmail unless procmail is in the system's startup PATH. Note that procmail will try to deliver to /var/spool/mail/$USER by default; to change this, see INSTALL.mbox.
[philip] Get procmail 3.11pre7 and uncomment and correct for your local setup the MAILSPOOLHOME="/.mail" define in src/authenticate.c. Compile and install. It's relative to the user's home directory. Thus the name MAILSPOOLHOME. [Ekkehard Knopp <knopp A T rz-online.de>] At the qmail home page you can find a patch for procmail 3.11pre7 called procmail-maildir-patch. If you can't find it, I can mail it to you. I have no problems with procmail and qmail. Works well.
[Christopher Lindsey <lindsey A T ncsa.uiuc.edu> 1998-03-09 PM-L] AFS is awesome! You just have to treat it nicely. :) The only viable solution that we've been able to come up with involves patching the procmail-3.11pre7 sources to "fake" user home directories out of another directory. For example, my home directory in AFS is
/afs/ncsa.uiuc.edu/.u1/lindsey/
It is kept as such on the mail server in /etc/passwd as well. However, we have some space set up via NFS in /var/forward with space for each individual user (so /var/forward/lindsey in my case). The procmail patch intercepts requests for the user's home directory and replaces it with the "fake" directory (the /var/forward one). So for all practical purposes, procmail thinks that my home directory is /var/forward/lindsey, and everything works fine.
unique "participate ID number" that you need to send back in order for the subscription to take in effect.
KILL_FILE = $PMSRC/.kill-immediately

:0
*$ ? $IS_READABLE $KILL_FILE
{
    KILL = `cat $KILL_FILE`
}

# 1) Make sure KILL has value
# 2) if match is found from header.
# 3) /dev/null does not need lockfile

:0
*  KILL ?? [a-z]
*$ $KILL
/dev/null
[sean] ...In the long haul, your best bet for dealing with this problem is to stamp out the offender: bring this harassment to the attention of their ISP and get their account closed. Repeat as necessary. Most of the mailing lists should have some record of the submission request. Even if forged, the abuser probably has their IP address in the headers somewhere (and if the person is actively subscribing your friend to so many lists and actually WORKING at covering their tracks, apparently you've REALLY crossed them). Most people who stoop to these immature harassment tactics aren't bright enough to fully cover their tracks. Another alternative to manually dealing with unsubs on certain lists is, once you've identified filterable characteristics of the lists, to BOUNCE them. Most semi-intelligent listserv implementations will unsub you if they get repeated bounces. Yea, not nice to the listserv maintainer, but then, if they had implemented a subscription verification system, it wouldn't have been a problem to begin with.
:0
* condition
{
    # may expose your .forward - but if you're bouncing lists,
    # it probably doesn't matter much.
    EXITCODE = 67

    # save header for examination.
    :0 h:
    bounce.log
}
You've got a sticky situation. You can't simply ditch all unrecognized mail - you need to be able to review potential refuse first, and take action on anything which doesn't belong (because you certainly don't want to continue getting the non-wanted lists till the end of eternity - you should want to unsubscribe from them to simplify your mail).
[sean] One or the other should do the trick (or both even): Go to your login file (what it is named depends on the shell you're using), and add:
biff -n
COMSAT = "no"
[manual] has information on the COMSAT variable. It also states (contrary to reasoning I gave in above) that COMSAT defaults to 'no' if you specify an rc file on the commandline (otherwise, it is on by default). Doing this latter one should keep procmail from generating COMSAT/BIFF notifications, but would still leave your shell capable of receiving them, say, if you only processed certain mail in procmail manually or some such. Personally, I turn biff off AND set the COMSAT off. I read my mail when I read my mail, and I check it often enough (with a POP client at that).
[Xavier Beaudouin <kiwi A T oav.net>] Check your /etc/inetd.conf for an in.comsat entry, add a '#' at the beginning of the line, save the file and killall -HUP inetd. This should stop this ;-)
I've seen other postings about corruption of the From line. Perhaps you have the same problem. [Christopher B. Smith <cbsmith A T envise.com>] I had the exact same problem with my upgraded OpenLinux system. For the record, if you are running the imapd that comes with it, you should really set the permissions for the directory as follows:
I got that feedback from the guy who wrote imapd, and it works very well.
[era] Perhaps preferably use ^TO_ if you have Procmail 3.11pre7 or newer. This is the classical case of using Procmail where you really need the envelope recipient information. The headers are not enough to determine who a message is for. If Procmail is your MDA, you can have this, but I'd still think something involving Sendmail would be more appropriate. For one thing, what if this user would suddenly really want to use Procmail? You can set DEFAULT and ORGMAIL for this one user in /etc/procmailrc to come around that, but the bottom line, as so many times before, is that Procmail might not be the right tool for this.
/var/mail lives on a Solaris 2.5 machine. Cucipop runs on the same machine, and it works fine there. But I want to have more than one machine running cucipop, and when I put cucipop on other machines (NFS clients), it takes 30 or 40 seconds longer to close the session.
[1998-06-23 PM-L Brad Knowles <brad A T colltech.com>] NFS mounting /var/mail is a good way to get bad performance, especially when you're doing any NFS writes. Even if you're not doing any NFS writes, just having to deal with local file locking and trying to translate that into NFS file locking is a nightmare (in general, file locking is one of the single biggest problems left with NFS).
> Procmail is working good on NFS, it finishes quickly. But when > cucipop is put on a NFS client, procmails starts to delay too.
Procmail probably isn't writing to NFS, or if it is, it's probably not using the same locking mechanism as cucipop. Unfortunately, each vendor and each program have their own ideas on how to best do that.
[philip] cucipop was written by the author of procmail. Ideally, when you compile cucipop you edit its config.h to use the locking techniques that procmail's autoconf process determined for your system(s). However, even if you didn't do that, cucipop uses the same dotlocking algorithm as procmail.
Also, keep in mind that any POP3 server will have to copy the mailbox in order to work on it, and many of them copy the mailbox to /var/mail/.username (you got it: creating lots of NFS writes). When they're done, they copy the mailbox back to /var/mail/username (after they have copied any new mail messages that have come in to the end of /var/mail/.username, and locked and then truncated the original /var/mail/username file).
[philip] cucipop doesn't use a temporary file: it keeps it all in memory. On deletes it updates the mailspool in place which should never lose data, though if the server crashes in the middle of this you can end up with one or more bogus messages.
This is a real nightmare when you start talking about users who select "Leave mail on server" and have multi-megabyte mailboxes.
[philip] Assuming you have enough memory, cucipop should be pretty fast.
I think maybe now you're starting to understand why POP3 really doesn't scale well at all in multimachine environments (unless you've cooked up a custom mail store that uses a real database backend, like Oracle Parallel Server), with /bin/mail (or procmail) as a writable interface to this message store and POP3 and/or IMAP as a readable (and writable) interface to this same message store. Then you can let the database vendors deal with the hard data replication and distribution problems. Otherwise, it's a pain-in-the-ass.
Have you tried QPopper from Qualcomm? It's the single best POP3 server I've ever run across, although I wouldn't put even it in an NFS write environment. BTW, I used to be the Mail Systems Administrator for GNN (Global Network Navigator), the web site/National ISP co-operative between O'Reilly & Assoc. and AOL. At our peak, we had hundreds of thousands of registered users, of which up to five to six thousand were logged in at any one time, with their MUA set to check their mail every minute. We had a single primary Mail/POP3 server machine (Dec Alpha 2100 w/ four 250Mhz processors, 4GB RAM, 28GB hardware mirrored/striped mail spool), and one warm spare (same CPU/RAM configuration, physically hooked up to the same disks, but through DECsafe ASE not mounting them unless the primary died).
[david] The man page says that a variable capture recipe assigns the standard output of the command to the variable. Since you are piping the output of formail and echo to sendmail, sendmail sucks up the standard output of formail and echo. Sendmail itself does not write to standard output, so the stdout of ( $FORMAIL -r ; echo $IM_NOT_HERE ) | $SENDMAIL -t is nothing. Thus you're assigning a null string to $LOG, and when procmail writes $LOG to the logfile you can't see a difference.
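The same effect can be reproduced in a plain shell, without procmail or sendmail. In the sketch below, sink, reply_header and reply_body are made-up stand-ins (for sendmail, formail -r and the echo, respectively); the only point is that a command substitution captures whatever reaches standard output at the end of the pipeline, which is nothing once a consumer that prints nothing sits last.

```shell
#!/bin/sh
# "sink" stands in for sendmail: it consumes stdin and prints nothing.
sink() { cat > /dev/null; }

# Stand-ins for "formail -r" and "echo $IM_NOT_HERE".
reply_header() { echo "To: someone"; }
reply_body()   { echo "I am not here"; }

# Piped into the sink: nothing is left on stdout for the capture.
piped=$( ( reply_header; reply_body ) | sink )

# Not piped: the capture receives both lines.
direct=$( reply_header; reply_body )

echo "piped=[$piped]"      # prints piped=[]
echo "direct=[$direct]"
```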
[stephen, <[email protected]>]. Remove fcntl() and lockf(), only allow flock() (or omit it completely) Kernel locks don't work. But that's all some programs use. Across a networked filesystem, lockf() doesn't work, fcntl() and flock() should, but they don't either because the lockd is buggy. Mailtool uses fcntl() but does it wrong, so that's another problem. The only thing that works on all platforms, all networks, all the time are .lock files. Makefile refers to:
#LOCKINGTEST=100        # Uncomment (and change) if you think you know
                        # it better than the autoconf lockingtests.
                        # This will cause the lockingtests to be hotwired.
                        #       100     to enable fcntl()
                        #       010     to enable lockf()
                        #       001     to enable flock()
                        # Or them together to get the desired combination.
/*#define NO_fcntl_LOCK         uncomment any of these three if you */
/*#define NO_lockf_LOCK         definitely do not want procmail to make */
/*#define NO_flock_LOCK         use of those kernel-locking methods */
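To make the three-digit LOCKINGTEST notation easier to read, here is a small POSIX shell sketch that spells out which kernel-locking methods a given value enables, following the Makefile comment quoted above (first digit fcntl(), second lockf(), third flock()). The function name decode_lockingtest is invented for this example, and printing "none" when no digit is set is just this helper's convention, not a statement about what the procmail build does with such a value.

```shell
#!/bin/sh
# decode_lockingtest is a hypothetical helper: list the locking methods
# selected by a three-digit LOCKINGTEST value such as 100 or 011.
decode_lockingtest() {
    mask=$1
    out=""
    case $mask in 1??) out="$out fcntl" ;; esac   # first digit:  fcntl()
    case $mask in ?1?) out="$out lockf" ;; esac   # second digit: lockf()
    case $mask in ??1) out="$out flock" ;; esac   # third digit:  flock()
    out=${out# }                                  # drop the leading space
    echo "${out:-none}"
}

decode_lockingtest 100    # prints fcntl
decode_lockingtest 011    # prints lockf flock
```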
SUMMARY: Look at qmail, it's better than sendmail. [era 1998-08-15 PM-L] (Blows dust off old underutilized Bat Book/ORA sendmail book) Yeah, setting QueueFactor (q) and QueueLA (x) to suitable values should do what you want. You need to have load-balancing support compiled in, though; according to the Bat Book, sendmail -d3.1 tells whether you have it or not. (Mine just says getla:0 which I would imagine means I have the support but the load average was below the cutoff level.)
AFAIK using load averaging would have the first messages delivered and the rest queued. However, also not being a sendmail guru, I do not know how to empty a sendmail queue for incoming mail only. Moreover, even if I knew how to do this, it would have to be done after procmail finishes.
[Liviu Daia <daia A T stoilow.imar.ro>] Instruct sendmail to queue messages when called from procmail:
then disable the normal sendmail daemon from your system init scripts, and run it in flush queue mode only, that is, replace
/usr/sbin/sendmail -q 15m
("15m" is how often the queue will be run (15 minutes). Change it to whatever is appropriate for your purposes). Also make sure to disable forking in your sendmail.cf. The downside of this approach is that it will also delay the delivery of local messages. Different approach: pipe messages to sendmail instead of using '!' and use the wait flag. Something along the lines of:
Well, I'm actually not sure you can use the 'w' flag without 'f' (the manual doesn't say it, and I'm not too familiar with procmail internals), so if that doesn't work you might also try

    :0 fw
    * conditions
    | $SENDMAIL $SENDMAILFLAGS <recipients>

    # dummy recipe to stop procmail from delivering an empty message
    :0 a
    /dev/null

Sendmail will rewrite the From_ header (which you can probably safely ignore), and it will (optionally) add a From: if one doesn't exist, but it won't touch an existing From:. Well, actually it will encode or decode any 8-bit characters in the From: according to the options in sendmail.cf, but it won't change the meaning of the "From:". In fact, that's exactly what procmail does too in the '!' recipes.
[philip] I assume that by "target machine" you mean the NFS server for the given user's account. Procmail's attempt to read ~/.procmailrc will timeout, then when it tries to write to $DEFAULT (which you say is in their home directory) it'll time out (again) and return EX_CANTCREAT to sendmail. Sendmail will then presumably bounce the message. Now, if sendmail is looking for .forward files in user home directories, then procmail will never be called, as sendmail will try to open the .forward file and consider it a transient error when it times out, causing the message to be queued for a later delivery attempt. (Note: invoking procmail with the -t flag causes it to return EX_TEMPFAIL instead of EX_CANTCREAT. This would cause the message to be requeued. However, this is not generally recommended.)
[philip] Btw, all the versions of /bin/mail (or mail.local) that I've seen the source for either read the entire message into memory first or use a temp file. Depending on where temp files are located, a 90MB temp file may be just as bad as holding it in memory. And, no, there isn't. Hacking it in would be non-trivial, mainly because the current code runs with the assumption that the entire message is there, and determining when it actually needs to see the entire body (to do demand loading) would not be easy. Remember that a condition on the size of the message, ala
would require the body to be read... It really is just better to simply have sendmail enforce the limit. You should be doing it there anyway to cut down on the totally trivial denial-of-service attacks and because it's more efficient.
...I am running procmail ver 3.11pre7 and I keep getting "out of memory as i tried to allocate 8xxxxxx bytes.". I have over 100 meg available swap space so i have a difficult time understanding this. Is this a known error?
Procmail's memory allocation technique appears to be non-optimal for some OS/libc combos, namely with some implementations of the libc function realloc() (FreeBSD has been reported). It's conceivable that the configuration process could be enhanced to detect this system limitation and use a strategy that is more efficient on such systems. Don't hold your breath.
[10239] Sat Jan 9 08:49:02 1999 Out of memory "formail" " formail -A "X-Check: List"" **Bounced** 5744 Notified comsat: "bhoule@:**Bounced**"
If I act quick enough when this happens, I can look in spool/mqueue and find a message with a gazillion addresses in the To: line. So it seems that formail is having trouble adding my X-Check header to an already large set of headers.
[philip] No, it's procmail that's unable to allocate enough memory. The buffer dumps indicate that procmail was unable to get enough memory somewhere between parsing the action line and reaching the next recipe: buffer 0 would not contain the string "formail" if procmail had gotten to another recipe or variable assignment. What's weird is that the message is so small (only 5744 bytes according to procmail). Do you only see this error on this recipe, or at random places in your .procmailrc? If the latter, then I would guess that your mailserver is running out of memory for some other reason and that procmail happens to be an innocent bystander. If the former, then, well, I'm not sure.
The message is never delivered to me. Is there anything I can do so that procmail/formail will act as if it was never there so the incoming dumps into my inbox rather than returning an error to the mailer? This "*Bounced*" business is not a very helpful action.
Giving procmail the -t flag will cause fatal internal errors that are normally returned as permanent errors to be returned as temporary failures instead. Otherwise there's no way to control that. (Setting EXITCODE won't work because procmail needs to malloc memory to handle TRAP and EXITCODE, and it'll refuse to try that when it was malloc that caused the exit.)
[david] DEFAULT is initially defined as equal to ORGMAIL. Once procmail has started reading /etc/procmailrc (if it is the MDA) or your .procmailrc, you can change the value of either without affecting the other. In fact, you can even set DEFAULT on the command line when you invoke procmail (I'm not sure about doing that with ORGMAIL, though), and that value will override its normal initial value equal to ORGMAIL. What if dropping to DEFAULT could fail because the disk is full? Then you had better have another drop place in another file system. Peek at bdf(1) or df(1) to find out the different mounted file systems.
# Place this to the end of your .procmailrc and define
# DEFAULT_SECONDARY

:0 :
$DEFAULT

    :0 E
    $DEFAULT_SECONDARY
If you deliver explicitly to $DEFAULT, procmail treats it like any other save-to-folder recipe, and if the write fails, it continues reading recipes.
...If I had set the "deliver" destination as ORGMAIL rather than DEFAULT, would it have made any difference?
Nope. If you write a recipe for it, procmail just expands the variable and doesn't give a heck if it happens to be the same destination as DEFAULT or ORGMAIL. DEFAULT is special to procmail only when it uses it on its own after falling off the end of the rcfile; ORGMAIL is special only at startup (without -m) and when procmail falls off the end of the rcfile and finds that it cannot save the message to DEFAULT.
In general, if procmail falls off the end of the rcfile, fails to save to DEFAULT, and then fails to save to ORGMAIL, does it revert to the compiled-in value of ORGMAIL?
[philip] Procmail has no fallback beyond the current value of ORGMAIL. If delivery to both DEFAULT and ORGMAIL fail, then procmail gives up and exits with error code 73 (EX_CANTCREAT) or 75 (EX_TEMPFAIL), depending on whether the -t flag was given. Setting EXITCODE would probably override those. The message is logged as "*Bounced*".
That is, it tries to deliver to $DEFAULT and if it can't, it tries $ORGMAIL. If that fails too ("deep, deep trouble" as Stephen says in the man page), it exits without delivery and reports failure to the MTA, which, depending on other factors, will either requeue the letter and try delivering later or will bounce it to the sender.
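The exit-code behaviour described here can be summarised in a few lines of POSIX shell. This is only a sketch of the description above: failure_code and mta_reaction are invented helper names, the numeric values are the ones from sysexits.h, and the MTA side is deliberately simplified (a real MTA takes other factors into account, as noted above).

```shell
#!/bin/sh
# Exit codes as defined in <sysexits.h>.
EX_CANTCREAT=73
EX_TEMPFAIL=75

# failure_code (hypothetical helper): the code procmail is described as
# returning when delivery to both DEFAULT and ORGMAIL fails.
# $1 is "yes" if procmail was started with -t, anything else otherwise.
failure_code() {
    if [ "$1" = "yes" ]; then
        echo "$EX_TEMPFAIL"
    else
        echo "$EX_CANTCREAT"
    fi
}

# mta_reaction (simplified assumption about the MTA): a temporary
# failure is requeued, zero means delivered, anything else is bounced.
mta_reaction() {
    case $1 in
        0)  echo "delivered" ;;
        75) echo "requeue" ;;
        *)  echo "bounce" ;;
    esac
}

mta_reaction "$(failure_code yes)"    # prints requeue
mta_reaction "$(failure_code no)"     # prints bounce
```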
[philip] DROPPRIVS only has an effect inside the /etc/procmailrc used when procmail is running in delivery mode (-d), not when it's running in mail filter mode (-m). USER and LOGNAME have no effect on the working of DROPPRIVS, as procmail is just going to change to the uid/gid of the user specified on the command line after the -d. Your mailtable entry should be specifying the procmail mailer, which runs procmail in mail filter mode. If the following are true:

    procmail is running in mail filter mode
    no assignments were given on the command line
    the -p flag was not specified
    the rcfile specified is located under /etc/procmailrcs/ without backwards references ("/../"s)
    the rcfile is not a directory (duh!)

then procmail will assume the uid and gid of the owner of the rcfile. If the rcfile is actually a symlink, then procmail will assume the uid and gid of the link itself, not the underlying file. If your OS allows anyone to give away ownership of files with chown, then procmail adds the following restriction to those above:
CONTENT CONTENT
# Won't work # ok
But accessing other user's home is another story. You could change the SHELL temporarily to get procmail understand the reference, like this:
Because the tilde is in $SHELLMETAS, when procmail sees a tilde, it will invoke a shell. It's better to skip the extra shell process and use the $HOME variable: put a symlink somewhere under your own home directory that points to the other user's file, so that you can use the $HOME variable in your .procmailrc and avoid the shell invocation. However, this has its dangers too, because the sysadmin may move home directories and your symlinks may then be out of date. If you expect such changes and broken links, then you could cache the needed home directories at the time you need them:
HOME_PHIL HOME_ED
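A minimal sketch of such caching, assuming that the shell named in SHELL expands ~user (most POSIX shells do, the original Bourne shell does not) and that the accounts phil and ed exist. The subject test and the folder name Mail/incoming are hypothetical. Each backquoted command costs one shell invocation, so it pays to do this only in the branch that actually needs the directory:

```procmail
# Assumption: /bin/sh on this host understands ~user.
SHELL     = /bin/sh
HOME_PHIL = `echo ~phil`
HOME_ED   = `echo ~ed`

# Later recipes use the cached value. There is no tilde in the
# folder name, so no extra shell is started for the delivery.
:0 c:
* ^Subject:.*shared report
$HOME_PHIL/Mail/incoming
```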
HOST = some.specific.machine
assignment, followed by code for mail delivered on that machine. If the first nine characters of "some.specific.machine" matched the real value of $HOST, procmail would stay in that rcfile; on a mismatch, it would jump to the second rcfile named on the command line. The second rcfile would probably be for another particular machine, so (unless it first had some universal code for all machines except the first one, or unless there were only two machines where procmail might run) right at the top it would have
HOST = this.specific.machine
Again, a match for the first nine characters would keep procmail reading this rcfile, but a mismatch would make it jump to the next rcfile. And so it went. An incorrect HOST assignment (note that "HOST" alone attempts to unset the variable, so it is always an incorrect assignment) in the last rcfile on the command line made procmail drop the message and exit. Since we almost never name more than one rcfile on the command line now, attempting to unset HOST in .procmailrc will have that effect. I would guess that the only use of this original setup still around is in SmartList, where flist invokes procmail with a number of rcfiles on the command line and uses things like HOST=go.to.the.next.rcfile.now to move from one to the next. Also, procmail's -m facility (which didn't exist back then) is incompatible with using HOST to jump among rcfiles, because it requires naming exactly one rcfile on the command line. Nowadays we can do something like this to use different rcfiles on different hosts:
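A minimal sketch of a per-host setup, assuming one rcfile per machine under a hypothetical $HOME/.procmail/ directory; the host name alpha.example.com is made up. HOST is only read here, never assigned, so procmail does not leave the rcfile:

```procmail
# Hypothetical layout: one rcfile per machine, named after
# whatever procmail reports in $HOST, for example
#   $HOME/.procmail/rc.alpha.example.com
INCLUDERC = $HOME/.procmail/rc.$HOST

# Or test for one specific machine explicitly:
:0
* HOST ?? ^^alpha\.example\.com^^
{
    INCLUDERC = $HOME/.procmail/rc.alpha
}
```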
Note: Beware of simply setting LINEBUF to a huge value: such an assignment causes procmail to immediately allocate twice that much memory (procmail has two buffers internally of size $LINEBUF).

[philip] Those 160 lines of condition are almost certainly overflowing LINEBUF. You should either a) use one of the innumerable recipes sent to the list demonstrating the use of fgrep; b) break it into multiple recipes; or c) increase LINEBUF. If you modify this list of domains regularly, then you should strongly consider (a), as (b) and (c) just put off it happening again.

LINEBUF only applies to lines from procmailrcs. You generally only have to worry about LINEBUF when you have a variable expansion or command expansion (back quotes) that doesn't have an obvious and reasonable bound on its size. procmail will avoid overrunning its LINEBUF length buffer when doing command expansions by ignoring the extra output, so you're safe there, as long as data truncation is fine. Variable expansion isn't checked like that, so you can cause procmail to core dump by doing something like:
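The recipe meant here can be sketched as follows. This is an assumed reconstruction from the surrounding description, not necessarily Philip's original: the action contains no character from SHELLMETAS, so procmail expands $MATCH into its own LINEBUF-sized buffer and runs echo directly.

```procmail
# Assumed shape of the risky recipe: $MATCH holds the whole
# Subject: text and is expanded by procmail itself.
:0 hic
* ^Subject:[ ]*\/.*
| echo $MATCH
```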
then feeding procmail a message with a huge Subject: header field: since no shell meta characters appear in the action, the action line will be expanded and exec()ed by procmail directly instead of by the shell. On the other hand, the following is fine:
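A sketch of the safe variant, again an assumed reconstruction rather than the original recipe. The only difference from a plain `| echo $MATCH` action is the trailing semicolon, which is one of the SHELLMETAS characters:

```procmail
# The ";" makes procmail hand the whole action line to $SHELL,
# so the shell, not procmail, expands $MATCH.
:0 hic
* ^Subject:[ ]*\/.*
| echo $MATCH ;
```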
The semicolon forces a shell invocation, and the shell should be safe. If your /bin/sh can buffer overrun on variable expansion, then you're in more trouble than you know. Action lines aren't the only place to watch your variable expansions. Variable assignments and condition lines that have a leading dollar sign also undergo expansion. For example, this isn't safe:
procmail won't buffer overrun in the first line, but a really long subject could cause the second to do so. The following should be safe:
but even then only if you're sure the shell is doing the expansion of NEWSUBJ. Note that matching against the value of a variable (using the "var ??" condition special) is safe no matter what the size of the contents of the variable. The problem is when you interpolate the variable into something else.
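A minimal sketch of the safe "var ??" form, assuming a hypothetical folder urgent.mbox and made-up keywords. The captured subject is only matched against; it is never pasted into an action, an assignment, or a `$`-expanded condition:

```procmail
# First condition captures the subject text into MATCH,
# second condition tests the value of MATCH itself.
:0 :
* ^Subject:[ ]*\/.*
* MATCH ?? (urgent|asap)
urgent.mbox
```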
Is there any easy way to know the default LINEBUF value for a specific procmail?

I'm sure there's a much easier way, but this will work:
    # Mitsuru Furukawa
    #
    OUT = $HOME/tmp/linebuf.lst

    :0 wc: $OUT$LOCKEXT
    *$ ! ? $IS_EXIST $OUT
    | echo "$LINEBUF" > $OUT
[philip] If you examine the procmailrc manpage, you'll note that it lists fourteen variables (among them DEFAULT but not LINEBUF) whose values are reset in the environment by procmail, plus some additional ones like IFS, ENV, PWD, and PATH which come out of the top of config.h. Following this is a list of all of procmail's magic variables, including those fourteen. The idea is that while procmail has thirty magic variables, only fourteen of them are put into the environment by procmail. The others may have default values, but they're 'input only': if what you're doing depends on one of the others having a certain value, then you should just go ahead and set it to that value.

I know of only two ways to find out what value procmail is using by default: a) check the manpage (the manual pages should show the correct default for the machine), or b) fire up your favorite debugger and hope that no one stripped the procmail binary.

There will be no error message when Procmail dumps core, even though the reason is apparently
[david] Yes, both before and after variable expansion and command substitution, it must be shorter than LINEBUF characters. The exceptions are (1) comments and (2) commands that are run by a shell rather than directly by procmail. The entire condition must be under LINEBUF characters. Unfortunately, LINEBUF seems to be a write-only variable; you can change its value but you can't find out its current setting.
    LOG = " This message goes to LOGFILE"
    LOG = " $NL$NL And this has linefeeds around $NL$NL"
Or like this, which proves to have some nice feature in respect to VERBOSE setting:
    dummy = " This message goes to LOGFILE"
    dummy = " $NL$NL And this has linefeeds around $NL$NL"
You see, if you set VERBOSE="off", then the dummy lines are not printed and recorded to the LOGFILE. LOG messages are always printed, and that's not very nice if you're trying to suppress messages while you call some subroutine:
    saved   = $VERBOSE
    VERBOSE = "off"
    #
    # Hope this subroutine does not use LOG
    # Eg. $PMSRC/pm-jaaddr.rc
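The whole save, call, and restore pattern can be sketched like this. It assumes the subroutine is pulled in with INCLUDERC and that $PMSRC points to the directory holding it; the final dummy line is only there to show the effect:

```procmail
saved     = $VERBOSE
VERBOSE   = "off"

# Assumption: the subroutine is an rcfile loaded with INCLUDERC.
# Its "dummy = ..." trace lines stay out of LOGFILE; any
# "LOG = ..." lines inside it would still be written.
INCLUDERC = $PMSRC/pm-jaaddr.rc

VERBOSE   = $saved      # restore the caller's setting
dummy     = "Visible in LOGFILE only when VERBOSE is on"
```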
    TRAP = 'echo "
    FROM $FROM
    TO/CC $TO / $CC
    SUBJECT $SUBJECT
    FOLDER $LASTFOLDER
    " | sed -e "s#FOLDER /dev/#FOLDER #g"'
And if your MUA expects the file to be touched before it sees new incoming mail, here is a recipe by [david]:
Place it early in your rcfile; then each recipe that saves to a directory can look simply like this, and the trap will take care of the touching:
[david] Procmail terminates when it exits ... after final delivery of the message. It doesn't terminate (nor execute TRAP) after delivering a copy to a c recipe [however, a clone does execute TRAP when it terminates, unless you unset TRAP for it]. It doesn't execute the trap after a variable assignment, a variable capture recipe, a filtering recipe, nor any other non-delivering action. On the other hand, it does execute the trap if you do a quick bail-out by unsetting or missetting $HOST. [Recipe to record Subject lines on exit] [Some Message had doubled Subject lines; david] ...this will list all subject lines in the log file upon exit if there are two or more. The earliest would appear twice: once in the trap output and once in the logabstract.
    :0
    * ^Subject:.*$(.+$)*Subject:
    {
        # If there is already `TRAP' set, combine the
        # old trap recipe with this
        TRAP = "${TRAP:+$TRAP ; }$FORMAIL -XSubject:"
    }
[era] Procmail does interpret UMASK this way, so this works, but I don't think it's a particularly good solution. It's actually hinted at in the documentation for UMASK in procmailrc(5). find is a rather heavy program to start up every time you want to look for mail. (Haven't done any timings, though.) I just grep -c '^From ' on my mail folders to see how many messages there are in them. (This is only an approximation, in the case where one or more messages contain unescaped From_ lines.)

For a really pedestrian solution, keep all your spool files in their own directory (I think this is a good idea for other reasons as well) and do an ls -lrt on that directory, possibly piped into a sed script to trim off files with time stamps older than, say, 24 hours. If your mail reader will reset permissions on spool files when it gets mail from them, the UMASK trick is a good base for a mail checking script, but I would then only ls -l the spool files and look for files with an x01 permission.
[philip] That's a feature, not a bug! To quote the procmailrc(5) manpage: UMASK: The name says it all (if it doesn't, then forget about this one :-). Anything assigned to UMASK is taken as an octal number. If not specified, the umask defaults to 077. If the umask permits o+x, all the mailboxes procmail delivers to directly will receive an o+x mode change. This can be used to check if new mail arrived.
Anyhow, normally, under Unix, the create system call will set default permissions of 666 and the umask can only be used to mask off the bits you don't want (and not to e.g. add x bits). Shouldn't Procmail work this way, too, just to be consistent with the rest of the system?
creat() will set the permissions to whatever you want it to, modulo the umask. If the umask is zero, you can set the permissions to 7777, though that would be kind of stupid (and actually, most versions of UNIX won't let you set both the sticky bit and an executable bit unless you're root, for historical reasons). Most programs that call creat() or open(..,O_CREAT,...) give a mode argument of 0666, as they generally don't write out executables. Procmail just happens to call open() with a mode argument of 0667, to be modified by your umask.
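As a worked example of that arithmetic, here is a minimal sketch. The value 076 is one possible choice, and the list address and folder name are hypothetical. With this umask a mailbox created by procmail gets mode 0667 & ~076 = 0601, which is the "x01" pattern a mail checking script can look for:

```procmail
# Mask off all group bits and read/write for others,
# but leave o+x permitted:
#   0667 & ~076 = 0601   (rw------x)
UMASK = 076

:0 :
* ^TO_mylist@example\.org
mylist.mbox
```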
    # Side effect: Do something with shell
    dummy = `echo hi there > some-file.txt`

    :0 hwic
    | echo "hi there" > some-file.txt
Procmail sends the whole message to the first form and only the headers to the second recipe. Answer: It doesn't matter. Either way procmail will make one write system call which will return 0 [bytes written] and off it goes. You should use the first one, because the latter affects the A and E flags later; the first one is also clearer overall. While someone suggested the following, it was rejected because it hurts performance more [stephen]. The cat process is useless and redirecting to /dev/null does not buy anything.
    /disk3/home/foobar/Mail 119) ls -la backup
    total 22
    drwx------ 2 stanr  512 Nov 11 21:00 .
    drwx------ 3 stanr 2560 Nov 11 21:11 ..
    -rw------- 1 stanr 3063 Nov  4 03:31 .nfsA0c724.4
    -rw------- 1 stanr 1780
    -rw------- 1 stanr  849
    -rw------- 1 stanr 2293
    -rw------- 1 stanr 2598
    -rw------- 1 stanr 3127
    -rw------- 1 stanr 1884
    /disk3/home/stanr/Mail 120)
3 3 11 11 11 11
[david] procmail uses a temporary name while it is trying to write a file out, which it renames if things go well. I noticed that they all came from a 4h 31 span overnight; perhaps there was some systems work being done on your machine that screwed things up?
[aaron] When a file that is being used by a program on an NFS client gets unlinked, the NFS server renames it to something like that. It should then actually get unlinked when the file is closed, but it looks like the NFS server never got the close message for those.

[Keith Pyle <keith A T ibmoto.com>] It is a result of using NFS, but the fault lies with the operating system on the NFS client. Keep in mind that NFS is stateless from the perspective of the NFS server. It keeps no information on how any file is being used. So, if a client tells the server to delete the file, the server deletes the file. This is not normally a problem, but many programs use a "trick" of Unix where the program opens a file, unlinks (deletes) it, and then continues to use the file. For all local files, the Unix kernel will not actually delete the file until all processes which have the file open exit. This works very well for temporary files.

If a client tells an NFS server to delete a file, it will delete the file immediately because of the stateless nature of NFS. The server has no way of knowing if any client still has the file open. To avoid this problem, if a client unlinks an open file on an NFS filesystem, the file is renamed to .nfs* where * is a unique value. The NFS client system is supposed to delete the .nfs* file when the process exits. However, there are some versions of Unix which do not do this well (e.g., AIX). If one of these OS's is used, it is common to find .nfs* files in various places. Therefore, it is a good idea for system administrators to periodically purge any .nfs* files over a certain age to eliminate the unsightly buildup in the filesystems.
19.39 Parameter $@
[david] As of version 3.11pre7, procmail does not grok "$*", nor does it grok "$@" outside a pipe or forward action. The only way to get the positional parameters all quoted together into "$*" is something like this. (This doesn't work after all.)

    :0 ir
    ARGS=| echo "$@"

After that you use "$ARGS" instead of "$*". If you try to set ARGS with ARGS="$@", procmail doesn't substitute for "$@" and makes $ARGS null.
If you try ARGS="$*" you get the literal text '$*'.

[philip] Of course, $ARGS differs greatly from $@ in that $ARGS will either be split on whitespace (if unquoted) or be one argument (if double-quoted). $@ has the cool property that if double-quoted it'll still be split into multiple arguments on the original argument boundaries. Since full-blown mail addresses often have spaces, this distinction should not be casually dismissed. Note that while you might not type in such addresses, your MUA's reply builder may.
[philip] It won't work as expected. The problem is that environment variables (and therefore procmail variables) are null-terminated, and therefore cannot contain a null. The above line creates an empty variable. The solution is to use an inverted character class:
Note that procmail handles 8-bit characters except for null in procmailrcs, so you can use a literal control-A and octal-377 in your .procmailrc and save an echo and shell invocation right there.
[phil 1996-03-21] The difference is that ^TOalias1@site may match something like bobs-alias1@site while ^TO_ won't.

[elijah 1997-09-16] Let's rewrite that in perl /x format; see below. The definition of the word boundary is in block (E). The ^TO_ expansion was added in v3.11pre4. With an older version you'll probably have to use just ^TO (no '_'), which should work almost as well.
    /                       # [begin regexp]
    (                       # [Block (A)]
     ^                      # Anchor to start of line
     (                      # [Block (B)]
      (Original-)?          # Optionally precede (C) with "Original-"
      (Resent-)?            # Optionally precede (C) with "Resent-"
      (                     # [Block (C)]
       To                   # "To"
       |Cc                  # or "Cc"
       |Bcc                 # or "Bcc" {very rare in practice}
      )                     # [end (C)]
      |(                    # [Block (D)]
       X-Envelope           # Precede "-To" with "X-Envelope"
       |Apparently          # or "Apparently"
       (-Resent)?           # with optional "-Resent" appended
      )                     # [end (D)]
      -To                   # "-To"
     )                      # [end (B)]
     :                      # ":"
     (                      # [Block (E)]
      .*                    # any text
      [^-a-zA-Z0-9_.]       # any single char other than letters, numbers,
                            # hyphen (-), underscore (_), or period (.)
     )                      # [end (E)]
     ?                      # Block (E) is optional
    )                       # [end (A)]
    /x                      # [end regexp]
[by Vikas Agnihotri <vikas A T insight.att.com>] Block (E){see TO_ macro explanation} is there to slurp up that part. The <encapsulation> is not needed, and a case such as:
Will confuse a test for "^TO_jester@". Yes, I have seen people do that stuff, apparently not even maliciously. And although valid following is also valid
[Elijah continues] it will also confuse the regexp. I don't like the ^TO and ^TO_ macros for most things and typically use stuff like this:
It still can be confused, but the things that will cause problems are fairly rare in practice. You might prefer something like this:
    To: ([email protected]) {address}
    To: (fake {address}) [email protected]
    To: Alice <[email protected]>, "W. Rabbit (late)" <[email protected]>, Gentle Reader <{address}>
    To: [email protected], [email protected], [email protected], {address}, [email protected]
    (^(Precedence:.*(junk|bulk|list)
      |To: Multiple recipients of
      |(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )
       ([^>]*[^(.%@a-z0-9])?
       (Post(ma?(st(e?r)?|n)|office)
       |(send)?Mail(er)?
       |daemon
       |m(mdf|ajordomo)
       |n?uucp
       |LIST(SERV|proc)
       |NETSERV
       |o(wner|ps)
       |r(e(quest|sponse)|oot)
       |b(ounce|bs\.smtp)
       |echo
       |mirror
       |s(erv(ices?|er)|mtp(error)?|ystem)
       |A(dmin(istrator)?|MMGR|utoanswer)
       )
       (([^).!:a-z0-9][-_a-z0-9]*)?
        [%@> ][^<)]*(\(.*\).*)?
       )?
       $([^>]|$)
      )
    )

The tail of the expression, piece by piece:

    [^).!:a-z0-9]   End of e-mail address token
    [-_a-z0-9]*     Another alpha token
    )?              ... or maybe not;
    [%@> ]          Address separator -- either address@... or <address>
                    or a bare address with whitespace around it
    [^<)]*          Skip as long as we don't run into another bracketed
                    address or end of comment (presumably to prevent this
                    from matching inside parenthesized comments in the
                    first place)
    (\(.*\).*)?     Skip optional parenthesized comments and anything
                    after them if found
    )?              ... or maybe not; maybe we just see an ...
    $               ... end of line instead
    ([^>]|$)        Uh, I should know what this is supposed to do, but I
                    can't quite remember what it's for. I think it had
                    something to do with continued header lines ... Anyone?
[david 1998-04-29] Apparently not, but it does match on the UNIX From_ line, which usually contains the same address as the Return-Path: header.
Does anyone have an idea how I can use this macro but tell it to ignore the Return-Path line in the header?
There's probably some way within procmail without the extra fork of formail, but this is easy to think of and easy to write:
    EX_OK           0
    EX__BASE        64

    EX_USAGE        64      command line usage error
    EX_DATAERR      65      data format error
    EX_NOINPUT      66      cannot open input
    EX_NOUSER       67      addressee unknown
    EX_NOHOST       68      host name unknown
    EX_UNAVAILABLE  69      service unavailable
    EX_SOFTWARE     70      internal software error
    EX_OSERR        71      system error (e.g., can't fork)
    EX_OSFILE       72      critical OS file missing
    EX_CANTCREAT    73      can't create (user) output file
    EX_IOERR        74      input/output error
    EX_TEMPFAIL     75      temp failure; user is invited to retry
    EX_PROTOCOL     76      remote error in protocol
    EX_NOPERM       77      permission denied
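A minimal sketch of how one of these codes is handed back to the MTA, assuming a hypothetical sender address. Setting EXITCODE makes procmail exit with that value instead of reporting success, and the /dev/null delivery ends processing for the message. Since the MTA has normally accepted the message by the time procmail runs, the practical effect is a bounce to the apparent sender:

```procmail
:0
* ^From:.*unwanted@example\.com
{
    EXITCODE = 67       # EX_NOUSER, "addressee unknown"

    :0
    /dev/null
}
```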
I thought that by using the EXITCODE, I would be assured that the mail would be rejected but in fact Sendmail 8.8.7 attempts to deliver the "user unknown" to netcom.com, which is obviously wrong?
[sean] Sendmail accepts the message, then passes it on to Procmail, either as the local delivery agent, or via a .forward file (depending on your system's configuration). Procmail says "gee, gotta lie about not being here" and rejects the message, which is then sent back into the spool and delivered according to who it appeared to come from. Had SENDMAIL determined the user didn't exist (password file / aliases / virtusertable.txt), then it would have rejected the message right when the remote was doing SMTP RCPT. But the user WAS valid, and so it accepted it.

Another scenario is when you have a mail secondary, and your primary (where the user account and procmail are) is down. Some system goes to deliver mail to you, and resolves to your secondary, which simply holds mail for your primary; it hasn't a clue which user is valid and which isn't. Well, the (E)HELO (the system sending your primary the message) takes place during the SMTP session, and the message is coming from your secondary, not from the original sender. At THAT point, if the user didn't exist, I believe sendmail would be issuing an unknown user error to the secondary, which in turn should mail that message back to who it thinks is the sender (I can't check my Bat book from where I'm at; any sendmail pros are welcome to elaborate).
Is there any way at all to get around this (force the rejection at delivery time)? Better yet, is there some sort of check to make sure that the Received domain reasonably matches the From: domain?
You'd need to have a ruleset in your SMTP Daemon (generally Sendmail) to check domains (which WILL fail on many valid messages, BTW) and reject it WHILE the SMTP delivery session (actually, the negotiation) is in progress. By the time Procmail has the message, you've completely accepted the message, and any rejection you might hope to do is bouncing the mail to the apparent sender. Such is the problem with forged mail.

I wouldn't suggest this tactic for fighting spam anyway: so much of it is forged, and any bounce you send out simply uses up system resources on your machine and those on the system that was spoofed. Spammers don't REMOVE addresses from their lists (they want the lists to look as big as possible when they go to sell them to someone else). Some have even taken to GENERATING addresses at domains and sending messages to them with the assumption that somebody will probably have an account by that name ("bill@ joe@ dave@ ..."). Use procmail to trashbin (or otherwise file) all the junk and then manually take action on those which get through.
first-class
30 60 100 100
[dan] You should use bulk when you distribute files via File Server. The value in the Precedence: header says absolutely NOTHING about the contents of the message itself, it merely suggests a priority level to the mail system. From p. 668 of the O'Reilly sendmail book, bulk typically has a value of -200 while junk -100; thus a message with junk will get higher priority than that of bulk (although this can be changed in the sendmail.cf file). Other than on heavily loaded machines, this value won't matter anyway, since all mail will be quickly processed.

[Stephen] ...Mail sent by a person is usually considered to be more important than autoreplies generated by some daemon. One way to express the lower priority of autoreplies is by adding a "Precedence: junk" field. This allows mail transport agents to make educated decisions about which mail to forward first (in case the mailqueue gets clogged).

Another point is other autoreply services, like vacation. They try to make an effort not to accidentally reply to a message generated by another daemon (e.g. yours). One way they detect this is by looking at the Precedence field. If it contains junk, they know, this is not something we should respond to.
    :0 h
    * condition
    * !^X-Loop: foo@site\.com
    | ($FORMAIL -rA "X-Loop: foo@site.com" ) | sendmail -oi -t
[david] That's not a problem, because formail -r will not generate any Cc: or Bcc: headers unless you tell formail to add them. The only line where sendmail -t will look for recipients will be the To: line.
You need a program which will fetch your e-mail from the IMAP server and then feed it to procmail. One such program that can do this is fetchmail. Check out http://packages.debian.org/fetchmail The bad news is that once you do this, you probably won't be able to use an IMAP client to read your email anymore. But that might be good news if you prefer an MUA that reads mbox files but doesn't grok IMAP.
[era] The following should tell you the name of the machine which processes mail for the machine you're asking about. You can then try to log in to that machine if you have shell access there, which is something you need to have in order to compile Procmail on it.
If you don't have nslookup (doh) or don't understand what it says, try adding this to your .forward
"|uname -a >/full/path/to/home/.uname.out"
i.e. this should be there in addition to what else you do. Otherwise this will lose your mail thoroughly, since it reads the mail but doesn't save it anywhere. You might want to save a copy of all incoming mail to a safety mailbox, too, just in case. Like so:
    /full/path/to/home/safetymailbox
    "|uname -a >/full/path/to/home/.uname.out"
    "|IFS=' '&& exec /usr/local/bin/procmail -Yf- || exit 75"
If you try this, it is very important that the file safetymailbox exists and is writable. (See man 5 forward if you have it; I don't seem to have this manual page on systems with newish versions of sendmail. Is that correct?) Try the uname command (and/or read the manual) to see what you should expect to find in the file .uname.out
[philip] If incoming mail is supposed to be stored in /usr/mail/loginnamehere, then you should not define MAILSPOOLHOME at all, but rather define MAILSPOOLDIR to "/usr/mail/" and leave MAILSPOOLHASH as 0. Defining MAILSPOOLHOME causes mail to be delivered inside each user's home directory, which does not appear to be what you want. MAILSPOOLHASH causes additional levels of hierarchy to be created in the spool directory, thus avoiding the 'fat slow directory' problem.
Emacs refers to a programming platform (it's not only a text editor, or a programming editor; it does almost everything you tell it to do except make your coffee) which can be found on almost any Unix platform. Nowadays Emacs is available for the PC platform too. There are two flavors to choose from: Emacs, maintained by the FSF (Free Software Foundation), and XEmacs, sometimes called "Emacs the next generation", because it has a better graphical user interface (GUI) and an internally advanced OO design (it can highlight on a tty, whereas Emacs can't). XEmacs is maintained by a group of programming wizards. Emacs add-in packages are written in Lisp and the Lisp file extension is .el. Inside each package one finds instructions on how to use it and how to install it into Emacs.
Case2 : procmail and Gnus. The mail is always delivered to procmail first. Procmail is free to put the mail anywhere or just let it drop to the user's default inbox, usually pointed by environment variable $MAIL.
    mail -> procmail --> Post processing with Gnus
            [the ~/Mail/spool]
            --> split1.mbx
            --> split2.mbx
            [The default procmail rule drops to inbox]
            --> $MAIL
You can let Gnus process the messages further, like moving messages from one inbox to another.

Summary

If you use procmail, the incoming messages are immediately categorized. The incoming mail is put in the folder of your choice. The mailboxes are there waiting for you all the time. You can use less or more to view them in a hurry.

If you don't use procmail and let Gnus do all the splitting, you always see one huge inbox, $MAIL. It will not be split until you fire up Emacs and Gnus. If you're in a hurry, you may not have time to start Emacs and Gnus before reading the important messages. Your only option is to read all messages in $MAIL and try to find the ones that concern, e.g., your work. So, let procmail drop messages to their inboxes and let Gnus possibly "fine process" these inboxes.
    # The file name must be list.xxxxx.spool in order for `nnml'
    # to work in Gnus. Define procmail mailing list

    PROCMAIL_SPOOL = $SPOOL/list.procmail.spool

    # GNUS must have unique message headers, generate one if it
    # isn't there. By Joe Hildebrand <[email protected]>

    :0 fhw
    | $FORMAIL -a Message-Id: -a "Subject: (None)"

    # detect mailing lists and store messages to spool directory
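A minimal sketch of such a list-detecting recipe, assuming a hypothetical list address; only PROCMAIL_SPOOL comes from the setup above. Any header that reliably identifies the list would serve as the condition:

```procmail
# Hypothetical list address; adjust the condition to your list.
# The local lockfile keeps concurrent deliveries from interleaving.
:0 :
* ^TO_procmail@lists\.example\.org
$PROCMAIL_SPOOL
```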
Copy the Lisp code below to your ~/.gnus. Start Gnus with M-x gnus-no-server (M-x means ESC followed by x). You will see the Group buffer appear. Make the new group with G m list.procmail RET nnml RET. You can read the group as usual and query new mail with the g command.
    (setq gnus-secondary-select-methods '((nnml ""))
          ;; See also nnmail-procmail-suffix which is .spool by
          ;; default
          ;;
          nnmail-use-procmail        t
          nnmail-spool-file          'procmail
          nnmail-procmail-directory  "~/Mail/spool/"
          nnmail-delete-incoming     t)
And then I have procmail always deliver to ~/Mail/spool/. If you add more inboxes, create them inside gnus Group buffer with G m.
    (setq gnus-permanently-visible-groups "^nnml\\|^nnfolder")
If you made a mistake and wrote list.procmaill with an extra l accidentally in the group name, use G r to rename the group. Raise or lower the priority of your procmail mail groups with S l. Values 1, 2 or 3 are good. Consider reserving 1 for your primary mail and 2 and 3 for mailing lists. When you exit a group and have read some articles, they won't show up next time you go there. But if you give a prefix argument before entering the group with SPC, Gnus will list all read articles. You give the command like C-u SPC, where C-u is the prefix argument.

Settings

You want Gnus to tell you everything it does
;; 0..10
You expire articles (get permanently rid of them) with the 'E' command in the Summary buffer. The default expiry time is 7 days. You can define the expiry time in days with
(setq nnmail-expiry-wait 7)
If you read mailing lists, you want automatic expiry when you have read the article. Use the following to set up groups that use this automatic expiration.
B e in the Summary buffer expires the current expirable articles. If you want to kill an article, that is, permanently remove it from disk, use B delete. If you want to mark an article as persistent (never expires), use *.

You don't want these mail groups cached because mail is already in "cache" format. The cache is needed only when you read newsgroups and want to store messages locally.
    ~/Mail/spool/list.procmail.spool
    ~/Mail/spool/mail.procmail.spool

Let Gnus read the old file as usual. Press g to read new mail into list.procmail. list.procmail.spool will now be empty and merged into the nnml backend file nnml:list.procmail. Make a new group with G m nnmail mail.procmail in the Group buffer. Go to the old list.procmail group and select all articles with M P a. Move the messages with B m to mail.procmail. You will see G marks appear at the beginning of the moved articles. Exit the Summary buffer and hit g to see that the messages were transferred to your new mail.procmail. Kill the old group list.procmail with G DEL. One more thing: remove that empty spool file. It is no longer used for anything.
% rm ~/Mail/spool/list.procmail.spool
I'm also a bit confused with the proposed solution of having procmail filter incoming mail in a nnmail-procmail-directory instead.
You have Procmail stuff mail in spool files, pre-sorted and filtered. Gnus then picks these up and stuffs the messages in the appropriate groups. Gnus uses movemail to actually move the mail out of the spool, and movemail uses locking that Procmail understands, so there is no danger of mail loss.
Why are nnfolder-directory and nnmail-procmail-directory two different directories if nnmail-procmail-directory will contain the mail boxes that procmail appends to and nnfolder-directory is supposed to be "All the nnfolder mail boxes will be stored under this directory"?
Because Procmail should stuff its mail in different folders, not in the ones that your regular mail is stored in.
Is the idea to have Gnus use nnmail-procmail-directory as a temporary directory that it draws from to process and then deposit nnfolder mailboxes in the nnfolder-directory?
    (setq nnmail-use-procmail t)
    (setq nnfolder-directory "~/gMail/")
    (setq nnmail-spool-file 'procmail)
    (setq nnmail-procmail-directory "~/incoming/lists/")
    (setq gnus-secondary-select-methods '((nnfolder "")))
    (setq nnmail-procmail-suffix "")
Procmail is adding incoming mail to ~/incoming/lists/listname. The nnfolder groups I subscribed to are named "nnfolder:lists.listname" Gnus does create the ~/gMail/lists directory with a zero length file in this directory for each list, but doesn't move any mail over and so it thinks I have "No more unread newsgroups".
(nnmail-get-spool-files)
After much experimentation, I finally got movemail to work. I changed nnfolder-directory to "~/gMail/lists/" and Gnus now moves mail from "~/incoming/lists/" to corresponding groups in "~/gMail/". My problem seems to be solved, but still these workings seem counter-intuitive to me. By what the manual has to say about nnfolder-directory I would think Gnus should build the nnfolder groups in "~/gMail/lists/" instead given my definitions. I think nnmail expects the spool files to be called "~/incoming/lists.whatever", not "~/incoming/lists/whatever".
I thought you said the groups were called "lists.whatever"? So the spool files were called ~/incoming/lists/lists.whatever.spool, then?
[1998-03-25 gnus.emacs.gnus, Marty Fouts <fouts A T null.net>] The point of the argument is: The RFCs don't demand what those who would quote them to suppress munging claim they do. In particular, RFC 1036 is advisory, an attempt to describe how netnews works with NNTP. In the case of header munging, RFC 1036 does not describe the way the software works in the field. There is no reason to cite an advisory RFC that in many ways is incorrect to support an untenable position.
Note: Marty is an IETF USEFOR member and has a good understanding of how the RFCs should be interpreted. See gnu.emacs.gnus 1999-02-08 and the thread /Re: "Sender" field/.
[1997-11-05 gnus.emacs.gnus, Marty Fouts] No RFC forces the address of the poster to be a reachable address (indeed, Sender: is sometimes user@host without the domain part); it only requires such addresses to be syntactically correct. The RFCs do not require anything. The RFCs related to Usenet are advisory. RFCs describe various things and define a small number of standard protocols; netnews is not an internet standard protocol.
Not all RFCs are standards. RFC 1036 specifically states that it is not an internet standard. The wording of RFC 1036 and 822 WRT the RFC 1036 header is ambiguous. RFC 822 specifically describes the format of a mail message. It does not describe the complete format of an electronic mail address. Nowhere in 1036 is there language requiring that the address be deliverable to. Further, 822 provides language that would allow for a valid but not deliverable address to be acceptable. [822 doesn't describe addresses, it describes mailboxes, which are something similar but not identical.] The bottom line WRT RFCs that are informational is that when there is an ambiguity, or a difference between the RFC and the implementation, the implementation (which is what the RFC was trying to describe in the first place) takes precedence. As much as y'all want it to be otherwise, the implementation of netnews (i.e. INND, NNTP) doesn't care about whether or not an address can be replied to. It is rumoured that some news posting software checks the validity of an address. Such software is in a tiny minority.
[counter argument 1998-03-25 gnu.emacs.gnus, Jan Vroonhof <vroonhof A T frege.math.ethz.ch>] Now although INND and friends are important parts of the Usenet software bundle the news READERS are even more important. Now I'll bet 99% readers, like f.i. Gnus, assume the address in the header is the address to be replied to when the user requests to go into a private discussion with the author (i.e. reply instead of followup).
[marty] netnews is a public forum. mail is a private communication medium. Posting in a public forum does not require that I give you access to my private address, just as speaking at a public meeting does not require that I give you my unlisted phone number. One thing is for certain: putting the burden on anyone wishing to send mail to you, by requiring them to decipher the address. Someone may never "reply by mail" to persons using those phony addresses. Anyone who wishes to send a personal mail cannot just hit 'reply'. People who do this accept this, which is why they will watch the newsgroups for followups regularly. If someone eagerly wants to get personal, he can spend the extra minute to decipher the correct address for the person. --Marty
[counter argument, vroonhof] However if you don't want to give me your phone number, why give me a false one? If people with this desire at least put only their name and had no "<address>" part then one could have the news reader say "Reply impossible, no address given".
[Counter argument, unknown] When I was using Pegasus Mail (Win95), it took me about 10 minutes to set up filters that removed over 75% of the spam I received. 10 minutes is too great a burden to you? MY, what a busy person you are.
[timothy] What about the accounts from which I do not control (network at work) where I do not have say over what software is installed? I can say to the sysadmin ``Hey I'd like Pegasus mail installed'' and he nods and mumbles something. He's got 2 years worth of backlog from there not being a real sysadmin around
[Counter argument, unknown] Furthermore, there are a number of procmail recipes available on the net, that can be used with minor adjustments to filter your mail. No heavy-duty Unix skills are required. Just the initiative to take responsibility for your own problems.
I know procmail very well, and spammers are still getting through. You know why? They refuse to follow all the conventions we depend on. And they spam mailing lists, so I have to filter for that as well. I have spent untold hours trying to develop better and better filters with lower numbers of mishits. Nothing works as well as not giving more spammers my address.
...You simply prefer to put the problem off on somebody else, rather than take the time to deal with it yourself. Well, that kind of laziness does seem to predominate in the "world of the Internet" these days.
I have spent the time, learning from what others have done and seeking to improve them. You are certain you are right and refuse to think about it anymore.... and that kind of laziness is all over the Internet. The only one it wrongly inconveniences are those who need to mail me and have lost my mail address. If you want to followup a Usenet post, do it in Usenet. I'll be back here for followups. I get enough mail, and don't need mail for Usenet threads. If you would like me to use a real address, please set me up an account with procmail where I can get all my Usenet related messages sent. --Timothy
...I am well aware that it is bad behavior, as I am well aware that it breaks standards. However, I'm also well aware of the fact that I do not need to have a mail-box filled with spam every time I look at it. Things have quieted down considerably since I started altering my From: line. There's still the occasional one that gets me, though. It's not really such a big deal right now, but after following the netabuse newsgroups for a while, it has become apparent to me that spammers are trying new tactics to grab mail addresses (msg id's, sender: lines, etc...). Since I have to download most of my mail from a POP3 account, it takes time that I don't have to wait for all that spam to download. If breaking my headers means getting a few moments peace and freedom from spam, then so be it.
[M. Maxwell <drwho A T No-Spam-see.sig> 1998-03-26 gnus.emacs.gnus] ...Believe me, I don't like having to do this header munging at all. But it saves me considerable aggravation. I also don't have to download my mail from a POP3 server (my ISP has a shell account), but I prefer to read mail offline simply because I get so much of it with all those mail lists. And since that's the case, I end up downloading plenty of junk along with the legit mail, after which, my local procmail puts it where it belongs. In other words, not in my inbox. And so I'll do what I have to to foil the spammers (until we get some sort of legislation passed on junk mail). And those that do get past the fouled headers are dealt with accordingly.
[elijah] Most any 7bit character. For all practical purposes whitespace (space, tab, newline) are really inadvisable. This post is from a valid address. I also have ones with control characters eg <@qz.to> (may not show up right in your newsreader). See RFC822 for the full rules on generating an address, but the quick and dirty thing is any of the "specials" must be quoted to be used.
If you don't believe me, there are mail toys to prove this. Best one I know of right now is Tom Phoenix's "fred&barney"@example.com address. You can replace the "&" with just about any string I believe. I've tried it with stuff like "fred($)barney"@example.com and it seems pretty stable.
>>>>> In article <[email protected]>,
>>>>> Rich Pieri <[email protected]> enscribed:
> -----BEGIN PGP SIGNED MESSAGE-----
> Marty Fouts writes:
>> Sort of: system-name is not a hook into gethostbyname. The
>> /variable/ system-name is set by a builtin defvar to
>> gethostbyname. system-name returns the value of the /variable/
>> system-name, and the emacs lisp manual advises setting it if it
>> is not correct.
> It still uses gethostbyname() to set the initial value. > gethostbyname() is supposed to return an fqdn on a networked
> host.
So? That the initial value is an FQDN is no indication that the value returned at any time thereafter will be. This is why emacs doesn't use system-name to create mail addresses, but has a separate function. If emacs itself doesn't rely on system-name to generate any mail addresses, why should gnus?
user@fqdn is the agent responsible for submission of a message to the network. user@fqdn is the RFC sender of the message. user@fqdn therefore must be made to be a valid mailbox.
>> This is just flat out wrong. There is no such requirement in
>> any RFC or implied by any combination of RFCs.
> Premise: Gnus is used interactively. Premise: "user"
> (user-login-name) is the login name of the person using Gnus.
And that's where you fail first. There is no requirement anywhere in any RFC or combination of RFCs that a login name even exist. Although your premise is true, it is irrelevant to your conclusion, as explained below.
> Premise: "fqdn" (system-name, self-referential gethostbyname) is > the canonical network host name of the machine "user" is using at > the time.
And that's where you fail second. There is no requirement anywhere in any RFC or combination of RFCs that the machine "user" is using be exposed as a part of a mailbox. I am /allowed/ to do that, and if I do that I am required to support that mailbox as valid. I am not /required/ to do that. I've already cited, and will repeat, that a TIP is a good example of such a machine. So is a POP3 client. You are missing some more premises, most notably that user@fqdn is the sender of the message in the sense of any RFC or combination of RFCs. Most importantly, you are missing some steps in your logic. You have not established that the /sender/ field's mailbox has to be the one you would construct from user-login-name@system-name, even on a system where such a combination formed a valid mailbox. You have not established that user-login-name@system-name be required to form a valid mailbox, even if the system has the concept of a login-name and both user-login-name and system-name return what you expect them to. Nor will you be able to, because there are no such requirements. There is /no/ requirement /anywhere/ in any combination of RFCs that it be possible to construct a mailbox from the combination of a "login-name" of any sort and an FQDN. There is /no/ requirement /anywhere/ in any combination of RFCs that a "login-name" even exist. There is /no/ definition /anywhere/ in any combination of RFCs for the concept of a "login-name". To put this as simply as possible:
You are incorrect to assert that there is any requirement that a system support the mapping from (login-name,FQDN) to a mailbox of the form login-name@FQDN.
Once you understand that this assertion is incorrect, it should be easy to see that all assertions
Note that the comment "is inexcusable" is my opinion. The draft, contrary to your apparent understanding, merely gives -guidelines- for how to use mime headers. If you, or anyone else, feels that the draft replacement for RFC 1036 needs to be worded differently, you are welcome to join the task force and attempt to persuade the members of this. However, a warning is in order: the process has been ongoing for several years, deadlines approach, and this particular issue has been argued in a great deal of detail.
The Internet Message: Closing the Book With Electronic Mail (out of print), by Marshall T. Rose. Prentice Hall (Sd). ISBN: 0130929417
Managing Mailing Lists: Majordomo, LISTSERV, Listproc, and SmartList, by Alan Schwartz. O'Reilly & Assoc., 1st Edition, March 1998. ISBN: 1-56592-259-X. 298 pages, $29.95. http://oreilly.com/catalog/mailing/
sendmail, 2nd Edition, by Bryan Costales & Eric Allman. O'Reilly & Assoc., 2nd Edition, January 1997. ISBN: 1-56592-222-0. 1050 pages, $39.95. <http://oreilly.com/>
sender -> MUA -> MTA ->..-> MTA -> MDA ->{maildrop}-> MUA -> reader
      [1]    [2]      [3]      [4]         [5]           [6]
Headers typically provided by "template" by the MUA to the sender, usually during stage [1] (when composing e-mail):
From:        # who I am
To:          # the target
Cc:          # people to keep informed, but need not respond
Bcc:         # secret admirers
Subject:     # what's the mail about
Reply-To:    # highest priority return address
Priority:
Precedence:
Resent-To:
Resent-Cc:
X-BlahBlah:
When the sender is done composing, and says "send it" to his/her mailer, some additional headers may get inserted by the MUA at this stage [2]:
Date:
Resent-Date:                 # if being redirected
From:                        # If not already present
Sender:                      # if a From: is already present
X-Mailer:                    # what MUA composed this message
Mime-Version:
Content-Type:                # what kind of stuff is in here
Content-Transfer-Encoding:
Content-Length:
When the MTA receives the e-mail from the MUA at stage [3], it may insert additional headers showing the origination of the e-mail:
# if local e-mail, automatic or by -f option
# If not already present
# unique ID for the e-mail; the first MTA creates this
# shows inter-system e-mail tracking info
# shows how to get back to the sender
As each MTA hands off the e-mail, additional headers may get added, all as part of the MTA to MTA handoff in stage [3]:
Received:
As the final MTA hands the e-mail to a delivery agent (MDA), in stage [4], there are still some more header insertions which may occur:
Apparently-To: From
Some sites insert special rewrite rules and filtering to support virtual domains, and these header changes will occur at stage [5], just before the incoming mail is dropped. Generally, though, no new headers are added, except possibly one to avoid loops:
Finally, at stage [6] when the reader views his/her e-mail, most MUAs will apply a filter to the stored mail causing selected headers to be omitted from the display. In a sense, then, this filtering "removes" the headers from the user's view (although no headers are actually removed by the MUA). The headers typically omitted are those inserted by the MTAs, and those having to do with the transport process and less with the contents.
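The display filtering at stage [6] can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular MUA does it: the function name displayed_headers and the contents of HIDDEN_BY_DEFAULT are assumptions made for the example.

```python
from email import message_from_string

# Hypothetical set of transport-related fields an MUA might hide by default.
HIDDEN_BY_DEFAULT = {
    "received", "return-path", "message-id", "x-mailer", "mime-version",
    "content-type", "content-transfer-encoding", "content-length",
}


def displayed_headers(raw_message, hidden=HIDDEN_BY_DEFAULT):
    """Return the (name, value) pairs a reader would see.

    The stored message is not modified; the headers are only left out
    of the returned view.
    """
    msg = message_from_string(raw_message)
    return [(name, value) for name, value in msg.items()
            if name.lower() not in hidden]
```

The stored mail keeps every header; only the view handed to the reader is shortened, which matches the remark that no headers are actually removed.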
unless one is already present and only then if it seems valid. The second From: is generated by the MUA (your personal mailer), either by configuration, or by the user. The rewrite rules in sendmail and most filtering programs concern themselves with the From:, To:, Cc:, Reply-To: headers. I'll assume that if "From smmi" is not "correct", then you must be trying to hide the delivery process, and implementing something of a virtual domain. In general, it is a bad idea to "correct" the automatic mail headers inserted by the MTAs. This is a different matter than changing addresses to show virtual domains. The From_ header is part of the history of the message, showing how the mail was originated. Similarly, the "Received:" headers should not be messed with. Changing the history of an e-mail message will make it very difficult to diagnose e-mail delivery errors. That being said, and, since I also believe in the freedom of choice, I will now supply you with "enough rope to hang yourself" :^) There are two places where you can have the From_ header corrected: just before it gets dropped into the mailbox (for incoming e-mail), or as it gets submitted to the MTA (for outgoing e-mail). Changing the From_ before it gets dropped is easy. Just use a recipe like this:
FROM DATE
$FROM $DATE"
The From_ header is created automatically by the MTA (sendmail) when it receives a piece of mail. If the mail is sent through sendmail without using the '-f' option, then sendmail sets the default From_ to that of the current user. If you are not root, or a "trusted user" (see the sendmail man page), then sendmail will ignore the From_ header and either remove it altogether or replace it. Even if you are root, sendmail will replace the From_, if the e-mail is being received locally (as opposed to from the network). If you wish to change the From_, you must invoke sendmail, as root or a "trusted user", and use the "-f" option. EG: to set the From_ to match the From: header, use the following recipe, as root:
Please read the man page on sendmail, noting the use of '-f'.
different ways: Many MUAs format outgoing mail without the Bcc: headers, so that the same message header can be sent to all recipients. The Bcc: recipients receive an extra line in the message body, indicating the nature of the mail. The text of the message varies from MUA to MUA; The Rand Mailer, MH, for example inserts the lines around the original text:
Some MUAs will send the message, separately, to each Bcc: recipient, with the recipient address on the Bcc: header. Each Bcc recipient thus knows that they received the message by way of the Bcc, but does not know who else was a Bcc recipient. All Bcc recipients are private, even to other Bcc recipients. (It would be nice if all MUAs behaved this way). A few MUAs deliver the message without the Bcc:, but also without any special indication; you must guess that it was a Bcc. The original mail standard RFC822 says this about Bcc:
4.5.3. BCC / RESENT-BCC This field contains the identity of additional recipients of the message. The contents of this field are not included in copies of the message sent to the primary and secondary recipients. Some systems may choose to include the text of the "Bcc" field only in the author(s)'s copy, while others may also include it in the text sent to all those indicated in the "Bcc" list.
So, procmail would handle Bcc's correctly if the sender's MUA included the Bcc in the header in the first place. But, since procmail is most typically used on incoming mail, it will never have a chance to deal with Bcc: headers.
[alan] Procmail mailing list 1996-11 Only the MTA knows the destination address because it is part of the "envelope", the information which is passed on the "RCPT To: some-user" SMTP line. This information is how the MTA knows to deliver the mail, and not by the contents of the headers. Of course, when invoked properly, many MTAs can read the headers to obtain the addresses needed on subsequent "RCPT" commands in the ensuing SMTP connections. In fact, the Bcc: header can be read along with the rest of the destination headers to obtain the recipient addresses, but the Bcc:
will also be removed from the headers. The address by which an MTA receives a mail is known as the "envelope address", which may be distinct from any headers in the message itself, or, the same as one of them, for directly addressed mail. With mailing lists, for example, the addressee will never see his/her own address, but will see the mailing list in the To: or Cc: header fields. Even here, when mail is addressed to more than one mailing list, there is a lack of standard for determining the address by which a message is received. There are lots of conventions followed, and heuristics, but no clearly defined standard to indicate the cause of delivery. You may be able to configure your MTA to pass along the envelope in a new header, or pass it by argument to the local delivery program (which can be procmail). It is then up to the local delivery program to use (or not) the envelope address information. If you wish to understand the limits of your mail system, you should read RFC822 (mail formatting standards) and RFC821, which describes the original language of SMTP. There are several extensions in progress, but the basic commands of "MAIL", "RCPT", and "DATA" should suffice.
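The split between envelope and headers can be illustrated with a small Python sketch. It assumes a submitting program that derives the envelope recipients from To:, Cc: and Bcc: and then drops the Bcc: header before transmission; the function name split_envelope is hypothetical and real MTAs differ in detail.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def split_envelope(msg):
    """Return (envelope_recipients, msg) with the Bcc: header removed.

    The envelope list is what would be passed on the SMTP "RCPT" lines;
    the headers that travel with the message no longer mention Bcc:.
    """
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(str(value) for value in msg.get_all(name, []))
    recipients = [addr for _name, addr in getaddresses(fields) if addr]
    del msg["Bcc"]  # harmless when no Bcc: header is present
    return recipients, msg
```

After this step a Bcc recipient's address exists only in the envelope, which is why a filter that reads headers at delivery time cannot see it.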
[1998-05-31 FAQ-L Simon Lyall <simon A T darkmere.gen.nz>] Both forms are legit but the way news and standards documents are going is for the first form to be discouraged. This effectively means that software should accept both forms but only generate the second (this is when the article is first created, not by someone half way around the world). The problem with the first form is that stuff in brackets is actually a "comment" rather than the name of the poster. This means that there is no way using the first form to actually say what your name is; it is just that most people say their name in the comment field. They could just as easily say something else. This means that software that displays the comment field as the name is just taking a guess. The 2nd format puts the name of the poster in a definite place that software can work with and allows you to leave the use of brackets for comments. The current internet draft on this, which will most likely replace RFC 822 on this point, is at: http://tools.ietf.org/html/draft-ietf-drums-msg-fmt-04 The bit is section 3.4 which says:
Note: Some legacy implementations used the simple form where the addr-spec appears without the angle brackets, but included the name of the recipient in parentheses as a comment following the addr-spec. Since the meaning of the information in a comment is unspecified, implementations SHOULD use the full name-addr form of the mailbox if a name of the recipient is being used instead of the legacy form. Also, because some legacy implementations interpret the comment, comments SHOULD NOT generally be used in address fields to avoid confusion.
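As an illustration of "accept both forms, generate only the name-addr form", here is a minimal Python sketch using the standard email.utils helpers. The function name to_name_addr and the example addresses are made up for the example; note that parseaddr, like the legacy software described above, is guessing when it treats a parenthesised comment as the name.

```python
from email.utils import formataddr, parseaddr


def to_name_addr(mailbox):
    """Accept either mailbox form and return the name-addr form."""
    # parseaddr treats a trailing comment as the display name.
    name, addr = parseaddr(mailbox)
    # formataddr returns the bare address when there is no name.
    return formataddr((name, addr))
```

For example, to_name_addr("joe@example.com (Joe Doe)") and to_name_addr("Joe Doe <joe@example.com>") both give the same name-addr string.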
Filtering out incoming messages (pre-retrieval via POP3) seems 'fairly' safe, though some legitimate mail may include this header. Using it as a heavy weight (but not enough on its own) in a procmail scoring recipe that detects spam appears to be reasonable.
[philip] If a message comes into your mailbox that has the X-UIDL: header, and doesn't have your address in the header, then I would have strong doubts about its legitimacy.
[ed] comments: E-mails with X-UIDL: headers are almost definitely spam unless they've been Resent-To: me by someone. Also, valid X-UIDL: headers have 32 hexadecimal digits exactly.
[David] The advisability of trashing all mail with X-UIDL: headers has been discussed on procmail list recently; apparently it's possible for one to appear in legitimate mail.
[Elijah] Yup. Very true. The most likely case would probably be for certain types of forwarded mail, including some moderated mailing lists. Fluffy's mod.* list had these until I pointed out the widespread file-to-/dev/null problem to Fluffy.
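The "heavy weight, but not enough on its own" idea can be sketched outside procmail as a small Python scoring function. The weights (60, 20, 20), the function name xuidl_score and the idea of comparing the total against a threshold of 100 are assumptions chosen for the illustration; they are not taken from any recipe quoted here.

```python
import re
from email import message_from_string

HEX32 = re.compile(r"[0-9a-fA-F]{32}")


def xuidl_score(raw_message, my_address):
    """Return a spam score contribution based on the X-UIDL: header."""
    msg = message_from_string(raw_message)
    uidl = msg.get("X-UIDL")
    if uidl is None:
        return 0
    score = 60  # hypothetical heavy weight, below an assumed threshold of 100
    if not HEX32.fullmatch(uidl.strip()):
        score += 20  # not exactly 32 hexadecimal digits
    visible = " ".join(
        msg.get_all("To", []) + msg.get_all("Cc", []) + msg.get_all("Resent-To", [])
    )
    if my_address.lower() not in visible.lower():
        score += 20  # my address appears nowhere in the recipient headers
    return score
```

A message resent to you with a well-formed X-UIDL: stays at 60 under these weights, so it would need other evidence before being treated as spam.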
...Are there known problems with "valid" mails with illegal Message IDs? For some strange reason, some people are sending out mail with bad message id's. That wouldn't be much of a problem, except that our MITS department won't even consider fixing the bad-message-id unless it causes a problem somewhere else.
Why would they not consider fixing it? Their e-mail software/gateway is broken, and needs fixing. That's that. Direct them to RFC 822, sec 4.6.1. http://www.ietf.org/rfc/rfc0822.txt?number=822
[Gerald Oskoboiny <gerald A T impressive.net>] Some of the problems with mail containing a bad message id:
Some people (myself included) run filters to automatically delete incoming e-mail if its message-ID has been seen recently, or if it looks bogus.
Some mailing list software (including Smartlist) does not accept e-mail with a message-ID that has been seen recently.
Each message must have a unique message-ID. The best way to ensure that msgids are unique in a global context is to include a fully-qualified domain name after the '@'. In particular, a message-ID like <3.0.5.32.19971208192547.007db100@mailhub > is unacceptable for this reason (even if it didn't have a space at the end.)
Some mail archive software (including some that I wrote) uses message-IDs as a unique identifier for that message in the archive. It may reject messages that appear to be duplicates because they have a message-ID used by other messages. (as my software does.)
[generating message id] [Stainless Steel Rat <ratinox A T peorth.gweep.net> 1998-03-13 in Emacs Gnus mailing list] ...it is strongly recommended that Message-Id strings be generated by the MUA, rather than the MTA. The reason being that a mail hub could be processing several messages at the same time (multiple CPUs), and so could accidentally generate duplicate Message-Id strings. The MTA should generate Message-Id headers only when the MUA is stupid and fails to do it.
[phil 1998-03-19 PM-L] ... let's do a quick work-up of a 'more complete' regexp to match Message-Ids. I'll take syntax lines from rfc822 with regexps that should match them. For ease of presentation, I'm going to work from the bottom up. Note: any brackets that only contain whitespace should really contain a space and a tab.
dq         = '"'                      # (literal) double-quote
bw         = "\\"                     # (literal) backwhack
ws         = "[ ]*"                   # whitespace
atom       = "[-!#-'*+/-9=?A-Z^-~]+"
word       = "($atom|$dq([^$dq\]|$bw.)*$dq)"
local_part = "$word($ws\.$ws$word)*"
domain     = "(\[$ws([^][\]|$bw.)*$ws\]|$atom($ws\.$ws$atom)*)"
...I did start logging ids that match that condition. It matched two messages so far. One message-id was clearly bogus, but here's the other one (mailing list with 1 msg/week, no spam):
Message-Id: <[email protected].>
Is your regexp incomplete wrt trailing dot in the domain part, or is the MUA/MTA broken?
[philip] rfc822 doesn't allow a trailing dot. I just looked at the draft of the new Internet Message Header Standard (the eventual replacement for rfc822) and it doesn't either. Rather, it further restricts the syntax of generated Message-Id headers to disallow comments or folding whitespace from occurring in the message-id itself.
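To see how a regexp assembled from atom, word and domain pieces treats a trailing dot, here is a rough Python translation of the bottom-up approach. It is a sketch for experimenting, not the procmail condition itself: the names LOCAL_PART, DOMAIN, MSGID and is_rfc822_msgid are chosen for this example, and whitespace is written explicitly as space and tab.

```python
import re

WS = r"[ \t]*"                               # optional space/tab
ATOM = r"[-!#-'*+/-9=?A-Z^-~]+"              # same character class as above
QUOTED = r'"(?:[^"\\]|\\.)*"'                # quoted-string
WORD = rf"(?:{ATOM}|{QUOTED})"
LOCAL_PART = rf"{WORD}(?:{WS}\.{WS}{WORD})*"
DOMAIN_LITERAL = rf"\[{WS}(?:[^\]\[\\]|\\.)*{WS}\]"
DOMAIN = rf"(?:{DOMAIN_LITERAL}|{ATOM}(?:{WS}\.{WS}{ATOM})*)"
MSGID = re.compile(rf"<{LOCAL_PART}@{DOMAIN}>")


def is_rfc822_msgid(value):
    """True when the whole string looks like <local-part@domain>."""
    return MSGID.fullmatch(value.strip()) is not None
```

Because every dot in the domain must be followed by another atom, an id such as <abc@example.com.> is rejected, which agrees with the reading that rfc822 does not allow a trailing dot. The check is purely syntactic and does not verify that the domain is fully qualified.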
However: before you go tightening that regexp, note that the standard requires that programs that process messages must accept and parse messages that fit the obsolete syntax. This is because old mail messages can hang around for long periods of time in a way that most other internet data formats don't see. The new requirements are on the generation of new messages, not on old messages.
[1998-10-22 comp.emacs Toby Speight <Toby.Speight A T digitivity.com>] It's more usual (and useful) to refer to news articles by Message-ID (that's what Message-IDs are for!). In this case
<news://[email protected]>
(though for some reason this returns text/plain for something which is clearly a message/rfc822). Either of which is an unambiguous URL, not subject to the same time-dependent changes. URLs were designed exactly to remove the need for such descriptions.
])*$
[Christopher Lindsey <lindsey A T ncsa.uiuc.edu>] No guarantees here. I just tried it out on some test mailboxes (all known to have valid mail), and it matched like mad. As far as I can tell, there's no requirement in RFC 822 for multiple lines in a Received header. [Reto Lichtensteiger <rali A T meitca.com>] The one line header vs. multi-line header is config'ed in sendmail: An older cf file (V8.7):
HReceived: $?sfrom $s $.$?_($?s$|from $.$_) \ $.by $j ($v/$Z)$?r with $r$. id $i$?u for $u$.; $b
HReceived: $?sfrom $s $.$?_($?s$|from $.$_) $.by $j ($v/$Z)$?r with $r$. id $i$?u for $u; $|; $.$b
24.6 Return-Path
...I've created a user (lo_mailer) with a .forward and a procmailrc file to transport incoming mail to the right user. That is working fine, but the Return-Path: Line is set to the local procmail user (lo_mailer) and does not contain the original Return-Path! What can I do to win back the original-line? Please help me :)
[david] Normally when you forward mail you should NOT keep the original return path. If the forwarding destination is invalid or unreachable, mail has to be returned to the forwarder, who can fix the forwarding routine, not to the original sender, who can't do anything about it and probably never even heard of the final destination address. But, though you should change the return path, you do not have to lose the information that the
original return path contained. You can safely put that into the body or into another header line. Try this in lo_mailer's .procmailrc:
# if there's a return path, save it as Old-Return-Path: (note the lower-case i)
:0fwh
* ^Return-Path:.*<.+>
| formail -iReturn-Path:

# if there's no return path but there is a From_, use that
:0Efwh
* ^^From[ ]+\/[^ ]+
| formail -A "Old-Return-Path: $MATCH"

# if there was neither a Return-Path: nor a From_
:0Efwh
| formail -A "Old-Return-Path: unknown"

The first set of brackets in the condition line of the second recipe encloses a space and a tab; the second set encloses caret, space, tab. On the forwarding leg from lo_mailer to the final recipient, the return path will be to lo_mailer, as it should, but if the final recipient wants to know where it originated, he or she can look at the Old-Return-Path header. There is one caution here. If lo_mailer is taking mail to a general response address and distributing it to specific people based on subject or body content or just by rotation to balance the workload, fine. But if you have a personal domain and your ISP is routing all mail for any address in your domain to your account on the ISP, and you're depending on procmail to deliver it to the right address in your own domain by reading To: or Cc: headers, that is the wrong approach. The correct recipient will be on the envelope, which is removed from incoming mail before procmail can see it. Your ISP has to do something that lets you know the true envelope recipient or recipients of a message, and others here know a lot more about that than I do (and way, way more than I could tell you without making mistakes).
[1998-11-11 Gnus-L Karl Kleinpaste] With regard to the standards for Return-Path, RFC822 observes that it should be a route back to the originator, i.e., it should show relay hops; RFC1123 in turn says that failure notifications should be sent back to the originator with the route information deleted, that is, "If the address is an explicit source route, it SHOULD be stripped down to its final hop." ??? Then what's the point of providing the source route in the first place? It seems to me that Return-Path's value has become very limited in an environment where source-routed mail is vastly deprecated, and just plain not supported by many. I know that, when I did serious sendmail work years ago, I shot all source routes on sight.
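For readers who want to trace the three cases of the forwarding recipes without running procmail, the same decision can be sketched in Python. This is an illustration only; the function name keep_old_return_path is hypothetical, and the handling of the From_ line relies on Python's email parser recording it as the message's "unixfrom".

```python
import re
from email import message_from_string


def keep_old_return_path(raw_message):
    """Record the original return path in an Old-Return-Path: header."""
    msg = message_from_string(raw_message)
    return_path = msg.get("Return-Path", "")
    unix_from = msg.get_unixfrom() or ""
    if re.search(r"<.+>", return_path):
        # Case 1: rename the existing Return-Path: field.
        old = return_path.strip()
        del msg["Return-Path"]
    elif len(unix_from.split()) >= 2:
        # Case 2: no usable Return-Path:, take the address from From_.
        old = unix_from.split()[1]
    else:
        # Case 3: neither was available.
        old = "unknown"
    msg["Old-Return-Path"] = old
    return msg
```

As in the recipes, only one of the three branches fires for a given message.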
You could very well substitute the use of user-login-name for the "-f" argument in sendmail with the value user-mail-address; the result should give the effect you need, and not create any interoperability problems: mail will still show a proper way to return to you. That said, this mailing list's requirement of matching Return-Path is indeed pretty peculiar.
24.7 Errors-To
1) Can somebody confirm that Errors-To: is deprecated? 2) Is there an RFC for this?
[1998-09-15 Liviu Daia <daia A T stoilow.imar.ro>] 1) It is a UUCP thing, and it's indeed deprecated. Here's the relevant quote from sendmail's manual. 2) Probably not, since UUCP-related RFCs haven't been updated in a while.
If errors occur anywhere during processing, this header will cause error messages to go to the listed addresses. This is intended for mailing lists. The Errors-To: header is officially deprecated and will go away in a future release.
24.8 X-Subscription-Info
This is a header that is used by some mailing lists: it contains a mail address for un/subscribe, or a URL with said info. Imagine the reduction in bozo messages asking how to unsubscribe from mailing lists. If your mailing list doesn't have it already, make a suggestion to the list's maintainer.
It turns out that we already have most of this in RFC 822: The 'phrase' before an address, or a comment, can identify a person by name and/or role. The responder can use this information to decide whether it's reasonable to send a reply to that person. e.g.
Similarly, the 'phrase' after a group name can identify a group of recipients, which can also be used by the responder. e.g.
(Unfortunately, phrases are so widely botched that they probably aren't usable for this.)

Summary: The way to solve most reply problems is to encourage the responder to actually think about where the message needs to go, and make it easy for him to get the behavior he wants. (It also helps if people use the RFC 822 'phrase' to label their header addresses.) We can build interfaces that help the responder do this without defining any new header fields.

Except for a very few cases, Mail-{Reply,Followup}-To doesn't help. It only provides more opportunities for surprising behavior.

Stainless Steel Rat <ratinox A T peorth.gweep.net> 1998-02-12 commented in Emacs ding mailing list

Not every mail client supports this. Only the badly written ones fail to distinguish between replies and followups. When you get right down to it, this proposed standard has two goals:

1. To make broken MUAs act less brokenly. Well, broken MUAs are not going to implement this standard, anyway; good MUAs do not need it as they already make the distinction between replies and followups.
2. To make broken mailing lists act less brokenly. Administrators of broken mailing lists have decided that they like it that way. They claim that it makes it easier for their lists' subscribers to reply to the list. The subscribers that "need" list-bound Reply-To headers are using broken MUAs. See #1.

This proposed standard will not solve any of the problems it attempts to address. It creates headers that are ignored by bad MUAs and are redundant for good MUAs.

To summarise Keith's statement: From is the originator's mailbox. It is not an 'account'. RFC 822 states that the originator header should contain the correct default reply address.

This is the scenario that the proponents of these headers have proposed, and the flaw the IETF has found with it. Joe is subscribed to a mailing list that he reads from his "private" mail account.
For whatever reason, Joe posts a message to that list from work, so his work mailbox is in the From header. Joe does not want to override where responses go with a Reply-To header, but he wants personal replies to go to his private mail account instead of his work account.
The flaw the IETF found is that Joe is equating his two mailboxes with his private and work accounts. There is no such correspondence as far as RFC 822 is concerned. If Joe is acting in a "private" fashion, the system he is using is irrelevant; his private mailbox belongs in the From header and he should put that mailbox there when he originates the message, regardless of where he physically is when he does so.
Mail-Copies-To: never
Do not automatically include the sender of the message being responded to. There are two canonical examples.
Newsgroups: comp.emacs.xemacs

Email:
From: [email protected]
To: [email protected]
Cc: [email protected]
Mail-Copies-To: never
The second form includes a properly formed RFC822 mail address as the parameter:
Mail-Copies-To: [email protected]
In this case, the sender of the message is specifically requesting that responses to the message not only go to the main forum (either mailing list or Usenet newsgroup), but a duplicate copy should also be sent to [email protected]. There are (again) two canonical examples.
Newsgroups: comp.emacs.xemacs
Cc: [email protected][1]

Email:
From: [email protected]
To: [email protected]
Cc: [email protected]
Mail-Copies-To: [email protected]
There is no requirement that the address in Mail-Copies-To match the From address.

Footnotes:
[1] Or `To: [email protected]'
[2] It is also acceptable to put [email protected] in the To: line.
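The two forms of Mail-Copies-To described above can be sketched as a small decision function. This is only an illustration: the function name `copy_recipients`, the `copy_by_default` parameter, and the example.invalid addresses are hypothetical, and what a mail client does when the header is absent is an assumption here, not something the proposal text above specifies.

```python
from typing import Dict, List


def copy_recipients(headers: Dict[str, str], copy_by_default: bool = True) -> List[str]:
    """Return the addresses that should get a personal copy of a followup.

    `headers` maps header names to values for the message being answered.
    `copy_by_default` is a hypothetical client setting that decides what
    happens when no Mail-Copies-To header is present.
    """
    value = headers.get("Mail-Copies-To", "").strip()
    if value.lower() == "never":
        # First form: do not automatically include the sender.
        return []
    if value:
        # Second form: send the duplicate copy to the given address,
        # which need not match the From address.
        return [value]
    # No header at all: fall back to the client's own habit (assumption).
    if copy_by_default and "From" in headers:
        return [headers["From"]]
    return []


print(copy_recipients({"From": "joe@example.invalid", "Mail-Copies-To": "never"}))
```

The reply to the forum itself (mailing list or newsgroup) is outside this function; it only answers the question of who receives the extra personal copy.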
If the content-length-based format were not otherwise indistinguishable from the ``From '' format, there wouldn't be a problem; the old software would simply fail to work with this new file format, instead of `corrupting' the documents (in quotes, because it's really just a matter of which spec you're following).

Also, mailboxes are by their nature a textual format, but the Content-Length header measures in bytes rather than lines. This means that if you move the file to a system which has a different end-of-line representation (Windows <=> Mac, or Windows <=> Unix) then the content-lengths will suddenly be wrong, because the linebreaks now take two bytes instead of one, or vice versa.

It's impossible for a mail client to look at a file and tell which of the two formats (From_ or Content-Length) it is in; they are programmatically indistinguishable. The presence of a Content-Length header is not enough, because suppose you were on a system which knew nothing at all about that header, and some incoming message just happened to have that header in it. Then that header would end up in your mailbox (because nobody would have known to remove or recalculate it), and it would possibly be incorrect. (Presume further that the header was not just incorrect, but intentionally malicious...)

Stricter parsing of the ``From '' separator line doesn't help either, because there are many, many variations on what goes in that line (since it was never standardized either); and also, some mail readers include that line verbatim when forwarding messages (Sun's MailTool, for example) so a stricter parser wouldn't help that case at all, because message bodies tend to contain valid matches.
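The end-of-line point can be checked with a few lines of arithmetic. The sketch below is illustrative only; the helper name `content_length` and the sample body are made up for the demonstration.

```python
def content_length(body: str, newline: str) -> int:
    """Return the size in bytes of `body` when its line breaks are stored as `newline`."""
    lines = body.split("\n")
    return len(newline.join(lines).encode("ascii"))


# A hypothetical three-line message body, for illustration only.
body = "Hello,\nthis is a test.\nBye.\n"

unix_length = content_length(body, "\n")    # one byte per line break
dos_length = content_length(body, "\r\n")   # two bytes per line break

# The same text has two different byte counts, so a Content-Length
# computed on one system is wrong after line-ending conversion.
print(unix_length, dos_length)
```

The difference between the two counts equals the number of line breaks in the body, which is why a stored Content-Length goes stale as soon as the mailbox file is converted.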
Some mail readers attempt to cope with this by recognizing the case where the Content-Length is not obviously spot-on-target, and then searching forward and backward for the nearest message delimiter; but this is obviously not foolproof, and makes one's parser much more inefficient (requiring arbitrary lookahead and backtracking.) Conventional wisdom is, ``if you believe the Content-Length header, I've got a bridge to sell you.''
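The coping strategy described above might look roughly like the following sketch. It is an assumption about how such a reader could work, not the code of any particular mail program; the function name `guess_body_end`, the delimiter choice, and the sample mailbox are all hypothetical.

```python
DELIMITER = b"\nFrom "  # a newline followed by a "From " separator line


def guess_body_end(mbox: bytes, body_start: int, content_length: int) -> int:
    """Guess the offset at which a message body ends inside `mbox`.

    Content-Length is trusted only when it lands exactly on end-of-file or
    on a "From " separator line. Otherwise the nearest separator before or
    after the claimed position is used, which is a heuristic and can be
    fooled by a body line that happens to start with "From ".
    """
    claimed = body_start + content_length
    if claimed == len(mbox) or mbox.startswith(DELIMITER, claimed):
        return claimed

    claimed = min(claimed, len(mbox))
    forward = mbox.find(DELIMITER, claimed)
    backward = mbox.rfind(DELIMITER, body_start, claimed + len(DELIMITER) - 1)
    candidates = [pos for pos in (forward, backward) if pos != -1]
    if not candidates:
        return len(mbox)  # no separator anywhere: the body runs to end-of-file
    return min(candidates, key=lambda pos: abs(pos - claimed))


# A made-up two-message mailbox; the first body is 18 bytes long.
sample = (
    b"From a@example.invalid Thu Jan  1 00:00:00 1998\n"
    b"Content-Length: 18\n"
    b"\n"
    b"line one\nline two\n"
    b"\n"
    b"From b@example.invalid Thu Jan  1 00:00:01 1998\n"
    b"\n"
    b"second body\n"
)
start = sample.index(b"\n\n") + 2
print(guess_body_end(sample, start, 18) - start)
```

Both searches scan the file, which is the lookahead and backtracking cost mentioned above, and the result is still only a guess.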
A clearly non-existent mail address, one that indicates it is not the real destination, is usually considered good manners:
Or a partially modified address that a human mind can "decode" if direct contact is wanted (but which is somewhat hard for programs, because there are more creative choices than any program can ever expect to see):
A valid looking address

But an address that looks "real" but is bogus is not a polite way to participate in Usenet. This address would give the impression that the person is really there:
License: This material may be distributed only subject to the terms and conditions set forth in GNU General Public License v2 or, at your option, any later version. This file has been automatically generated from plain text file with t2html Document author: Jari Aalto Last updated: 2011-12-09 11:25