1 Tools in the war on mail loops
2 D. J. Bernstein, djb@pobox.com
8 An automailer means any program that receives a mail message and
9 automatically sends one or more mail messages. This term is meant to
10 include not only a mail-based server, such as a mailing list exploder
11 or a vacation program, but also an SMTP server, which receives a
12 message from the network and relays it to a local or remote user.
14 In a network full of automailers, any mistake can cause a mail loop.
15 Since some automailers generate several outputs in response to a
16 single input, a loop can produce an exponential explosion of mail.
18 All the automailers in the qmail package follow a general philosophy
19 designed to prevent mail loops and limit the damage from any loops
20 that do occur. These automailers have been repeatedly observed to
21 fail safe: they stop loops in the face of typical failures by other
22 hosts. This document explains the philosophy and describes the
25 To some extent the philosophy here simply repeats and amplifies
26 standard practice as codified in RFC 974 and RFC 1123. Unfortunately,
27 the standards do not adequately control bounce loops, since they do
28 not recognize that postmasters want to see double bounces; they do
29 not adequately control relaying loops; and they do not prevent
30 cross-host forwarding loops.
32 Terminology: The mail message received by an automailer is called
33 input. The mail messages sent by an automailer are called outputs.
34 For simplicity, this document focuses on the case that the input has
35 just one envelope recipient.
37 REMINDER: This document describes the automailers in the qmail
38 package. Other packages include automailers that do not fit the
39 descriptions given here.
41 Beware that the war on mail loops can never be won: any method of
42 preventing mail loops can be subverted by other hosts. I welcome
43 further development of techniques that work well in practice.
48 The output from an automailer is always further down the following
51 0 hops, <sender> is neither <> nor <#@[]> normal messages
52 1 hop, <sender> is neither <> nor <#@[]>
53 2 hops, <sender> is neither <> nor <#@[]>
55 0 hops, <sender> is <> bounces
57 2 hops, <sender> is <>
59 0 hops, <sender> is <#@[]> double bounces
60 1 hop, <sender> is <#@[]>
61 2 hops, <sender> is <#@[]>
64 Here sender means the envelope sender address. Hops means the number
65 of Received and Delivered-To fields in the header. See sections 3.3
66 and 3.4 for an explanation of <> and <#@[]>.
68 Consequently, no automailer ever generates an entirely new normal
69 message in response to a normal message. If the output is a normal
70 message, it always has more hops than the input.
72 When input and output are both normal messages, both bounces, or both
73 double bounces, the output header is essentially the same as the
74 input header. However, when an automailer moves from a normal message
75 to a bounce, or from a bounce to a double bounce, it generates an
78 An automailer may refuse to operate if the input has too many hops.
79 The definition of too many hops depends on the automailer. This
80 practice is called hop counting. Note that some existing messages
81 legitimately take as many as 20 hops. One automailer uses a limit of
82 100 hops; this will be adequate for all messages in the foreseeable
85 Hop counting is a weapon of last resort. It will, if correctly
86 implemented, prevent all infinite loops; however, even a finite loop
87 can do practically infinite damage, as illustrated in section 4.3.
90 3. Pre-delivery automailers
92 Conceptually: The input is a message that has not yet reached its
93 envelope recipient address. It is fed to a relay, which attempts to
94 deliver the message directly to, or at least closer to, that address;
95 if the relay fails permanently, the message is fed to a bouncer or a
96 double-bouncer. Relays, bouncers, and double-bouncers are examples of
97 pre-delivery automailers.
99 A pre-delivery automailer produces at most one output.
101 The basic weapon against pre-delivery mail loops is gravity. A normal
102 message always moves closer to its envelope recipient, according to a
103 notion of distance defined in section 3.1. If it bounces before
104 reaching the recipient, it turns into a bounce message, which always
105 moves closer to the original envelope sender. If that in turn
106 bounces, it turns into a double bounce, which always moves closer to
107 a local postmaster. (Triple bounces do not exist.)
112 The distance from a DNS domain D to a recipient U@R is defined as
113 follows, when R has an MX list: the minimum preference of D in the
114 MX list, or 100000 if D does not appear in the list.
116 When R has no MX records, the distance from R to U@R is defined as 0,
117 and the distance from any other domain to U@R is defined as 100000.
119 Exception: If R is an alias, i.e., if R has a CNAME record, the
120 distance from any domain to U@R is defined as 500000.
122 The distance from a host H to U@R is defined as the minimum distance
123 to U@R from any domain that touches H. (``D touches H'' means ``D has
124 an A record listing one of H's IP addresses.'')
126 Exception: If H does not accept mail from the network, its distance
127 to any recipient is defined as 999999.
132 A relay is a pre-delivery automailer that sends the output towards
133 the envelope recipient. What this means for intra-host relays is not
134 discussed here. What this means for cross-host relays is the
135 following: if the relay is at host H, and it sends its output to host
136 T, then the distance from T to the output envelope recipient is
137 always smaller than the distance from H to the input envelope
140 The following facts guarantee that certain cross-host relay behavior
141 is safe. For proofs of these facts, see Appendix A.
143 Fact 1: If R is an alias for X, X is not an alias, D touches T,
144 and T accepts mail from the network, then the distance from T to
145 U@X is smaller than the distance from H to U@R.
147 Fact 2: If R is not an alias, R has no MX records, H is not
148 touched by R, T is touched by R, and T accepts mail from the
149 network, then T is closer to U@R than H is.
151 Fact 3: If R is not an alias, R has an MX record with domain X and
152 preference p, H is not touched by any of the domains in the MX
153 list for R with preference <= p, T is touched by X, and T accepts
154 mail from the network, then T is closer to U@R than H is.
156 Also, a host that does not accept mail from the network can relay
157 messages to a nearby hub.
159 A relay adds a new Received header field to the top of the output.
160 Other than this, the output header, body, and envelope are exactly
161 the same as the input header, body, and envelope. Exception: If the
162 input envelope recipient is U@R, R is an alias for X, and X is not
163 an alias, the output envelope recipient is U@X.
168 A bouncer is a pre-delivery automailer that lets the envelope sender
169 know what happened to a message. Most bouncers send failure notices.
170 Some bouncers, such as vacation servers and echo servers, send
173 In a bouncer's output, the envelope sender is <>, and the envelope
174 recipient is the input envelope sender. A bouncer refuses to operate
175 if the input envelope sender is <> or <#@[]>.
177 Some mailers on the Internet do not understand the <> convention. In
178 fact, some mailers will rewrite <> as <@host>. So any message with an
179 envelope recipient of <> or <@host> is discarded upon local delivery.
181 Unlike a relay, a bouncer produces output with a new header, not
182 simply a copy of the input header. For example:
184 (envelope) from <> to <djb@silverton.berkeley.edu>
185 Date: 2 Jan 1996 03:38:25 GMT
186 From: DELIVERY NOTICE SYSTEM <MAILER-DAEMON@heaven.af.mil>
187 To: djb@silverton.berkeley.edu
188 Subject: failure notice
190 However, the body of the bounce indicates the relevant input envelope
191 recipient, as well as the Message-ID of the input, if the input had a
192 Message-ID. The body of a failure notice includes a copy of the
193 entire input message.
198 A double-bouncer is a pre-delivery automailer that informs a local
199 postmaster of permanent failures to deliver bounce messages. Such
200 failures are generally caused by poorly configured hosts that produce
201 normal messages with faulty envelope sender addresses.
203 A double-bouncer refuses to operate unless the input envelope sender
204 is <>. The output envelope sender from a double-bouncer is <#@[]>;
205 note that <#@[]> cannot be used as an SMTP envelope sender under
206 RFC 821. The output envelope recipient is predetermined.
208 Note that double bounces are not suggested by RFC 1123. However,
209 faulty envelope sender addresses are usually configuration errors
210 that can and should be fixed. Some postmasters, faced with mail
211 software that throws away double bounces, resort to keeping copies of
212 all bounces; but single bounces are rarely the postmaster's problem.
215 4. Post-delivery automailers
217 Conceptually: The input is a message that has reached its envelope
218 recipient address. It is fed to a post-delivery automailer at that
221 The basic weapon against post-delivery loops is a new header field,
222 Delivered-To, tracing all the forwarders and mailing lists that a
223 message has been through. This field has the side benefit of making
224 it much easier for a user (or for a postmaster seeing a bounce) to
225 figure out the path that the message took. Delivered-To is similar to
226 RFC 1327's DL-Expansion-History, but (1) it omits the time stamp,
227 removing any need for parsing, and (2) it has a much better name.
230 4.1. Exploders and repliers
232 There are two basic types of post-delivery automailers: exploders,
233 where the output envelope recipients are predetermined; and repliers,
234 where there is just one output, with envelope recipient determined
237 Repliers normally determine the output envelope recipient as either
238 the input Reply-To header field, if it exists; or else the input
239 From header field, if it exists; or else the envelope sender. A
240 replier never produces an output to <> or <#@[]>.
242 Exploders are classified into mailing lists, where the output
243 envelope senders are predetermined, and forwarders, where every
244 output has envelope sender equal to the original envelope sender.
246 Exception: if the input envelope sender is <> or <#@[]>, then the
247 output envelope senders are equal to the input envelope sender, even
250 Note that, if the envelope sender of a mailing list with M bad
251 addresses is another exploder with E bad addresses, the local
252 postmaster will receive EM double bounces for each message to the
258 Every post-delivery automailer adds a new Delivered-To header field
259 to the top of each output.
261 The contents of the Delivered-To field are typically the address of
262 the automailer, i.e., the input envelope recipient, conventionally
263 without any quoting. The contents of the Delivered-To field are in
264 any case entirely predetermined. The automailer checks if exactly the
265 same Delivered-To field already appears in the header; if so, it
268 A post-delivery automailer preserves existing Delivered-To and
269 Received fields. In fact, a post-delivery automailer generally
270 preserves all header fields. The exceptions are limited to known
271 fields that are not used for loop detection and that must be removed
272 for correct operation. For example, a replier generally changes the
273 body of a message and thus should not preserve the SVR4
274 Content-Length field.
279 Aliases and mailing lists are highly dangerous, because they can
280 generate several outputs for each input.
282 Here is an extreme example. A user has three accounts, and wants any
283 message to any of the accounts to be delivered to all three. So he
284 forwards luser@host1 to luser@host2 and luser@host3, forwards
285 luser@host2 to luser@host1 and luser@host3, and forwards luser@host3
286 to luser@host1 and luser@host2.
288 Without Delivered-To, someone who sends a message to luser@host1 will
289 receive a practically infinite series of bounces. For example, with a
290 hop count limit of 50, the sender will receive 1125899906842624
293 If all the hosts, or two out of the three, support Delivered-To, the
294 message will bounce just a few times. If just one of the hosts
295 supports Delivered-To, it will be the unfortunate victim of a loop
296 between the other two hosts---although the total number of bounces
297 will drop from practically infinite down to a few hundred, with
298 typical hop count limits.
301 Appendix A. Proofs of correctness for MX handling
303 Section 3.2 states three facts about the notion of distance defined
304 in section 3.1. Here are mathematical proofs of those facts.
306 Symbols: D, E, R, and X are domains; H and T are hosts; p and q are
307 nonnegative integers. {} is the empty set.
309 Hypotheses: M(R), the ``MX list for R,'' is a set of pairs (p,D)
310 where p <= 65535. There is a set A of domains, called ``aliases.''
311 There is a relation D->H, called ``D touches H.'' There is a set N of
312 hosts, called ``hosts that accept mail from the network.''
314 Definitions: m(D,R) = min { p: p = 100000 or (p,D) in M(R) } when
315 M(R) is nonempty. When M(R) is empty, m(D,R) is 0 if D = R, 100000
316 otherwise. f(D,R) is defined as 500000 if R is in A, m(D,R)
317 otherwise; this is the ``distance from D to U@R,'' for any U. g(H,R)
318 is defined as min { f(D,R): D->H } if H is in N, 999999 otherwise;
319 this is the ``distance from H to U@R,'' for any U.
321 Fact 1 (generalized): If R is in A, X is not in A, D->T, and T is in
322 N, then g(T,X) < g(H,R). Proof: R is in A, so f(E,R) = 500000 for any
323 E; thus g(H,R) >= 500000. X is not in A, so f(D,X) = m(D,X) <=
324 100000; hence g(T,X) <= f(D,X) <= 100000 < g(H,R).
326 Fact 2: If R is not in A, M(R) = {}, R->T, T is in N, and not R->H,
327 then g(T,R) < g(H,R). Proof: f(R,R) = m(R,R) = 0 since R is not in A
328 and M(R) = {}. T is in N so g(T,R) <= f(R,R) = 0 so g(T,R) = 0.
329 Suppose that g(H,R) <= g(T,R). Then g(H,R) = 0, so f(D,R) = 0 for
330 some D with D->H, so m(D,R) = 0. But then D = R by definition of m,
331 so R->H. Contradiction. Thus g(T,R) < g(H,R).
333 Fact 3: If R is not in A, (p,X) is in M(R), X->T, T is in N, and
334 (q,D) is not in M(R) whenever D->H and q <= p, then g(T,R) < g(H,R).
335 Proof: First m(X,R) <= p. R is not in A, so f(X,R) = m(X,R). T is in
336 N, so g(T,R) <= f(X,R). Thus g(T,R) <= p. Suppose that g(H,R) <= p.
337 Then f(D,R) <= p for some D with D->H, so m(D,R) <= p. But then
338 (m(D,R),D) is in M(R). Contradiction. Thus g(T,R) <= p < g(H,R).