Upstream qmail 1.01
Please note that this file is not called ``Internet Mail For Dummies.''
It _records_ my thoughts on various issues. It does not _explain_ them.
Paragraphs are not organized except by section. The required background
varies wildly from one paragraph to the next.

In this file, ``sendmail'' means Allman's creation; ``sendmail-clone''
means the program in this package.


1. Security

There are lots of interesting remote denial-of-service attacks on any
mail system. A long-term solution is to insist on prepayment for
unauthorized resource use. The tricky technical problem is to make the
prepayment enforcement mechanism cheaper than the expected cost of the
attacks. (For local denial-of-service attacks it's enough to be able to
figure out which user is responsible.)

qmail-send's log was originally designed for profiling. It subsequently
sprouted some tracing features. However, there's no way to verify
securely that a particular message came from a particular local user;
how do you know the recipient is telling you the truth about the
contents of the message? With QUEUE_EXTRA it'd be possible to record a
one-way hash of each outgoing message, but a user who wants to send
``bad'' mail can avoid qmail entirely.
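
A minimal sketch of the one-way-hash idea, assuming the extra recipient
configured through QUEUE_EXTRA pipes each message into a small logging
program. SHA-256, the function names and the record layout are
assumptions for illustration; the text above names no hash function and
no log format.

```python
import hashlib


def message_digest(message: bytes) -> str:
    """Return a hex one-way hash of the raw message bytes."""
    return hashlib.sha256(message).hexdigest()


def log_record(label: str, message: bytes) -> str:
    """Format one hypothetical log line: a caller-chosen label plus the digest."""
    return f"{label} sha256={message_digest(message)}"
```

A later complaint could then be checked by hashing the message as it was
queued (without header lines added in transit) and looking for the
digest in the log. This covers only mail that went through qmail.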

I originally decided on security grounds not to put qmail advertisements
into SMTP responses: advertisements often act as version identifiers.
But this problem went away when I found a stable qmail URL.

As qmail grows in popularity, the mere knowledge that rcpthosts is so
easily available will deter people from setting up unauthorized MXs.
(I've never seen an unauthorized MX, but I can imagine that it would be
rather annoying.) Note that, unlike the bat book checkcompat() kludge,
rcpthosts doesn't interfere with mailing lists.

qmail-start doesn't bother with tty dissociation. On some old machines
this means that random people can send tty signals to the qmail daemons.
That's a security flaw in the job control subsystem, not in qmail.

The resolver library isn't too bloated (before 4.9.4, at least), but it
uses stdio, which _is_ bloated. Reading /etc/resolv.conf costs lots of
memory in each qmail-remote process. So it's tempting to incorporate a
smaller resolver library into qmail. (Bonus: I'd avoid system-specific
problems with old resolvers.) The problem is that I'd then be writing a
fundamentally insecure library. I'd no longer be able to blame the BIND
authors and vendors for the fact that attackers can easily use DNS to
steal mail. Possible solution: replace dns.c with something that passes
requests (reliably!) to a local daemon; call the original resolver
library from that daemon.

NFS is the primary enemy of security partitioning under UNIX. Here's the
story. Sun knew from the start that NFS was completely insecure. It
tried to hide that fact by disallowing root access over NFS. Intruders
nevertheless broke into system after system, first obtaining bin access
and then obtaining root access. Various people thus decided to compound
Sun's error and build a wall between root and all other users: if all
system files are owned by root, and if there are no security holes other
than NFS, someone who breaks in via NFS won't be able to wipe out the
operating system---he'll merely be able to wipe out all user files. This
clueless policy means that, for example, all the qmail users have to be
replaced by root. See what I mean by ``enemy''? ... Basic NFS comments:
Aside from the cryptographic problem of having hosts communicate
securely, it's obvious that there's an administrative problem of mapping
client uids to server uids. If a host is secure and under your control,
you shouldn't have to map anything. If a host is under someone else's
control, you'll want to map his uids to one local account; it's his
client's job to decide which of his users get to talk NFS in the first
place. Sun's original map---root to nobody, everyone else left alone---
is, as far as I can tell, always wrong.


2. Injecting mail locally (qmail-inject, sendmail-clone)

RFC 822 section 3.4.9 prohibits certain visual effects in headers.
qmail-inject doesn't waste the time to enforce this absurd restriction.
If you will suffer from someone sending you ``flash mail,'' go find a
better mail reader.

qmail-inject's ``Cc: recipient list not shown: ;'' successfully stops
sendmail from adding Apparently-To. Unfortunately, old versions of
sendmail will append a host name. This wasn't fixed until sendmail 8.7.
How many years has it been since RFC 822 came out?

sendmail discards duplicate addresses. This has probably resulted in
more lost and stolen mail over the years than the entire Chicago branch
of the United States Postal Service. The qmail system delivers messages
exactly as it's told to do. Along the same lines: qmail-inject is both
unable and unwilling to support anything like sendmail's (default)
nometoo option. Of course, a list manager could support nometoo.

There should be a mechanism in qmail-inject that does for envelope
recipients what Return-Path does for the envelope sender. Then
qmail-inject -n could print the recipients.

Should qmail-inject bounce messages with no recipients? Should there be
an option for this? If it stays as is (accept the message), qmail-inject
could at least avoid invoking qmail-queue.

It is possible to extract non-unique Message-IDs out of qmail-inject.
Here's how: stop qmail-inject before it gets to the third line of
main(), then wait until the pids wrap around, then restart qmail-inject
and blast the message through, then start another qmail-inject with the
same pid in the same second. I'm not sure how to fix this. (Of course,
the user could just type in his own non-unique Message-IDs.)
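
A small sketch of why that sequence collides, assuming the Message-ID is
derived only from the clock second, the process id and the host name.
The exact format string and the example values are hypothetical; only
the dependence on time and pid is taken from the paragraph above.

```python
def message_id(second: int, pid: int, host: str) -> str:
    """Build an ID from the clock second and the process id (illustrative format)."""
    return f"<{second}.{pid}.qmail@{host}>"


# The stopped process is resumed and reads the clock at second 1000.
first = message_id(1000, 4242, "example.org")
# The pid counter has wrapped, so a new process gets pid 4242 in the same second.
second = message_id(1000, 4242, "example.org")
collision = first == second
```

Any ID built purely from (second, pid) has this property; avoiding it
would need some further input that differs between the two processes.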

The bat book says: ``Rules that hide hosts in a domain should be applied
only to sender addresses.'' Recipient masquerading works fine with
qmail. None of sendmail's pitfalls apply, basically because qmail has a
straight paper path.

I expect to receive some pressure to make up for the failings of MUA
writers who don't understand the concept of reliability. (``Like, duh,
you mean I was supposed to check the sendmail exit code?'')


3. Receiving mail from the network (tcp-env, qmail-smtpd)

RFC 1123 requires VRFY support, but says that it's okay if an
implementation can be configured to not allow VRFY. qmail-smtpd doesn't
allow VRFY. If you desperately want your SMTP server (i.e., inetd) to
provide useful information for VRFY, just compile and install sendmail.
Were the RFC 1123 writers aware of the as-if principle of interface
specification? ... They say that VRFY and EXPN are important for
tracking down cross-host mailing list loops. Catch up to the 1990s,
guys: with Delivered-To, mailing list loops do absolutely no damage,
_and_ one of the list administrators gets a bounce that shows exactly
how the loop occurred. Solve the problem, not the symptom. ... There's a
vastly superior alternative to EXPN. Hint: finger postmaster@ai.mit.edu.
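
A minimal sketch of loop detection with Delivered-To, assuming the
delivering program adds one Delivered-To line per delivery and refuses
a message that already carries the line for the same address. The
function name, the return values and the exact-match comparison are
hypothetical simplifications.

```python
from typing import List, Tuple


def deliver(headers: List[str], recipient: str) -> Tuple[str, List[str]]:
    """Return ("bounce", headers) on a loop, else ("deliver", new headers)."""
    marker = f"Delivered-To: {recipient}"
    if any(line.strip() == marker for line in headers):
        # The message has been here before: stop the loop with a bounce.
        return ("bounce", headers)
    # First visit: record it at the top of the header and deliver.
    return ("deliver", [marker] + headers)
```

Because the bounce carries the accumulated Delivered-To lines, it shows
the path the message took around the loop.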

Should dns.c make special allowances for 127.0.0.1/localhost?

badmailfrom (like 8BITMIME) is a waste of code space.


4. Adding messages to the queue (qmail-queue)

Should qmail-queue try to make sure enough disk space is free in
advance? When qmail-queue is invoked by qmail-local or (with ESMTP)
qmail-smtpd or qmail-qmtpd, it could be told a size in advance. I wish
UNIX had an atomic allocate-disk-space routine...
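
A sketch of what an advance check could look like, assuming the caller
passes in the announced message size. The function name and the reserve
parameter are hypothetical. The check is only advisory: nothing stops
another process from using the space between the check and the write,
which is the missing atomic routine the paragraph wishes for.

```python
import shutil


def has_room(queue_dir: str, message_size: int, reserve: int = 0) -> bool:
    """Report whether the filesystem holding queue_dir has enough free bytes now."""
    free = shutil.disk_usage(queue_dir).free
    return free >= message_size + reserve
```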

The qmail.h interface (reflecting the qmail-queue interface, which in
turn reflects the current queue file structure) is constitutionally
incapable of handling an address that contains a 0 byte. I can't imagine
that this will be a problem.
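
A sketch of why a 0 byte cannot be carried, assuming the envelope layout
that qmail-queue reads: the sender prefixed with F, each recipient
prefixed with T, every address terminated by a 0 byte, and one more
0 byte at the end. The function name is hypothetical.

```python
from typing import List


def encode_envelope(sender: bytes, recipients: List[bytes]) -> bytes:
    """Encode an envelope as 0-terminated F/T records."""
    for addr in [sender, *recipients]:
        if b"\0" in addr:
            # The terminator is the 0 byte itself, so it cannot appear in an address.
            raise ValueError("address contains a 0 byte")
    out = b"F" + sender + b"\0"
    for rcpt in recipients:
        out += b"T" + rcpt + b"\0"
    return out + b"\0"
```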

Should qmail-queue not bother queueing a message with no recipients?


5. Handling queued mail (qmail-send, qmail-clean)

The queue directory must be local. Mounting it over NFS is extremely
dangerous---not that this stops people from running sendmail that way!
Perhaps it is worth putting together a diskless-host qmail package with
just qmail-inject and an SMTP client in place of qmail-queue. Sending
mail to the server via SMTP is of course vastly better than trying to do
anything over NFS. If the NFS server is up but the mail server is down,
users will just have to wait.

Queue reliability demands that single-byte writes be atomic. This is
true for a fixed-block filesystem such as UFS, and for a logging
filesystem such as LFS.

qmail-send uses 8 bytes of memory per queued message. Double that for
reallocation. (Fix: use a small forest of heaps; i.e., keep several
prioqs.) Double again for buddy malloc()s. (Fix: be clever about the
heap sizes.) 32 bytes is worrisome, but not devastating. Even on my
disk-heavy memory-light machine, I'd run out of inodes long before
running out of memory.
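
The arithmetic above, written out as a sketch. The three factors come
from the paragraph; the constant and function names are hypothetical.

```python
BYTES_PER_MESSAGE = 8   # per queued message, from the text
REALLOC_FACTOR = 2      # doubling for reallocation
BUDDY_FACTOR = 2        # doubling again for buddy malloc()s


def worst_case_bytes(queued_messages: int) -> int:
    """Worst-case memory for the given number of queued messages."""
    return queued_messages * BYTES_PER_MESSAGE * REALLOC_FACTOR * BUDDY_FACTOR
```

At 32 bytes each, a million queued messages would need about 32
million bytes.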

Some mail systems organize the queue by host. This is pointless as a
means of splitting up the queue directory. The real issue is what to do
when you suddenly find out that a host is up. For local SLIP/PPP links
you know in advance which hosts need this treatment, so you can handle
them with virtualdomains and serialmail.

For the old queue structure I implemented recipient list compression:
if mail goes out to a giant mailing list, and most of the recipients are
delivered, make a new, compressed, todo list. But this really isn't
worth the effort: it saves only a tiny bit of CPU time.

qmail-send doesn't have any notions of precedence, priority, fairness,
importance, etc. It handles the queue in first-seen-first-served order.
One could put a lot of work into doing something different, but that
work would be a waste: given the triggering mechanism and qmail's
deferral strategy, it is exceedingly rare for the queue to contain more
than one deliverable message at any given moment.

Exception: Even with all the concurrency tricks, qmail-send can end up
spending a few minutes on a mailing list with thousands of remote
entries. A user might send a new message to a remote address in the
meantime. Perhaps qmail-send should limit its time per message to,
say, thirty recipients. This will require some way to mark recipients
who were already done on this pass. Possible approach: Maintain two todo
lists (for both L and R). Always work on the earlier todo list. Move
deferrals to the other todo list.
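
A sketch of the two-list approach for one channel, assuming in-memory
lists rather than queue files. The names run_pass, current, later and
attempt are hypothetical; the limit of thirty is the figure suggested
above.

```python
from collections import deque
from typing import Callable, Deque, List


def run_pass(current: Deque[str], later: Deque[str],
             attempt: Callable[[str], bool], limit: int = 30) -> List[str]:
    """Try at most `limit` recipients from `current`; move deferrals to `later`."""
    done = []
    for _ in range(min(limit, len(current))):
        rcpt = current.popleft()
        if attempt(rcpt):
            done.append(rcpt)      # delivered: drops off both lists
        else:
            later.append(rcpt)     # deferred: already tried on this pass
    return done
```

Whatever is still in current has not been tried on this pass, so no
separate per-recipient mark is needed. When current is empty the two
lists swap roles for the next pass.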

qmail-send will never start a pass for a job that it already has. This
means that, if one delivery takes longer than the retry interval, the
next pass will be delayed. I implemented the opposite strategy for the
old queue structure. Some hassles: mark() had to understand how job
input was buffered; every new delivery had to check whether the same
mpos in the same message was already being done.

Some things that qmail-send does synchronously: queueing a bounce
message; doing a cleanup via qmail-clean; classifying and rewriting all
the addresses in a new message. As usual, making these asynchronous
would require some housekeeping, but could speed things up a bit.
(Making bounces asynchronous, without POSIX waitpid(), means that
wait_pid() has to keep a buffer of previous wait()s. Ugh.)

fsync() is a bottleneck. To make this asynchronous would require gobs of
dedicated output processes whose only purpose in life is to watch data
get written to the disk. Inconceivable! (``You keep using that word. I
do not think that word means what you think it means.'')

On the other hand, I could survive without fsync()ing the local and
remote and info files as long as I don't unlink todo. This would require
redefining the queue states. I need to see how much speed can be gained.

Currently qmail-send sends at most one bounce message for each incoming
message. This means that the sender doesn't get flooded with copies of
his own message. On the other hand, a single slow address can hold up
bounces for a bunch of fast addresses. It would be easy to call
injectbounce() more often. What is the best strategy? This feels like
the TCP-buffering issue... don't want to pepper the other guy with
little packets, but do want to get the data across.

qmail-stop implementation: setuid to UID_SEND; kill -TERM -1. Given how
simple this is, I'm not inclined to set up some tricky locking solution
where qmail-send records its pid etc. But I just know that, if I provide
this qmail-stop program, someone will screw himself by making another
uid the same as UID_SEND, or making UID_SEND be root, or whatever.
Aargh. Maybe use another named pipe... New solution: Run qmail-start
under an external service controller---it runs in the foreground now.

Bounce messages could include more statistical information in the first
paragraph: when I received the message, how many recipients I was
supposed to handle, how many I successfully dealt with, how many I
already told you about, how many are still in the queue. Have to
emphasize that the number of recipients _here_ is perhaps less than the
number of recipients on the original message.

The readdir() interface hides I/O errors. Lower-level interfaces would
lead me into a thicket of portability problems. I'm really not sure what
to do about this. Of course, a hard I/O error means that mail is toast,
but a soft I/O error shouldn't cause any trouble.

job_open() or pass_dochan() could be paranoid about the same id,channel
already being open; but, since messdone() is so paranoid, the worst
possible effect of a bug along these lines would be double delivery.

Mathematical amusement: The optimal retry schedule is essentially,
though not exactly, independent of the actual distribution of message
delay times. What really matters is how much cost you assign to retries
and to particular increases in latency. qmail's current quadratic retry
schedule says that an hour-long delay in a day-old message is worth the
same as a ten-minute delay in an hour-old message; this doesn't seem so
unreasonable.
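
A sketch of a quadratic schedule of the kind described, assuming the
n-th retry falls (skip*n)^2 seconds after the message arrived. The
function names and the value skip = 20 are assumptions for
illustration, not values taken from this file.

```python
import math
from typing import List


def next_retry(birth: int, now: int, skip: int) -> int:
    """Next retry time in seconds: step the square root of the age by `skip`."""
    n = math.isqrt(max(now - birth, 0)) + skip
    return birth + n * n


def schedule(skip: int, count: int) -> List[int]:
    """First `count` retry times for a message that arrived at time 0."""
    times, t = [], 0
    for _ in range(count):
        t = next_retry(0, t, skip)
        times.append(t)
    return times
```

With skip = 20 the first retries fall at 400, 1600 and 3600 seconds.
The gap after a retry at age t is 2*skip*sqrt(t) + skip^2, so the
tolerated delay grows with the square root of the message's age.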

Insider information: AOL retries their messages every five minutes for
three days straight. Hmmm.


6. Sending mail through the network (qmail-rspawn, qmail-remote)

Are there any hosts, anywhere, whose mailers are bogged down by huge
messages to multiple recipients at a single host? For typical hosts,
multiple RCPTs per SMTP aren't an ``efficiency feature''; they're a
_slowness_ feature. Separate SMTP transactions have much lower latency.

The multiple-RCPT bandwidth gain _might_ be noticeable for a machine
that sends most messages to a smarthost. It would be easy to have
qmail-rspawn supply qmail-remote with all the addresses at once, as long
as qmail-send says when it's about to block... Putting recipients into
the right order is clearly the UA's job. One multiple-RCPT pitfall is
that a remote host might not be able to deal with (say) 10 recipients,
even though RFC 821 says everyone has to be able to handle 100;
qmail-rspawn would have to notice this and back off. (Not that other
mailers do. Sometimes I'm amazed Internet mail works at all.)

In the opposite direction: It's tempting to remove the @host part of the
qmail-remote recip argument. Or at least avoid double-dns_cname.

There are lots of reasons that qmail-rspawn should take a more active
role in qmail-remote's activities. It should call separate programs to
do (1) MX lookups, (2) SMTP connections, (3) QMTP connections.

I bounce ambiguous MXs. (An ``ambiguous MX'' is a best-preference MX
record sending me mail for a host that I don't recognize as local.)
Automatically treating ambiguous MXs as local is incompatible with my
design decision to keep local delivery working when the network goes
down. It puts more faith in DNS than DNS deserves. Much better: Have
your MX records generated automatically from control/locals.
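
A sketch of generating MX records from the list of local names, assuming
control/locals holds one fully qualified name per line and that the
output is pasted into a BIND-style zone file. The function name, the
preference value 0 and the mail host name are hypothetical.

```python
from typing import Iterable, List


def mx_lines(local_names: Iterable[str], mail_host: str,
             preference: int = 0) -> List[str]:
    """Emit one zone-file MX record per non-empty line of the locals list."""
    lines = []
    for raw in local_names:
        name = raw.strip()
        if not name:
            continue  # skip blank lines
        lines.append(f"{name}. IN MX {preference} {mail_host}.")
    return lines
```

Usage would be along the lines of mx_lines(open("control/locals"),
"mail.example.org"), so that DNS follows the local configuration rather
than the other way around.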

If I successfully connect to an MX host but it temporarily refuses to
accept the message, I give up and put the message back into the queue.
But several documents seem to suggest that I should try further MX
records. What are they thinking? My approach deals properly with downed
hosts, hosts that are unreachable through a firewall, and load
balancing; what else do people use multiple MX records for?
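
A sketch of that policy, assuming the MX list is already resolved and
that connecting and sending are supplied as callbacks. The names
try_delivery, connect and send, and the three result strings, are
hypothetical; ties in preference are broken by host name here purely
for simplicity.

```python
from typing import Callable, List, Tuple


def try_delivery(mxs: List[Tuple[int, str]],
                 connect: Callable[[str], bool],
                 send: Callable[[str], str]) -> str:
    """Return "success", "failure" or "deferred" for one delivery attempt."""
    for _, host in sorted(mxs):
        if not connect(host):
            continue          # down or unreachable: move on to the next MX
        # The first host that answers decides the outcome. A temporary
        # refusal is returned as "deferred"; no further MX is tried.
        return send(host)
    return "deferred"         # nobody answered: back into the queue
```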

Currently qmail-remote sends data in 1024-byte buffers. Perhaps it
should try to take account of the MTU.

Perhaps qmail-remote should allocate a fixed amount of DNS/connect()
time across any number of MXs; this idea is due to Mark Delany.

RFC 821 doesn't say what it means by ``text.'' qmail-remote assumes that
the server's reply text doesn't contain bare LFs.


7. Delivering mail locally (qmail-lspawn, qmail-local)

qmail-local doesn't support comsat. comsat is a pointless abomination.
Use qbiff if you want that kind of notification.

The getpwnam() interface hides I/O errors. Solution: qmail-pw2u.


8. sendmail V8's new features

sendmail-8.8.0/doc/op/op.me includes a list of big improvements of
sendmail 8.8.0 over sendmail 5.67. Here's how qmail stacks up against
each of those improvements. (Of course, qmail has its own improvements,
but that's not the point of this list.)

Connection caching, MX piggybacking: Nope. (Profile. Don't speculate.)

Response to RCPT command is fast: Yup.

IP addresses show up in Received lines: Yup.

Self domain literal is properly handled: Yup.

Different timeouts for QUIT, RCPT, etc.: No, just a single timeout.

Proper <> handling, route-address pruning: Yes, but not configurable.

ESMTP support: Yup. (Server-side, including PIPELINING.)

8-bit clean: Yup. (Including server-side 8BITMIME support; same as
sendmail with the 8 option.)

Configurable user database: Yup.

BIND support: Yup.

Keyed files: Yes, in qmsmac.

931/1413/Ident/TAP: Yup.

Correct 822 address list parsing: Yup. (Note that sendmail still has
some major problems with quoting.)

List-owner handling: Yup.

Dynamic header allocation: Yup.

Minimum number of disk blocks: Yes, via tunefs -m.

Checkpointing: Yes, but not configurable---qmail always checkpoints.

Error message configuration: Nope.

GECOS matching: Not directly, but easy to hook in.

Hop limit configuration: No. (qmail's limit is 100 hops. qmail offers
automatic loop protection much more advanced than hop counting.)

MIME error messages: No. (qmail uses QSBMF error messages, which are
much easier to parse.)

Forward file path: Yes, via /etc/passwd.

Incoming SMTP configuration: Yes, via inetd or tcpserver.

Privacy options: Yes, but they're not options.

Best-MX mangling: Nope. See section 6 for further discussion.

7-bit mangling: Nope. qmail always uses 8 bits.

Support for up to 20 MX records: Yes, and more. qmail has no limits
other than memory.

Correct quoting of name-and-address headers: Yup.

VRFY and EXPN now different: Nope. qmail always hides this information.

Multi-word classes, deferred macro expansion, separate envelope/header
$g processing, separate per-mailer envelope and header processing, new
command line flags, new configuration lines, new mailer flags, new
macros: These are sendmail-specific; they wouldn't even make sense for
qmail. For example, _of course_ qmail handles envelopes and headers
separately; they're almost entirely different objects!


9. Miscellany

sendmail-clone and qsmhook are too bletcherous to be documented. (The
official replacement for qsmhook is preline, together with the
qmail-command environment variables.)

I've considered making install atomic, but this is very difficult to do
right, and pointless if it isn't done right.

RN suggests automatically putting together a reasonable set of lines for
/etc/passwd. I perceive this as getting into the adduser business, which
is worrisome: I'll be lynched the first time I screw up somebody's
passwd file. This should be left to OS-specific installation scripts.

The BSD 4.2 inetd didn't allow a username. I think I can safely forget
about this. (DS notes that the username works under Ultrix even though
it's undocumented.)

I should clean up the bput/put choices.

Some of the stralloc_0()s indicate that certain lower-level routines
should grok stralloc.

RN suggests having qlist smash the case of the incoming host name.

K1J suggests that mailing list subscription managers should have
a three-way handshake, to prevent person A from subscribing person B to
a mailing list. qlist doesn't do this, but ezmlm does.
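
A sketch of one way to do such a handshake, assuming the list manager
mails a token to the address being subscribed and completes the
subscription only when that token comes back. The HMAC construction
and all names are assumptions for illustration; this is not a
description of how ezmlm builds its confirmations.

```python
import hashlib
import hmac


def confirmation_token(secret: bytes, list_name: str, address: str) -> str:
    """Token mailed to `address`; only someone who reads that mailbox sees it."""
    msg = f"{list_name}\0{address}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def confirm(secret: bytes, list_name: str, address: str, token: str) -> bool:
    """Accept the subscription only if the returned token matches."""
    expected = confirmation_token(secret, list_name, address)
    return hmac.compare_digest(expected, token)
```

Person A can request a subscription for person B, but the token goes to
B's mailbox, so A cannot complete the third step.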

qmail assumes that all times are positive; that pid_t, time_t and ino_t
fit into unsigned long; that gid_t fits into int; that the character set
is ASCII; and that all pointers are interchangeable. Do I care?

The bat book justifies sendmail's insane line-splitting mechanism by
pointing out that it might be useful for ``a 40-character braille
print-driving program.'' C'mon, guys, is that your best excuse?

qmail's mascot is a dolphin.