Please note that this file is not called ``Internet Mail For Dummies.''
It _records_ my thoughts on various issues. It does not _explain_ them.
Paragraphs are not organized except by section. The required background
varies wildly from one paragraph to the next.

In this file, ``sendmail'' means Allman's creation; ``sendmail-clone''
means the program in this package.


1. Security

There are lots of interesting remote denial-of-service attacks on any
mail system. A long-term solution is to insist on prepayment for
unauthorized resource use. The tricky technical problem is to make the
prepayment enforcement mechanism cheaper than the expected cost of the
attacks. (For local denial-of-service attacks it's enough to be able to
figure out which user is responsible.)

qmail-send's log was originally designed for profiling. It subsequently
sprouted some tracing features. However, there's no way to verify
securely that a particular message came from a particular local user;
how do you know the recipient is telling you the truth about the
contents of the message? With QUEUE_EXTRA it'd be possible to record a
one-way hash of each outgoing message, but a user who wants to send
``bad'' mail can avoid qmail entirely.
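
A minimal sketch of the one-way-hash idea, in Python. The choice of
SHA-256 and the names message_digest and matches_log are assumptions made
for illustration; how a logging process would receive its copy of each
message through QUEUE_EXTRA is not shown.

```python
import hashlib


def message_digest(message: bytes) -> str:
    """Return a hex SHA-256 digest of the raw message bytes."""
    return hashlib.sha256(message).hexdigest()


def matches_log(claimed: bytes, logged_digest: str) -> bool:
    """Check a message shown by a recipient against a digest logged at send time."""
    return message_digest(claimed) == logged_digest


# Hypothetical example: the digest is logged when the message leaves.
original = b"From: alice\nTo: bob\n\nhello\n"
logged = message_digest(original)

# Later, a recipient presents a message and claims it is what was sent.
genuine = matches_log(original, logged)
altered = matches_log(original + b"forged line\n", logged)
```

Here genuine is true and altered is false, so the log can confirm or
refute a recipient's claim about the contents. It still says nothing about
mail that never passed through the queue.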

I originally decided on security grounds not to put qmail advertisements
into SMTP responses: advertisements often act as version identifiers.
But this problem went away when I found a stable qmail URL.

As qmail grows in popularity, the mere knowledge that rcpthosts is so
easily available will deter people from setting up unauthorized MXs.
(I've never seen an unauthorized MX, but I can imagine that it would be
rather annoying.) Note that, unlike the bat book checkcompat() kludge,
rcpthosts doesn't interfere with mailing lists.

qmail-start doesn't bother with tty dissociation. On some old machines
this means that random people can send tty signals to the qmail daemons.
That's a security flaw in the job control subsystem, not in qmail.

The resolver library isn't too bloated (before 4.9.4, at least), but it
uses stdio, which _is_ bloated. Reading /etc/resolv.conf costs lots of
memory in each qmail-remote process. So it's tempting to incorporate a
smaller resolver library into qmail. (Bonus: I'd avoid system-specific
problems with old resolvers.) The problem is that I'd then be writing a
fundamentally insecure library. I'd no longer be able to blame the BIND
authors and vendors for the fact that attackers can easily use DNS to
steal mail. Possible solution: replace dns.c with something that passes
requests (reliably!) to a local daemon; call the original resolver
library from that daemon.

NFS is the primary enemy of security partitioning under UNIX. Here's the
story. Sun knew from the start that NFS was completely insecure. It
tried to hide that fact by disallowing root access over NFS. Intruders
nevertheless broke into system after system, first obtaining bin access
and then obtaining root access. Various people thus decided to compound
Sun's error and build a wall between root and all other users: if all
system files are owned by root, and if there are no security holes other
than NFS, someone who breaks in via NFS won't be able to wipe out the
operating system---he'll merely be able to wipe out all user files. This
clueless policy means that, for example, all the qmail users have to be
replaced by root. See what I mean by ``enemy''? ... Basic NFS comments:
Aside from the cryptographic problem of having hosts communicate
securely, it's obvious that there's an administrative problem of mapping
client uids to server uids. If a host is secure and under your control,
you shouldn't have to map anything. If a host is under someone else's
control, you'll want to map his uids to one local account; it's his
client's job to decide which of his users get to talk NFS in the first
place. Sun's original map---root to nobody, everyone else left alone---
is, as far as I can tell, always wrong.


2. Injecting mail locally (qmail-inject, sendmail-clone)

RFC 822 section 3.4.9 prohibits certain visual effects in headers.
qmail-inject doesn't waste the time to enforce this absurd restriction.
If you will suffer from someone sending you ``flash mail,'' go find a
better mail reader.

qmail-inject's ``Cc: recipient list not shown: ;'' successfully stops
sendmail from adding Apparently-To. Unfortunately, old versions of
sendmail will append a host name. This wasn't fixed until sendmail 8.7.
How many years has it been since RFC 822 came out?

sendmail discards duplicate addresses. This has probably resulted in
more lost and stolen mail over the years than the entire Chicago branch
of the United States Postal Service. The qmail system delivers messages
exactly as it's told to do. Along the same lines: qmail-inject is both
unable and unwilling to support anything like sendmail's (default)
nometoo option. Of course, a list manager could support nometoo.

There should be a mechanism in qmail-inject that does for envelope
recipients what Return-Path does for the envelope sender. Then
qmail-inject -n could print the recipients.

Should qmail-inject bounce messages with no recipients? Should there be
an option for this? If it stays as is (accept the message), qmail-inject
could at least avoid invoking qmail-queue.

It is possible to extract non-unique Message-IDs out of qmail-inject.
Here's how: stop qmail-inject before it gets to the third line of
main(), then wait until the pids wrap around, then restart qmail-inject
and blast the message through, then start another qmail-inject with the
same pid in the same second. I'm not sure how to fix this. (Of course,
the user could just type in his own non-unique Message-IDs.)
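
The collision can be illustrated with a sketch that assumes a Message-ID
built only from a one-second timestamp, the process id, and the host name.
The exact layout produced by make_message_id below is an assumption for
illustration, not a statement of qmail-inject's format.

```python
import time


def make_message_id(now: int, pid: int, host: str) -> str:
    """Build an ID from second-resolution time, pid, and host (assumed layout)."""
    stamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(now))
    return f"<{stamp}.{pid}.qmail@{host}>"


# Process A was stopped early, resumed much later, and read the clock at `now`.
# Process B started in that same second, after the pids wrapped around to 4242.
now = 1_000_000_000
id_a = make_message_id(now, 4242, "example.org")
id_b = make_message_id(now, 4242, "example.org")
collision = id_a == id_b
```

Nothing in the inputs distinguishes the two processes, so collision is
true. Any fix has to add an input that differs between them.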

The bat book says: ``Rules that hide hosts in a domain should be applied
only to sender addresses.'' Recipient masquerading works fine with
qmail. None of sendmail's pitfalls apply, basically because qmail has a
straight paper path.

I expect to receive some pressure to make up for the failings of MUA
writers who don't understand the concept of reliability. (``Like, duh,
you mean I was supposed to check the sendmail exit code?'')


3. Receiving mail from the network (tcp-env, qmail-smtpd)

RFC 1123 requires VRFY support, but says that it's okay if an
implementation can be configured to not allow VRFY. qmail-smtpd doesn't
allow VRFY. If you desperately want your SMTP server (i.e., inetd) to
provide useful information for VRFY, just compile and install sendmail.
Were the RFC 1123 writers aware of the as-if principle of interface
specification? ... They say that VRFY and EXPN are important for
tracking down cross-host mailing list loops. Catch up to the 1990s,
guys: with Delivered-To, mailing list loops do absolutely no damage,
_and_ one of the list administrators gets a bounce that shows exactly
how the loop occurred. Solve the problem, not the symptom. ... There's a
vastly superior alternative to EXPN. Hint: finger postmaster@ai.mit.edu.
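
A rough sketch of how a Delivered-To check stops a loop, assuming each
local delivery prepends a Delivered-To line for the recipient and refuses a
message that already carries that same line. The functions deliver and
run_loop and the two-address forwarding table are hypothetical.

```python
def deliver(headers, recipient):
    """Return ("bounce", headers) on a repeat visit, else prepend Delivered-To."""
    tag = "Delivered-To: " + recipient
    for line in headers:
        if line.strip().lower() == tag.lower():
            return "bounce", headers
    return "deliver", [tag] + headers


def run_loop(forward, start, max_hops=10):
    """Follow forwarding rules until a delivery bounces or max_hops is reached."""
    headers, here, trail = [], start, []
    for _ in range(max_hops):
        action, headers = deliver(headers, here)
        trail.append((here, action))
        if action == "bounce":
            break
        here = forward[here]
    return trail, headers


# Two lists that forward to each other.
trail, headers = run_loop({"a@x": "b@y", "b@y": "a@x"}, "a@x")
```

The loop ends on the second visit to a@x, and the accumulated Delivered-To
lines in headers record the path the message took, which is what the
bounce would show to a list administrator.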

Should dns.c make special allowances for 127.0.0.1/localhost?

badmailfrom (like 8BITMIME) is a waste of code space.


4. Adding messages to the queue (qmail-queue)

Should qmail-queue try to make sure enough disk space is free in
advance? When qmail-queue is invoked by qmail-local or (with ESMTP)
qmail-smtpd or qmail-qmtpd, it could be told a size in advance. I wish
UNIX had an atomic allocate-disk-space routine...

The qmail.h interface (reflecting the qmail-queue interface, which in
turn reflects the current queue file structure) is constitutionally
incapable of handling an address that contains a 0 byte. I can't imagine
that this will be a problem.
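
The 0-byte limitation follows from using 0 as the terminator. The sketch
below assumes the envelope layout described in the qmail-queue
documentation as I understand it: the sender preceded by F and ended by a
0 byte, each recipient preceded by T and ended by a 0 byte, and one extra
0 byte at the end. The function names are hypothetical.

```python
def encode_envelope(sender, recipients):
    """Encode an envelope; refuse any address containing a 0 byte."""
    for addr in [sender, *recipients]:
        if b"\0" in addr:
            raise ValueError("address contains a 0 byte")
    out = b"F" + sender + b"\0"
    for rcpt in recipients:
        out += b"T" + rcpt + b"\0"
    return out + b"\0"


def decode_envelope(data):
    """Split on 0 bytes; an empty field marks the end of the list."""
    fields = data.split(b"\0")
    sender = fields[0][1:]
    recipients = []
    for field in fields[1:]:
        if not field:
            break
        recipients.append(field[1:])
    return sender, recipients


# If a 0 byte were allowed inside an address, one recipient would decode as two.
smuggled = decode_envelope(b"Fa@x\0" + b"T" + b"b@y\0Tevil@z" + b"\0\0")
```

In this sketch smuggled contains two recipients, b@y and evil@z, even
though only one address was written, which is why the encoder has to reject
such input.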

Should qmail-queue not bother queueing a message with no recipients?


5. Handling queued mail (qmail-send, qmail-clean)

The queue directory must be local. Mounting it over NFS is extremely
dangerous---not that this stops people from running sendmail that way!
Perhaps it is worth putting together a diskless-host qmail package with
just qmail-inject and an SMTP client in place of qmail-queue. Sending
mail to the server via SMTP is of course vastly better than trying to do
anything over NFS. If the NFS server is up but the mail server is down,
users will just have to wait.

Queue reliability demands that single-byte writes be atomic. This is
true for a fixed-block filesystem such as UFS, and for a logging
filesystem such as LFS.

qmail-send uses 8 bytes of memory per queued message. Double that for
reallocation. (Fix: use a small forest of heaps; i.e., keep several
prioqs.) Double again for buddy malloc()s. (Fix: be clever about the
heap sizes.) 32 bytes is worrisome, but not devastating. Even on my
disk-heavy memory-light machine, I'd run out of inodes long before
running out of memory.
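
The arithmetic behind the 32-byte figure, as a small sketch. The function
name and the example queue size are illustrative; the 8-byte base and the
two doublings come from the paragraph above.

```python
def worst_case_bytes(messages, base=8, realloc_factor=2, malloc_factor=2):
    """Worst-case memory for the per-message table in qmail-send."""
    return messages * base * realloc_factor * malloc_factor


per_message = worst_case_bytes(1)          # 8 * 2 * 2
million_queue = worst_case_bytes(1_000_000)
```

With these factors per_message is 32, and a queue of a million messages
would need 32,000,000 bytes in the worst case.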

Some mail systems organize the queue by host. This is pointless as a
means of splitting up the queue directory. The real issue is what to do
when you suddenly find out that a host is up. For local SLIP/PPP links
you know in advance which hosts need this treatment, so you can handle
them with virtualdomains and serialmail.

For the old queue structure I implemented recipient list compression:
if mail goes out to a giant mailing list, and most of the recipients are
delivered, make a new, compressed, todo list. But this really isn't
worth the effort: it saves only a tiny bit of CPU time.

qmail-send doesn't have any notions of precedence, priority, fairness,
importance, etc. It handles the queue in first-seen-first-served order.
One could put a lot of work into doing something different, but that
work would be a waste: given the triggering mechanism and qmail's
deferral strategy, it is exceedingly rare for the queue to contain more
than one deliverable message at any given moment.

Exception: Even with all the concurrency tricks, qmail-send can end up
spending a few minutes on a mailing list with thousands of remote
entries. A user might send a new message to a remote address in the
meantime. Perhaps qmail-send should limit its work per message to, say,
thirty recipients at a time. This will require some way to mark recipients
who were already done on this pass. Possible approach: Maintain two todo
lists (for both L and R). Always work on the earlier todo list. Move
deferrals to the other todo list.
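
A sketch of the two-todo-list approach, assuming a per-turn budget counted
in delivery attempts. The Message class, its method names, and the attempt
callback are hypothetical; the real queue keeps its todo lists on disk, not
in memory.

```python
from collections import deque


class Message:
    def __init__(self, recipients):
        self.current = deque(recipients)  # earlier list: not yet tried this pass
        self.later = deque()              # deferrals, kept for the next pass

    def work(self, attempt, budget):
        """Try up to `budget` recipients; return the ones delivered."""
        delivered = []
        while self.current and budget > 0:
            rcpt = self.current.popleft()
            budget -= 1
            if attempt(rcpt):
                delivered.append(rcpt)
            else:
                self.later.append(rcpt)   # deferred: moves to the other list
        return delivered

    def pass_finished(self):
        return not self.current

    def start_next_pass(self):
        """Swap lists so that deferred recipients become the new pass."""
        self.current, self.later = self.later, deque()
```

Because a recipient is only ever in one list, being in the later list is
the mark that it was already handled on this pass. The scheduler can leave
a big message after its budget runs out, serve other messages, and resume
where it stopped.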

qmail-send will never start a pass for a job that it already has. This
means that, if one delivery takes longer than the retry interval, the
next pass will be delayed. I implemented the opposite strategy for the
old queue structure. Some hassles: mark() had to understand how job
input was buffered; every new delivery had to check whether the same
mpos in the same message was already being done.

Some things that qmail-send does synchronously: queueing a bounce
message; doing a cleanup via qmail-clean; classifying and rewriting all
the addresses in a new message. As usual, making these asynchronous
would require some housekeeping, but could speed things up a bit.
(Making bounces asynchronous, without POSIX waitpid(), means that
wait_pid() has to keep a buffer of previous wait()s. Ugh.)

fsync() is a bottleneck. To make this asynchronous would require gobs of
dedicated output processes whose only purpose in life is to watch data
get written to the disk. Inconceivable! (``You keep using that word. I
do not think that word means what you think it means.'')

On the other hand, I could survive without fsync()ing the local and
remote and info files as long as I don't unlink todo. This would require
redefining the queue states. I need to see how much speed can be gained.

Currently qmail-send sends at most one bounce message for each incoming
message. This means that the sender doesn't get flooded with copies of
his own message. On the other hand, a single slow address can hold up
bounces for a bunch of fast addresses. It would be easy to call
injectbounce() more often. What is the best strategy? This feels like
the TCP-buffering issue... don't want to pepper the other guy with
little packets, but do want to get the data across.

qmail-stop implementation: setuid to UID_SEND; kill -TERM -1. Given how
simple this is, I'm not inclined to set up some tricky locking solution
where qmail-send records its pid etc. But I just know that, if I provide
this qmail-stop program, someone will screw himself by making another
uid the same as UID_SEND, or making UID_SEND be root, or whatever.
Aargh. Maybe use another named pipe... New solution: Run qmail-start
under an external service controller---it runs in the foreground now.

Bounce messages could include more statistical information in the first
paragraph: when I received the message, how many recipients I was
supposed to handle, how many I successfully dealt with, how many I
already told you about, how many are still in the queue. Have to
emphasize that the number of recipients _here_ is perhaps less than the
number of recipients on the original message.

The readdir() interface hides I/O errors. Lower-level interfaces would
lead me into a thicket of portability problems. I'm really not sure what
to do about this. Of course, a hard I/O error means that mail is toast,
but a soft I/O error shouldn't cause any trouble.

job_open() or pass_dochan() could be paranoid about the same id,channel
already being open; but, since messdone() is so paranoid, the worst
possible effect of a bug along these lines would be double delivery.

Mathematical amusement: The optimal retry schedule is essentially,
though not exactly, independent of the actual distribution of message
delay times. What really matters is how much cost you assign to retries
and to particular increases in latency. qmail's current quadratic retry
schedule says that an hour-long delay in a day-old message is worth the
same as a ten-minute delay in an hour-old message; this doesn't seem so
unreasonable.
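
To see how a quadratic schedule stretches the wait as a message ages, here
is a sketch that assumes the next attempt happens at age
(sqrt(age) + step)^2 seconds. Both that formula and the step of 20 are
illustrative assumptions, not values taken from this file.

```python
import math


def next_retry_age(age, step):
    """Age in seconds at which the next attempt is made (assumed formula)."""
    return (math.isqrt(age) + step) ** 2


def gap(age, step):
    """Seconds a message of the given age waits until its next attempt."""
    return next_retry_age(age, step) - age


HOUR, DAY = 3600, 86400
hour_old_gap = gap(HOUR, 20)
day_old_gap = gap(DAY, 20)
ratio = day_old_gap / hour_old_gap
```

Under these assumptions the gap grows roughly with the square root of the
age, so a day-old message waits several times longer between attempts than
an hour-old one. That is the same order as the hour-versus-ten-minutes
trade described above, not an exact match.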

Insider information: AOL retries their messages every five minutes for
three days straight. Hmmm.


6. Sending mail through the network (qmail-rspawn, qmail-remote)

Are there any hosts, anywhere, whose mailers are bogged down by huge
messages to multiple recipients at a single host? For typical hosts,
multiple RCPTs per SMTP aren't an ``efficiency feature''; they're a
_slowness_ feature. Separate SMTP transactions have much lower latency.

The multiple-RCPT bandwidth gain _might_ be noticeable for a machine
that sends most messages to a smarthost. It would be easy to have
qmail-rspawn supply qmail-remote with all the addresses at once, as long
as qmail-send says when it's about to block... Putting recipients into
the right order is clearly the UA's job. One multiple-RCPT pitfall is
that a remote host might not be able to deal with (say) 10 recipients,
even though RFC 821 says everyone has to be able to handle 100;
qmail-rspawn would have to notice this and back off. (Not that other
mailers do. Sometimes I'm amazed Internet mail works at all.)

In the opposite direction: It's tempting to remove the @host part of the
qmail-remote recip argument. Or at least avoid double-dns_cname.

There are lots of reasons that qmail-rspawn should take a more active
role in qmail-remote's activities. It should call separate programs to
do (1) MX lookups, (2) SMTP connections, (3) QMTP connections.

I bounce ambiguous MXs. (An ``ambiguous MX'' is a best-preference MX
record sending me mail for a host that I don't recognize as local.)
Automatically treating ambiguous MXs as local is incompatible with my
design decision to keep local delivery working when the network goes
down. It puts more faith in DNS than DNS deserves. Much better: Have
your MX records generated automatically from control/locals.

If I successfully connect to an MX host but it temporarily refuses to
accept the message, I give up and put the message back into the queue.
But several documents seem to suggest that I should try further MX
records. What are they thinking? My approach deals properly with downed
hosts, hosts that are unreachable through a firewall, and load
balancing; what else do people use multiple MX records for?
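
The MX policy described above can be sketched as follows. The function
try_mx_list, the connect callback, and the three outcome strings are
hypothetical names chosen for illustration; permanent failures are left out
because the paragraph does not discuss them.

```python
def try_mx_list(mxs, connect):
    """Walk MX hosts in preference order; stop at the first that answers.

    mxs is a list of (preference, host) pairs. connect(host) returns
    "unreachable", "tempfail", or "accepted".
    """
    for _pref, host in sorted(mxs):
        result = connect(host)
        if result == "unreachable":
            continue                  # downed or firewalled host: try the next MX
        if result == "accepted":
            return "delivered", host
        if result == "tempfail":
            return "deferred", host   # connected but refused for now: requeue
        raise ValueError("unexpected outcome: %r" % (result,))
    return "deferred", None           # no MX could be reached at all


# Hypothetical outcomes for three MX hosts.
outcomes = {"mx1": "unreachable", "mx2": "tempfail", "mx3": "accepted"}
status = try_mx_list([(10, "mx1"), (20, "mx2"), (30, "mx3")], outcomes.get)
```

In this example mx1 is skipped, mx2 answers with a temporary refusal, and
the message goes back to the queue without mx3 ever being tried.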

Currently qmail-remote sends data in 1024-byte buffers. Perhaps it
should try to take account of the MTU.

Perhaps qmail-remote should allocate a fixed amount of DNS/connect()
time across any number of MXs; this idea is due to Mark Delany.

RFC 821 doesn't say what it means by ``text.'' qmail-remote assumes that
the server's reply text doesn't contain bare LFs.


7. Delivering mail locally (qmail-lspawn, qmail-local)

qmail-local doesn't support comsat. comsat is a pointless abomination.
Use qbiff if you want that kind of notification.

The getpwnam() interface hides I/O errors. Solution: qmail-pw2u.


8. sendmail V8's new features

sendmail-8.8.0/doc/op/op.me includes a list of big improvements of
sendmail 8.8.0 over sendmail 5.67. Here's how qmail stacks up against
each of those improvements. (Of course, qmail has its own improvements,
but that's not the point of this list.)

Connection caching, MX piggybacking: Nope. (Profile. Don't speculate.)

Response to RCPT command is fast: Yup.

IP addresses show up in Received lines: Yup.

Self domain literal is properly handled: Yup.

Different timeouts for QUIT, RCPT, etc.: No, just a single timeout.

Proper <> handling, route-address pruning: Yes, but not configurable.

ESMTP support: Yup. (Server-side, including PIPELINING.)

8-bit clean: Yup. (Including server-side 8BITMIME support; same as
sendmail with the 8 option.)

Configurable user database: Yup.

BIND support: Yup.

Keyed files: Yes, in qmsmac.

931/1413/Ident/TAP: Yup.

Correct 822 address list parsing: Yup. (Note that sendmail still has
some major problems with quoting.)

List-owner handling: Yup.

Dynamic header allocation: Yup.

Minimum number of disk blocks: Yes, via tunefs -m.

Checkpointing: Yes, but not configurable---qmail always checkpoints.

Error message configuration: Nope.

GECOS matching: Not directly, but easy to hook in.

Hop limit configuration: No. (qmail's limit is 100 hops. qmail offers
automatic loop protection much more advanced than hop counting.)

MIME error messages: No. (qmail uses QSBMF error messages, which are
much easier to parse.)

Forward file path: Yes, via /etc/passwd.

Incoming SMTP configuration: Yes, via inetd or tcpserver.

Privacy options: Yes, but they're not options.

Best-MX mangling: Nope. See section 6 for further discussion.

7-bit mangling: Nope. qmail always uses 8 bits.

Support for up to 20 MX records: Yes, and more. qmail has no limits
other than memory.

Correct quoting of name-and-address headers: Yup.

VRFY and EXPN now different: Nope. qmail always hides this information.

Multi-word classes, deferred macro expansion, separate envelope/header
$g processing, separate per-mailer envelope and header processing, new
command line flags, new configuration lines, new mailer flags, new
macros: These are sendmail-specific; they wouldn't even make sense for
qmail. For example, _of course_ qmail handles envelopes and headers
separately; they're almost entirely different objects!


9. Miscellany

sendmail-clone and qsmhook are too bletcherous to be documented. (The
official replacement for qsmhook is preline, together with the
qmail-command environment variables.)

I've considered making install atomic, but this is very difficult to do
right, and pointless if it isn't done right.

RN suggests automatically putting together a reasonable set of lines for
/etc/passwd. I perceive this as getting into the adduser business, which
is worrisome: I'll be lynched the first time I screw up somebody's
passwd file. This should be left to OS-specific installation scripts.

The BSD 4.2 inetd didn't allow a username. I think I can safely forget
about this. (DS notes that the username works under Ultrix even though
it's undocumented.)

I should clean up the bput/put choices.

Some of the stralloc_0()s indicate that certain lower-level routines
should grok stralloc.

RN suggests having qlist smash the case of the incoming host name.

K1J suggests that mailing list subscription managers should have
a three-way handshake, to prevent person A from subscribing person B to
a mailing list. qlist doesn't do this, but ezmlm does.
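
One way such a three-way handshake could work, as a sketch: the manager
mails a token to the address being subscribed and adds the address only
when that token comes back. The HMAC construction and the function names
are assumptions for illustration and are not a description of ezmlm's
actual mechanism.

```python
import hashlib
import hmac


def confirmation_token(secret, list_name, address):
    """Token the manager mails to `address`; only that mailbox sees it."""
    msg = f"{list_name}\0{address}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:20]


def confirm(secret, list_name, address, token, subscribers):
    """Add `address` only if the reply carries the token that was mailed to it."""
    expected = confirmation_token(secret, list_name, address)
    if hmac.compare_digest(expected, token):
        subscribers.add(address)
        return True
    return False


# Step 1: somebody (possibly person A) requests a subscription for B.
# Step 2: the manager mails the token to B's address.
# Step 3: only a reply quoting that token completes the subscription.
secret = b"hypothetical-list-secret"
subscribers = set()
mailed = confirmation_token(secret, "qmail-list", "b@example.org")
forged_ok = confirm(secret, "qmail-list", "b@example.org", "guess", subscribers)
real_ok = confirm(secret, "qmail-list", "b@example.org", mailed, subscribers)
```

Person A never sees the token mailed to B, so the forged confirmation fails
and B is subscribed only after B's own reply.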

qmail assumes that all times are positive; that pid_t, time_t and ino_t
fit into unsigned long; that gid_t fits into int; that the character set
is ASCII; and that all pointers are interchangeable. Do I care?

The bat book justifies sendmail's insane line-splitting mechanism by
pointing out that it might be useful for ``a 40-character braille
print-driving program.'' C'mon, guys, is that your best excuse?

qmail's mascot is a dolphin.