Commit | Line | Data |
---|---|---|
f8beb284 MW |
1 | $Id: README,v 1.3 1999/02/20 20:05:19 lindberg Exp $ |
2 | $Name: ezmlm-idx-040 $ | |
3 | INFORMATION ON BUILDING/USING EZMLM WITH MYSQL SUPPORT | |
4 | ||
5 | (c) 1999, Frederik Lindberg, | |
6 | lindberg@id.wustl.edu | |
7 | You may use under GPL. | |
8 | ||
9 | For information on MySQL, see http://www.tcx.se. | |
10 | ||
11 | Most of this information is available in FAQ.idx. | |
12 | ||
13 | If you are interested in contributing/testing a subscriber db interface | |
14 | for another SQL server, please see sub_std/README and the routines here, | |
15 | and contact lindberg@id.wustl.edu (it may already be in process). See end | |
16 | of this file for other ways to contribute. | |
17 | ||
18 | conf-mysql must be edited to reflect your system. On many systems, you | |
19 | also need to include ``-lsocket'', as well as change the paths to the | |
20 | /usr/local equivalents. For the i386.rpm-based systems, you need at | |
21 | least MySQL-devel to build the files. Look at your mysql docs for more info. | |
22 | ||
23 | TABLES USED FOR (My)SQL SUPPORT | |
24 | ||
25 | The basic philosophy is that the database can be on any host (if you use | |
26 | SENDER restrictions, connectivity to the main host is more important than | |
27 | to the sublists), and you choose the database and "table root" names. The | |
28 | default database is ``ezmlm'' and the default table root is ``list''. Each | |
29 | list has a separate table root. Any number of lists can share a database. | |
30 | ||
31 | The main list address table is named with the table root only, others have | |
32 | that name with various suffixes. In the following ``list'' is used as the | |
33 | table root. | |
34 | ||
35 | ADDRESS TABLES | |
36 | list subscriber addresses | |
37 | list_slog subscriber address log | |
38 | list_allow subscriber aliases for posts on SENDER checked lists. | |
39 | list_allow_slog subscriber log for list_allow | |
40 | list_deny blacklisted addresses for posts on SENDER checked lists. | |
41 | list_deny_slog log for list_deny. | |
42 | list_mod moderator addresses | |
43 | list_mod_slog log for list_mod | |
44 | list_digest subscriber log for digest list. | |
45 | list_digest_slog log for list_digest | |
46 | ||
47 | MESSAGE LOGGING TABLES | |
48 | list_cookie message cookie table for main list | |
49 | list_mlog message logging table for main list | |
50 | list_digest_cookie message cookie table for digest list | |
51 | list_digest_mlog message logging table for digest list | |
52 | ||
53 | SUBLIST SPLIT TABLES | |
54 | list_name sublist split table for main list | |
55 | list_digest_name sublist split table for digest list. | |
56 | ||
57 | ||
58 | ezmlm-mktab(1) is a script that outputs the table definintions. Look at | |
59 | the output for a detailed field description. | |
60 | ||
61 | The address tables contain (address,domain,hash,h,num). For normal | |
62 | lists only the address field is used. For main->sublist clusters, the other | |
63 | fields are used for load splitting. The domain is the first up to 3 characters | |
64 | of the last part of the domain name. The hash is a address hash [0-52] differnt | |
65 | from the one used by ezmlm for splitting within DIR/subscribers. When using | |
66 | the address field as a primary key, the size of the index was unreasonable. | |
67 | Therefore, ``num'' is used as a dummy primary key, and ``h'' (a 32 bit hash | |
68 | of the address) is used as an index. This markedly speeds up (un)sub with | |
69 | large (>30,000 rows) subscriber tables. | |
70 | ||
71 | The *_slog tables contain the same info as DIR/Log, i.e. address, timestamp, | |
72 | entry-type, entry-direction, and fromline. The entry-type is the first letter | |
73 | of the type of entry (probe, manual, `` '' for normal), entry-direction is | |
74 | ``+'' for addition, ``-'' for removal. Fromline is the From: header contents | |
75 | taken from the subscribe confirm message or from ezmlm-sub (if used with -n). | |
76 | It is blank for all address removals, and may be blank also for additions. It | |
77 | is used by the list-log.xx command. It is trivial to JOIN this table with the | |
78 | address table to get e.g. subsciber names, subscription dates, etc. These | |
79 | tables also have the 32-bit hash ``h'' as an index. Joins should be done on | |
80 | ``h'' as well as ``address'' for better performance. | |
81 | ||
82 | The *_cookie tables contain message number, timestamp, and cookie. For each | |
83 | message a pseudo-random cookie is generated that is ``impossible'' to guess | |
84 | beforehand. For lists with sublists, this is used as basic authentication, | |
85 | i.e. the sublist will refuse to process a message that doesn't contain the | |
86 | correct cookie or that the sublist has already successfully processed. | |
87 | ||
88 | The *_mlog tables contain log entries from main and sublists. These are | |
89 | timestamp, listno, done. Listno is the lowest listnumber for an active list | |
90 | entry with the name of this sublist as looked up in the *_name table. Done | |
91 | is -1 for bounce, 0 for arrived, 1 for finished processing, and 2 for receipt | |
92 | received. The routines are set up so that only the first attempt for each | |
93 | combination (listno,code) is logged. | |
94 | ||
95 | The *_name tables contain listno,name,domain,hash_lo,hash_hi,msgnum_lo, | |
96 | msgnum_hi,notuse. Listno is auto_increment and unique. Name is the name of the | |
97 | sublist. domain is the last up to 3 characters of the top domain name for | |
98 | addresses served by this list (default = ''). It is is '', the list servers | |
99 | all_domains_that_are_not_served_by_another list (in addition to domain ''). | |
100 | Of the addresses that match the domain criterion, the list serves the subset | |
101 | with hash between hash_lo and hash_hi (defaults 0, 52). Any entry is ingnored if | |
102 | notuse != 0 OR the current message number is not between msgnum_lo and | |
103 | msgnum_hi. | |
104 | ||
105 | For normal lists that are not distributed (i.e. they are a single list), | |
106 | entries in the *_name tables are not needed and logging is not very | |
107 | relevant. | |
108 | ||
109 | For most lists, the only addresses that are stored in the SQL database are | |
110 | the subscribers of list and digest, and the ``allow'' aliases. It is NOT | |
111 | normally advisable to store moderator addresses there, since they are | |
112 | needed only at the main list and secrecy is more important. ``Deny'' addresses | |
113 | are few and again only needed at the main list. ``Allow'' are put in the | |
114 | SQL database when using the default ezmlmrc file only to make all relevant | |
115 | addresses manipulatable via the SQL server. The other tables are created, in | |
116 | case they are wanted (the cost for having them as empty table is zero). The | |
117 | basedir/sql file is the decision point. If it exists, an SQL table is used; | |
118 | if not a local ezmlm db is used. | |
119 | ||
120 | CONTRIBUTIONS REQESTED | |
121 | ||
122 | I would be very grateful if there are users out there willing to do any of | |
123 | the following and contribute it to this package. Please check with me first | |
124 | (lindberg@id.wustl.edu), as the project may already be in progress/done. | |
125 | ||
126 | 1. Interfaces for other SQL servers. Oracle, SyBase, ... | |
127 | ||
128 | 2. A GUI admin utility to add/remove/manipulate the sublist split, essentially | |
129 | by modifying list_[digest_]name in a safe way. Ideally WWW if it can be | |
130 | done securely. If you use some standard interface (JDBC/DBD) it would be | |
131 | useful also with other SQL severs. This could even be an Access program | |
132 | using ODBC, although writing it for a platform running qmail/ezmlm makes | |
133 | most sense. | |
134 | ||
135 | 3. a WWW GUI that allows users to subscribe/unsubscribe in a safe way. A random | |
136 | password would be created the first time and stored in a new address->pw | |
137 | table and mailed to the subscriber address. With that password, the user | |
138 | would be able to [un]subscribe to lists, edit the name (for compatibility | |
139 | implemented by adding a subscribe line to list_[digest]slog). Add/remove | |
140 | aliases. Ideally, it should also allow searching by subscriber name. This | |
141 | would search *_slog.fromline. If less that 'x' alternatives are found, the | |
142 | user would be presented with names (not addresses), allowing the user to | |
143 | cause the subscription name and password to be sent to the respective | |
144 | subscription address. With that info, the subscriber can then unsubscribe, | |
145 | even if s/he has forgotten the subscription address. It is complicated | |
146 | slightly by the fact that ``fromline'' is the crude line and needs to be | |
147 | rfc822 parsed. Again, use of a standard interface is encouraged to make it | |
148 | compatible also with other SQL servers. | |
149 | ||
150 | The aim of all this is to make it easy to use ezmlm to run very large lists, | |
151 | easy to set up sites that handle subscriber interaction, archive access, etc, | |
152 | and hopefully easier to integrate many ezmlm as done by some WWW sites today. | |
153 |