Import ezmlm-idx 0.40
[ezmlm] / sub_mysql / README
CommitLineData
f8beb284
MW
1$Id: README,v 1.3 1999/02/20 20:05:19 lindberg Exp $
2$Name: ezmlm-idx-040 $
3INFORMATION ON BUILDING/USING EZMLM WITH MYSQL SUPPORT
4
5(c) 1999, Frederik Lindberg,
6 lindberg@id.wustl.edu
7 You may use under GPL.
8
9For information on MySQL, see http://www.tcx.se.
10
11Most of this information is available in FAQ.idx.
12
13If you are interested in contributing/testing a subscriber db interface
14for another SQL server, please see sub_std/README and the routines here,
15and contact lindberg@id.wustl.edu (it may already be in process). See end
16of this file for other ways to contribute.
17
18conf-mysql must be edited to reflect your system. On many systems, you
19also need to include ``-lsocket'', as well as change the paths to the
20/usr/local equivalents. For the i386.rpm-based systems, you need at
21least MySQL-devel to build the files. Look at your mysql docs for more info.
22
23TABLES USED FOR (My)SQL SUPPORT
24
25The basic philosophy is that the database can be on any host (if you use
26SENDER restrictions, connectivity to the main host is more important than
27to the sublists), and you choose the database and "table root" names. The
28default database is ``ezmlm'' and the default table root is ``list''. Each
29list has a separate table root. Any number of lists can share a database.
30
31The main list address table is named with the table root only, others have
32that name with various suffixes. In the following ``list'' is used as the
33table root.
34
35ADDRESS TABLES
36list subscriber addresses
37list_slog subscriber address log
38list_allow subscriber aliases for posts on SENDER checked lists.
39list_allow_slog subscriber log for list_allow
40list_deny blacklisted addresses for posts on SENDER checked lists.
41list_deny_slog log for list_deny.
42list_mod moderator addresses
43list_mod_slog log for list_mod
44list_digest subscriber log for digest list.
45list_digest_slog log for list_digest
46
47MESSAGE LOGGING TABLES
48list_cookie message cookie table for main list
49list_mlog message logging table for main list
50list_digest_cookie message cookie table for digest list
51list_digest_mlog message logging table for digest list
52
53SUBLIST SPLIT TABLES
54list_name sublist split table for main list
55list_digest_name sublist split table for digest list.
56
57
58ezmlm-mktab(1) is a script that outputs the table definintions. Look at
59the output for a detailed field description.
60
61The address tables contain (address,domain,hash,h,num). For normal
62lists only the address field is used. For main->sublist clusters, the other
63fields are used for load splitting. The domain is the first up to 3 characters
64of the last part of the domain name. The hash is a address hash [0-52] differnt
65from the one used by ezmlm for splitting within DIR/subscribers. When using
66the address field as a primary key, the size of the index was unreasonable.
67Therefore, ``num'' is used as a dummy primary key, and ``h'' (a 32 bit hash
68of the address) is used as an index. This markedly speeds up (un)sub with
69large (>30,000 rows) subscriber tables.
70
71The *_slog tables contain the same info as DIR/Log, i.e. address, timestamp,
72entry-type, entry-direction, and fromline. The entry-type is the first letter
73of the type of entry (probe, manual, `` '' for normal), entry-direction is
74``+'' for addition, ``-'' for removal. Fromline is the From: header contents
75taken from the subscribe confirm message or from ezmlm-sub (if used with -n).
76It is blank for all address removals, and may be blank also for additions. It
77is used by the list-log.xx command. It is trivial to JOIN this table with the
78address table to get e.g. subsciber names, subscription dates, etc. These
79tables also have the 32-bit hash ``h'' as an index. Joins should be done on
80``h'' as well as ``address'' for better performance.
81
82The *_cookie tables contain message number, timestamp, and cookie. For each
83message a pseudo-random cookie is generated that is ``impossible'' to guess
84beforehand. For lists with sublists, this is used as basic authentication,
85i.e. the sublist will refuse to process a message that doesn't contain the
86correct cookie or that the sublist has already successfully processed.
87
88The *_mlog tables contain log entries from main and sublists. These are
89timestamp, listno, done. Listno is the lowest listnumber for an active list
90entry with the name of this sublist as looked up in the *_name table. Done
91is -1 for bounce, 0 for arrived, 1 for finished processing, and 2 for receipt
92received. The routines are set up so that only the first attempt for each
93combination (listno,code) is logged.
94
95The *_name tables contain listno,name,domain,hash_lo,hash_hi,msgnum_lo,
96msgnum_hi,notuse. Listno is auto_increment and unique. Name is the name of the
97sublist. domain is the last up to 3 characters of the top domain name for
98addresses served by this list (default = ''). It is is '', the list servers
99all_domains_that_are_not_served_by_another list (in addition to domain '').
100Of the addresses that match the domain criterion, the list serves the subset
101with hash between hash_lo and hash_hi (defaults 0, 52). Any entry is ingnored if
102notuse != 0 OR the current message number is not between msgnum_lo and
103msgnum_hi.
104
105For normal lists that are not distributed (i.e. they are a single list),
106entries in the *_name tables are not needed and logging is not very
107relevant.
108
109For most lists, the only addresses that are stored in the SQL database are
110the subscribers of list and digest, and the ``allow'' aliases. It is NOT
111normally advisable to store moderator addresses there, since they are
112needed only at the main list and secrecy is more important. ``Deny'' addresses
113are few and again only needed at the main list. ``Allow'' are put in the
114SQL database when using the default ezmlmrc file only to make all relevant
115addresses manipulatable via the SQL server. The other tables are created, in
116case they are wanted (the cost for having them as empty table is zero). The
117basedir/sql file is the decision point. If it exists, an SQL table is used;
118if not a local ezmlm db is used.
119
120CONTRIBUTIONS REQESTED
121
122I would be very grateful if there are users out there willing to do any of
123the following and contribute it to this package. Please check with me first
124(lindberg@id.wustl.edu), as the project may already be in progress/done.
125
1261. Interfaces for other SQL servers. Oracle, SyBase, ...
127
1282. A GUI admin utility to add/remove/manipulate the sublist split, essentially
129 by modifying list_[digest_]name in a safe way. Ideally WWW if it can be
130 done securely. If you use some standard interface (JDBC/DBD) it would be
131 useful also with other SQL severs. This could even be an Access program
132 using ODBC, although writing it for a platform running qmail/ezmlm makes
133 most sense.
134
1353. a WWW GUI that allows users to subscribe/unsubscribe in a safe way. A random
136 password would be created the first time and stored in a new address->pw
137 table and mailed to the subscriber address. With that password, the user
138 would be able to [un]subscribe to lists, edit the name (for compatibility
139 implemented by adding a subscribe line to list_[digest]slog). Add/remove
140 aliases. Ideally, it should also allow searching by subscriber name. This
141 would search *_slog.fromline. If less that 'x' alternatives are found, the
142 user would be presented with names (not addresses), allowing the user to
143 cause the subscription name and password to be sent to the respective
144 subscription address. With that info, the subscriber can then unsubscribe,
145 even if s/he has forgotten the subscription address. It is complicated
146 slightly by the fact that ``fromline'' is the crude line and needs to be
147 rfc822 parsed. Again, use of a standard interface is encouraged to make it
148 compatible also with other SQL servers.
149
150The aim of all this is to make it easy to use ezmlm to run very large lists,
151easy to set up sites that handle subscriber interaction, archive access, etc,
152and hopefully easier to integrate many ezmlm as done by some WWW sites today.
153