Miscellaneous fixes.
[u/mdw/catacomb] / README.cipher
CommitLineData
759513d1 1Symmetric ciphers
2
3
4 Catacomb provides a number of symmetric ciphers, together with a
5 generic cipher interface. More ciphers may be added later.
6
7
8Block cipher interface
9
10 There are a number of block ciphers implemented, all with
11 extremely similar interfaces. However, block ciphers aren't
12 actually at all pleasant to use directly. They're really
13 intended to be used only by higher-level `modes'.
14
15 Anyway, I'll take Bruce Schneier's Blowfish as an example.
16
17 Before doing any encryption or decryption, you need to
18 initialize a `context'. The context data is stored in an object
19 of type `blowfish_ctx'. You initialize the context by calling
20 `blowfish_init' with the address of the context, the address of
21 some key data, and the length of the key.
22
23 Data is encrypted using `blowfish_eblk' and decrypted by
24 `blowfish_dblk'. Both functions are given data as an array of
25 `uint32' words. (Since Blowfish uses 64-bit blocks, you give it
26 arrays of two words.)
27
28 A number of constants are defined to describe further properties
29 of the cipher:
30
31 BLOWFISH_KEYSZ Is zero, to indicate that Blowfish doesn't care
32 much about the size of key you give it.
33
34 BLOWFISH_BLKSZ Is 8, because Blowfish works on 64-bit blocks,
35 which are therefore 8 bytes wide.
36
37 BLOWFISH_CLASS Is the triple (N, B, 64). This is explained
38 below.
39
40 The BLOWFISH_CLASS macro contains information useful to other
41 macros, rather than to direct users of the interface. The three
42 components are:
43
44 The `type' Simply N if specific macros for handling blocks
45 of the appropriate width have been written, or X
46 if the macros should use a loop instead.
47
48 The `endianness'
49 Either `B' for big-endian, or L for little-
50 endian.
51
52 The `width' The cipher's block size in bits.
53
54 This simple interface is thoroughly inconvenient for general
55 use, although it makes writing block ciphers very easy.
56
57
58 The peculiarities of the various ciphers are described below.
59
60 Blowfish Fairly standard, really. Accepts arbitrary-
61 sized keys up to 448 bits. (The original
62 definition only specified keys with a multiple
63 of 32 bits -- the extension I use is due, I
64 think, to Eric Young.) Blowfish is fast and
65 looks very secure.
66
67 IDEA Requires a 128-bit key. Not very fast. No
68 known attacks on the full cipher. Used in
69 PGP2. Patented!
70
71 DES Good old reliable. Been around for donkey's
72 years and still going. Single-DES (implemented
73 here) has a small key, but apart from that is
74 looking remarkably robust. Uses a 56-bit key
75 which may be either 8 bytes with (ignored)
76 parity bits in the bottom bit of each byte, or 7
77 bytes with no parity.
78
79 DES3 Two- or three-key triple DES. Slow, but strong
80 and almost universally trusted. Accepts 56-,
81 112- and 168-bit keys. (56 bits gives you
82 single DES at a third of the speed.) Again,
83 parity may be included or not, so the full range
84 of key sizes in bytes is: 7, 8, 14, 16, 21 or
85 24.
86
87 RC5 Arbitrary-sized key. Designed by Ron Rivest.
88 Not completely convincing in security. About as
89 fast as Blowfish, but with a quicker key
90 schedule. Patented, I think.
91
92
93Block cipher modes
94
95 There are four block cipher modes defined, all of which create a
96 useful cipher from block cipher. Modes are implemented
97 separately from ciphers, so it's easy to add either, and easy to
98 apply modes to ciphers.
99
100 A few definitions will be helpful to explain the modes. Let E
101 denote the encryption function, P be the current plaintext
102 block, C be the current ciphertext, and C' be the previous
103 ciphertext block. Let `XOR' denote the bitwise exclusive or
104 operation.
105
106 Then the modes Electronic Code Book (ECB), Cipher Block Chaining
107 (CBC) and Ciphertext Feedback (CFB) are defined as:
108
109 ECB C = E(P)
110 CBC C = E(P XOR C')
111 CFB C = P XOR E(C')
112
113 Finally, Output Feedback is defined like this: let O be the
114 current output, and O' be the previous output. Then
115
116 OFB O = E(O'), C = P XOR O
117
118 The `previous ciphertext' or `previous output' for the first
119 block is provided by an `initialization vector' or IV.
120
121 The above definitions imply that only data which comes in
122 multiples of the block size can be encrypted. Normally this is
123 the case. However, Catacomb implements all four modes so that
124 almost arbitrary sizes of plaintext can be encrypted (without
125 having to pad out the ciphertext). The details are complicated:
126 read the source, or look up `ciphertext stealing' in your copy
127 of Schneier's `Applied Cryptography'.
128
129 ECB must have *at least* one entire block to work with, but
130 apart from that can cope with odd-size inputs. Both ECB and CBC
131 insert `boundaries' when you encrypt an odd-size input -- you
132 must decrypt in exactly the same-size chunks as you encrypted,
133 otherwise you'll only get rubbish out.
134
135 CFB and OFB have no restrictions on input sizes, and do not
136 normally insert boundaries, although it's possible to explicitly
137 request one.
138
139 Be especially careful with OFB mode. Since it generates an
140 output stream independent of the plaintext, and then XORs the
141 two, if you ever reuse the same key and IV pair, both encrypted
142 messages are compromised. (Take the two ciphertexts, and XOR
143 them together -- then the OFB stream cancels and you have the
144 plaintexts XORed. This is fairly trivial to unravel.)
145
146 OFB mode makes a good random byte generator. See README.random
147 for details about random number generators in Catacomb.
148
149
150 The modes all have similar interfaces. CFB is probably the best
151 example, although CBC is more useful in practice. I'll take
152 Blowfish as my example cipher again.
153
154 You need to initialize a context block. For Blowfish in CFB
155 mode, this is called `blowfish_cfbctx'. You initialize it by
156 passing the context address, a key, the key length, and pointer
157 to an IV (which must be BLOWFISH_BLKSZ in length) to
158 blowfish_cfbinit. If you pass a null pointer instead of an IV,
159 a zero IV is used. This is usually OK for CBC, but bad for OFB
160 or CFB unless you make sure that the key itself is only used
161 once.
162
163 Data is encrypted using blowfish_cfbencrypt and
164 blowfish_cfbdecrypt -- both are given: the address of the
165 context, a pointer to the source data, a pointer to the
166 destination (which may overlap the source) and the size of the
167 data to encrypt or decrypt.
168
169 The IV may be changed by calling blowfish_cfbsetiv. The current
170 IV (really meaning the previous ciphertext) can be obtained with
171 blowfish_cfbgetiv. The key may be changed without altering the
172 IV using blowfish_cfbsetkey. A boundary may be inserted in the
173 ciphertext or plaintext using blowfish_cfbbdry.
174
175 ECB doesn't use IVs, so there aren't ecbsetiv or ecbgetiv
176 calls. You can't insert boundaries in ECB or CBC mode.
177
178 OFB encryption and decryption are the same, so there's no
179 separate ofbdecrypt call. However, ofbencrypt has some useful
180 tricks:
181
182 * If the destination pointer is null, it just churns the output
183 round for a while, without emitting any data.
184
185 * If the source pointer is null, it simply spits out the
186 output blocks from the feedback process. This is equivalent
187 to giving an input full of zero bytes.
188
189
190Implementing new modes: nasty macros
191
192 Block cipher modes are implemented as macros which define the
193 appropriate functions. They're given the prefixes (upper- and
194 lowercase) and expected to get on with life.
195
196 Data can be shunted around fairly efficiently using the BLKC
197 macros. These are fairly ugly, so don't try to work out how
198 they work.
199
200 In the following notation, `b' denotes a pointer to bytes, and
201 `w' and `wx' denote pointers to words. `PRE' is the uppercase
202 cipher prefix. I'll abuse this notation a little and use the
203 names to refer to the entire arrays, since their lengths are
204 known to be PRE_BLKSZ (in bytes) or PRE_BLKSZ / 4 (in words)
205 long.
206
207 BLKC_STORE(PRE, b, w) Set b = w
208 BLKC_XSTORE(PRE, b, w, wx) Set b = w XOR wx
209 BLKC_LOAD(PRE, w, b) Set w = b
210 BLKC_XLOAD(PRE, w, b) Set w = w XOR b
211 BLKC_MOVE(PRE, w, wx) Set w = wx
212 BLKC_XMOVE(PRE, w, wx) Set w = w XOR wx
213
214 These should be enough for most purposes. More can be added,
215 but involves a strong stomach and an ability to do things with C
216 macros which most people wouldn't like to think about over
217 dinner.
218
219
220Other ciphers
221
222 There's only one stream cipher implemented at the moment, and
223 that's RC4. It was designed by Ron Rivest. It's the fastest
224 cipher in Catacomb. It looks fairly strong (although see the
225 note about churning the context after keying below). And also
226 note that it works in output feedback -- you just XOR the output
227 from RC4 with the plaintext. Never reuse an RC4 key!
228
229 RC4 includes an OFB-like interface which should be familiar. It
230 also includes a pair of strange macros RC4_OPEN and RC4_BYTE.
231 These are used to actually get bytes out of the RC4 generator.
232
233 RC4_OPEN is really a new syntactic form. If `r' is a pointer to
234 an RC4 context, then
235
236 RC4_OPEN(r, <statements>);
237
238 executes <statements> within the opened RC4 context. The
239 significance of this is that the expression RC4_BYTE(x) extracts
240 the next byte from the innermost open context, and stores it in
241 x. The standard RC4 encrypt function is written in terms of
242 RC4_OPEN and RC4_BYTE.
243
244 RC4 makes an excellent and fast random-byte generator.
245
246 RSA Data Security Inc. claim that RC4 is a trade secret of
247 theirs. It doesn't look very secret to me.
248
249
250Generic cipher interfaces
251
252 It can be convenient to implement routines where the cipher to
253 use is a parameter. Hence, Catacomb provides a generic
254 interface to (symmetric) ciphers. The generic interface is
255 defined in <catacomb/gcipher.h>.
256
257 The basic type in the interface is `gcipher', which represents
258 an `instance' of a cipher. You don't see lone cipher objects,
259 only pointers to them, so really everything's in terms of
260 `gcipher *'.
261
262 A `gcipher' is a structured type with one member, called `ops'
263 which points to a collection of functions and other useful
264 information. If `c' is a cipher...
265
266 c->ops->b->name Name of the cipher being used
267
268 c->ops->b->keysz Key size in bytes (or zero for
269 `don't care')
270
271 c->ops->b->blksz Block size in bytes (or zero for
272 `not a block cipher')
273
274 c->ops->encrypt(c, s, t, sz) Encrypt the sz bytes stored in
275 s, and store the ciphertext at
276 t.
277
278 c->ops->decrypt(c, s, t, sz) Like encrypt, only it decrypts.
279
280 c->ops->destroy(c) Destroys the cipher object `r'.
281
282 c->ops->setiv(c, iv) Sets the IV to be `iv' -- must
283 be blksz bytes long.
284
285 c->ops->bdry(c) Inserts a boundary.
286
287 Note that `setiv' and `bdry' aren't implemented by all ciphers
288 so these may be null pointers. It's best to check first.
289
290 Generic cipher instances are created from `generic cipher
291 classes' (type `gccipher' -- note the extra `c'). This contains
292 two members:
293
294 b The `class base' -- this is the object pointed
295 to by `c->ops->b', and contains `name', `keysz'
296 and `blksz' members.
297
298 init The constructor. You give it a pointer to some
299 key data and the key size, and it returns a
300 generic cipher instance.
301
302 Note that new generic ciphers always have zero IVs (if they
303 understand the concept), so you may need to call setiv if you
304 want to reuse keys.
305
306 Always remember to destroy gcipher instances when you're
307 finished with them.
308
309 The generic cipher class for CBC-mode Blowfish is called
310 `blowfish_cbc' -- the others are named similarly. The RC4
311 generic cipher class is called simply `rc4'.
312
313
314--
315[mdw]
316
317\f
318Local variables:
319mode: text
320End: