Symmetric ciphers


	Catacomb provides a number of symmetric ciphers, together with a
	generic cipher interface.  More ciphers may be added later.


Block cipher interface

	Catacomb implements a number of block ciphers, all with very
	similar interfaces.  However, block ciphers aren't at all
	pleasant to use directly: they're really intended to be used
	only by higher-level `modes'.

	Anyway, I'll take Bruce Schneier's Blowfish as an example.

	Before doing any encryption or decryption, you need to
	initialize a `context'.  The context data is stored in an object
	of type `blowfish_ctx'.  You initialize the context by calling
	`blowfish_init' with the address of the context, the address of
	some key data, and the length of the key.

	Data is encrypted using `blowfish_eblk' and decrypted by
	`blowfish_dblk'.  Both functions are given data as an array of
	`uint32' words.  (Since Blowfish uses 64-bit blocks, you give it
	arrays of two words.)

	A number of constants are defined to describe further properties
	of the cipher:

	BLOWFISH_KEYSZ	Is zero, to indicate that Blowfish doesn't care
			much about the size of key you give it.

	BLOWFISH_BLKSZ	Is 8, because Blowfish works on 64-bit blocks,
			which are therefore 8 bytes wide.

	BLOWFISH_CLASS	Is the triple (N, B, 64).  This is explained
			below.

	The BLOWFISH_CLASS macro contains information useful to other
	macros, rather than to direct users of the interface.  The three
	components are:

	The `type'	Simply `N' if specific macros for handling blocks
			of the appropriate width have been written, or
			`X' if the macros should use a loop instead.

	The `endianness'
			Either `B' for big-endian, or `L' for
			little-endian.
			
	The `width'	The cipher's block size in bits.

	This simple interface is thoroughly inconvenient for general
	use, although it makes writing block ciphers very easy.
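
	To make the shape of this interface concrete, here is a minimal
	sketch of a toy cipher with the same kind of entry points.
	Everything named `toy_...' or `TOY_...' is hypothetical and not
	part of Catacomb: the cipher is a throwaway Feistel network with
	no security whatever, and the argument lists are only a guess
	based on the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy cipher: NOT Blowfish and NOT secure.  Only the shape
 * of the interface follows the description in the text. */
#define TOY_KEYSZ 0             /* zero: any key length is accepted */
#define TOY_BLKSZ 8             /* 64-bit blocks, i.e. 8 bytes */
#define TOY_ROUNDS 8

typedef struct toy_ctx {
  uint32_t k[TOY_ROUNDS];       /* round keys derived from the user's key */
} toy_ctx;

/* Round function.  A Feistel network never inverts this, so any
 * function of (half-block, round key) will do. */
static uint32_t toy_f(uint32_t x, uint32_t k)
{
  x += k;
  x ^= (x << 7) | (x >> 25);
  return x * 0x9e3779b9u;
}

/* Initialize the context `c' from `sz' bytes of key material at `k'. */
void toy_init(toy_ctx *c, const void *k, size_t sz)
{
  const unsigned char *p = k;
  size_t i;

  assert(c && p && sz > 0);
  for (i = 0; i < TOY_ROUNDS; i++)
    c->k[i] = 0x01234567u * (uint32_t)(i + 1);
  for (i = 0; i < sz; i++) {
    uint32_t *w = &c->k[i % TOY_ROUNDS];
    *w = ((*w << 8) | (*w >> 24)) ^ p[i];
  }
}

/* Encrypt one block: `s' and `d' are arrays of two 32-bit words. */
void toy_eblk(const toy_ctx *c, const uint32_t *s, uint32_t *d)
{
  uint32_t l = s[0], r = s[1], t;
  int i;

  for (i = 0; i < TOY_ROUNDS; i++) {
    t = l ^ toy_f(r, c->k[i]);
    l = r; r = t;
  }
  d[0] = l; d[1] = r;
}

/* Decrypt one block by running the rounds backwards. */
void toy_dblk(const toy_ctx *c, const uint32_t *s, uint32_t *d)
{
  uint32_t l = s[0], r = s[1], t;
  int i;

  for (i = TOY_ROUNDS - 1; i >= 0; i--) {
    t = r ^ toy_f(l, c->k[i]);
    r = l; l = t;
  }
  d[0] = l; d[1] = r;
}
```

	A caller fills in a `toy_ctx' with toy_init and then hands
	two-word arrays to toy_eblk and toy_dblk; decrypting an
	encrypted block gives back the original words.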


	The peculiarities of the various ciphers are described below.

	Blowfish	Fairly standard, really.  Accepts arbitrary-
			sized keys up to 448 bits.  (The original
			definition only specified keys with a multiple
			of 32 bits -- the extension I use is due, I
			think, to Eric Young.)  Blowfish is fast and
			looks very secure.

	IDEA		Requires a 128-bit key.  Not very fast.  No
			known attacks on the full cipher.  Used in
			PGP2.  Patented!

	DES		Good old reliable.  Been around for donkey's
			years and still going.  Single-DES (implemented
			here) has a small key, but apart from that is
			looking remarkably robust.  Uses a 56-bit key
			which may be either 8 bytes with (ignored)
			parity bits in the bottom bit of each byte, or 7
			bytes with no parity.

	DES3		Two- or three-key triple DES.  Slow, but strong
			and almost universally trusted.  Accepts 56-,
			112- and 168-bit keys.  (56 bits gives you
			single DES at a third of the speed.)  Again,
			parity may be included or not, so the full range
			of key sizes in bytes is: 7, 8, 14, 16, 21 or
			24.

	RC5		Arbitrary-sized key.  Designed by Ron Rivest.
			Not completely convincing in security.  About as
			fast as Blowfish, but with a quicker key
			schedule.  Patented, I think.


Block cipher modes

	There are four block cipher modes defined, all of which create a
	useful cipher from a block cipher.  Modes are implemented
	separately from ciphers, so it's easy to add either, and easy to
	apply modes to ciphers.

	A few definitions will be helpful to explain the modes.  Let E
	denote the encryption function, P be the current plaintext
	block, C be the current ciphertext, and C' be the previous
	ciphertext block.  Let `XOR' denote the bitwise exclusive or
	operation.

	Then the modes Electronic Code Book (ECB), Cipher Block Chaining
	(CBC) and Ciphertext Feedback (CFB) are defined as:

	ECB		C = E(P)
	CBC		C = E(P XOR C')
	CFB		C = P XOR E(C')

	Finally, Output Feedback is defined like this: let O be the
	current output, and O' be the previous output.  Then

	OFB		O = E(O'), C = P XOR O

	The `previous ciphertext' or `previous output' for the first
	block is provided by an `initialization vector' or IV.
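
	The four definitions can be written out directly as loops over
	whole blocks.  This is only a sketch under loud assumptions: `E'
	here is a hypothetical and insecure stand-in for a block cipher,
	a `block' is a single 32-bit word, and none of these function
	names exist in Catacomb.  Odd-sized inputs are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the encryption function E, keyed by `k'.
 * It is an invertible map on 32-bit `blocks', and NOT secure. */
static uint32_t E(uint32_t k, uint32_t x)
{
  x ^= k;
  x *= 0x9e3779b1u;             /* odd multiplier: invertible mod 2^32 */
  x ^= x >> 15;
  return x + k;
}

/* ECB: C = E(P) */
void ecb_encrypt(uint32_t k, const uint32_t *p, uint32_t *c, size_t n)
{
  size_t i;
  assert(p && c);
  for (i = 0; i < n; i++) c[i] = E(k, p[i]);
}

/* CBC: C = E(P XOR C'), with C' = iv for the first block */
void cbc_encrypt(uint32_t k, uint32_t iv,
                 const uint32_t *p, uint32_t *c, size_t n)
{
  uint32_t prev = iv;
  size_t i;
  assert(p && c);
  for (i = 0; i < n; i++) prev = c[i] = E(k, p[i] ^ prev);
}

/* CFB: C = P XOR E(C') */
void cfb_encrypt(uint32_t k, uint32_t iv,
                 const uint32_t *p, uint32_t *c, size_t n)
{
  uint32_t prev = iv;
  size_t i;
  assert(p && c);
  for (i = 0; i < n; i++) prev = c[i] = p[i] ^ E(k, prev);
}

/* CFB decryption: P = C XOR E(C').  Only E is needed, never its inverse. */
void cfb_decrypt(uint32_t k, uint32_t iv,
                 const uint32_t *c, uint32_t *p, size_t n)
{
  uint32_t prev = iv, t;
  size_t i;
  assert(p && c);
  for (i = 0; i < n; i++) { t = c[i]; p[i] = t ^ E(k, prev); prev = t; }
}

/* OFB: O = E(O'), C = P XOR O.  Decryption is the same operation. */
void ofb_encrypt(uint32_t k, uint32_t iv,
                 const uint32_t *p, uint32_t *c, size_t n)
{
  uint32_t o = iv;
  size_t i;
  assert(p && c);
  for (i = 0; i < n; i++) { o = E(k, o); c[i] = p[i] ^ o; }
}
```

	Note that ECB maps equal plaintext blocks to equal ciphertext
	blocks, which is exactly what the feedback in the other three
	modes is there to avoid.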

	The above definitions imply that only data which comes in
	multiples of the block size can be encrypted.  Normally this is
	the case.  However, Catacomb implements all four modes so that
	almost arbitrary sizes of plaintext can be encrypted (without
	having to pad out the ciphertext).  The details are complicated:
	read the source, or look up `ciphertext stealing' in your copy
	of Schneier's `Applied Cryptography'.

	ECB must have *at least* one entire block to work with, but
	apart from that can cope with odd-size inputs.  Both ECB and CBC
	insert `boundaries' when you encrypt an odd-size input -- you
	must decrypt in exactly the same-size chunks as you encrypted,
	otherwise you'll only get rubbish out.

	CFB and OFB have no restrictions on input sizes, and do not
	normally insert boundaries, although it's possible to explicitly
	request one.

	Be especially careful with OFB mode.  Since it generates an
	output stream independent of the plaintext, and then XORs the
	two, if you ever reuse the same key and IV pair, both encrypted
	messages are compromised.  (Take the two ciphertexts, and XOR
	them together -- then the OFB stream cancels and you have the
	plaintexts XORed.  This is fairly trivial to unravel.)
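
	The cancellation is easy to check for yourself.  The sketch
	below assumes a hypothetical toy keystream generator in place of
	real OFB output (all names are invented, and the generator is
	not a cipher); the only property that matters is that the stream
	depends on the key and IV and never on the plaintext.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OFB: XOR `n' bytes of `s' into `d' with a
 * stream that depends only on (key, iv).  NOT a real cipher. */
static void keystream_xor(unsigned key, unsigned iv,
                          const unsigned char *s, unsigned char *d, size_t n)
{
  unsigned o = iv;
  size_t i;

  for (i = 0; i < n; i++) {
    o = (o ^ key) * 1103515245u + 12345u;
    d[i] = s[i] ^ (unsigned char)(o >> 16);
  }
}

/* Encrypt p1 and p2 under the SAME key and IV, and return nonzero if
 * c1 XOR c2 equals p1 XOR p2, i.e. if the keystream has cancelled. */
int stream_cancels(const unsigned char *p1, const unsigned char *p2,
                   size_t n, unsigned key, unsigned iv)
{
  unsigned char c1[64], c2[64];
  size_t i;

  assert(n <= sizeof(c1));
  keystream_xor(key, iv, p1, c1, n);
  keystream_xor(key, iv, p2, c2, n);
  for (i = 0; i < n; i++)
    if ((c1[i] ^ c2[i]) != (p1[i] ^ p2[i])) return 0;
  return 1;
}
```

	stream_cancels returns nonzero for any pair of messages, whatever
	the key and IV: an eavesdropper holding both ciphertexts never
	needs to learn either.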

	OFB mode makes a good random byte generator.  See README.random
	for details about random number generators in Catacomb.


	The modes all have similar interfaces.  CFB is probably the best
	example, although CBC is more useful in practice.  I'll take
	Blowfish as my example cipher again.

	You need to initialize a context block.  For Blowfish in CFB
	mode, this is called `blowfish_cfbctx'.  You initialize it by
	passing the context address, a key, the key length, and a
	pointer to an IV (which must be BLOWFISH_BLKSZ bytes long) to
	blowfish_cfbinit.  If you pass a null pointer instead of an IV,
	a zero IV is used.  This is usually OK for CBC, but bad for OFB
	or CFB unless you make sure that the key itself is only used
	once.

	Data is encrypted using blowfish_cfbencrypt and
	blowfish_cfbdecrypt -- both are given: the address of the
	context, a pointer to the source data, a pointer to the
	destination (which may overlap the source) and the size of the
	data to encrypt or decrypt.

	The IV may be changed by calling blowfish_cfbsetiv.  The current
	IV (really meaning the previous ciphertext) can be obtained with
	blowfish_cfbgetiv.  The key may be changed without altering the
	IV using blowfish_cfbsetkey.  A boundary may be inserted in the
	ciphertext or plaintext using blowfish_cfbbdry.

	ECB doesn't use IVs, so there aren't ecbsetiv or ecbgetiv
	calls.  You can't explicitly insert boundaries in ECB or CBC
	mode: they appear only when you encrypt an odd-size input.

	OFB encryption and decryption are the same, so there's no
	separate ofbdecrypt call.  However, ofbencrypt has some useful
	tricks:

	  * If the destination pointer is null, it just churns the output
	    round for a while, without emitting any data.

	  * If the source pointer is null, it simply spits out the
	    output blocks from the feedback process.  This is equivalent
	    to giving an input full of zero bytes.
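
	The two null-pointer conventions can be sketched as follows.
	This is a hypothetical toy, not Catacomb's code: `toyofb_ctx'
	and `toyofb_encrypt' are invented names, and the feedback
	function is a simple congruential step rather than a block
	cipher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical OFB-style context: `o' is the feedback register and `k'
 * stands in for the key.  NOT a real cipher. */
typedef struct toyofb_ctx {
  uint32_t o, k;
} toyofb_ctx;

/* Step the feedback register and return the next output byte. */
static unsigned char toyofb_next(toyofb_ctx *c)
{
  c->o = (c->o ^ c->k) * 1664525u + 1013904223u;
  return (unsigned char)(c->o >> 24);
}

/* Encrypt (or decrypt) `sz' bytes from `src' to `dest'.  Either pointer
 * may be null, with the meanings described in the text. */
void toyofb_encrypt(toyofb_ctx *c, const void *src, void *dest, size_t sz)
{
  const unsigned char *s = src;
  unsigned char *d = dest;
  size_t i;

  assert(c);
  for (i = 0; i < sz; i++) {
    unsigned char o = toyofb_next(c);
    if (!d) continue;           /* null destination: just churn */
    d[i] = s ? s[i] ^ o : o;    /* null source: emit the raw output */
  }
}
```

	Churning four bytes and then emitting four gives the same bytes
	as the second half of an eight-byte run from the same starting
	state, which is what makes the null-destination form useful for
	skipping ahead.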


Implementing new modes: nasty macros

	Block cipher modes are implemented as macros which define the
	appropriate functions.  They're given the prefixes (upper- and
	lowercase) and expected to get on with life.

	Data can be shunted around fairly efficiently using the BLKC
	macros.  These are fairly ugly, so don't try to work out how
	they work.

	In the following notation, `b' denotes a pointer to bytes, and
	`w' and `wx' denote pointers to words.  `PRE' is the uppercase
	cipher prefix.  I'll abuse this notation a little and use the
	names to refer to the entire arrays, since their lengths are
	known to be PRE_BLKSZ (in bytes) or PRE_BLKSZ / 4 (in words)
	long.

	BLKC_STORE(PRE, b, w)		Set b = w
	BLKC_XSTORE(PRE, b, w, wx)	Set b = w XOR wx
	BLKC_LOAD(PRE, w, b)		Set w = b
	BLKC_XLOAD(PRE, w, b)		Set w = w XOR b
	BLKC_MOVE(PRE, w, wx)		Set w = wx
	BLKC_XMOVE(PRE, w, wx)		Set w = w XOR wx
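
	For a cipher of class (N, B, 64), four of these operations
	amount to something like the plain functions below.  This is a
	sketch of the semantics only: the `demo_' names are
	hypothetical, and the real macros take the cipher prefix and are
	built quite differently.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BLKSZ 8            /* hypothetical block size in bytes */
#define DEMO_WORDS (DEMO_BLKSZ / 4)

/* Like BLKC_LOAD: set w = b, reading big-endian 32-bit words. */
void demo_load(uint32_t *w, const unsigned char *b)
{
  int i;
  assert(w && b);
  for (i = 0; i < DEMO_WORDS; i++, b += 4)
    w[i] = (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
           (uint32_t)b[2] << 8 | (uint32_t)b[3];
}

/* Like BLKC_XLOAD: set w = w XOR b. */
void demo_xload(uint32_t *w, const unsigned char *b)
{
  int i;
  assert(w && b);
  for (i = 0; i < DEMO_WORDS; i++, b += 4)
    w[i] ^= (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
            (uint32_t)b[2] << 8 | (uint32_t)b[3];
}

/* Like BLKC_XSTORE: set b = w XOR wx, writing big-endian bytes. */
void demo_xstore(unsigned char *b, const uint32_t *w, const uint32_t *wx)
{
  int i;
  assert(b && w && wx);
  for (i = 0; i < DEMO_WORDS; i++, b += 4) {
    uint32_t x = w[i] ^ wx[i];
    b[0] = (unsigned char)(x >> 24); b[1] = (unsigned char)(x >> 16);
    b[2] = (unsigned char)(x >> 8);  b[3] = (unsigned char)x;
  }
}

/* Like BLKC_STORE: set b = w. */
void demo_store(unsigned char *b, const uint32_t *w)
{
  static const uint32_t zero[DEMO_WORDS] = { 0 };
  demo_xstore(b, w, zero);
}
```

	A CBC encryption step, for instance, is an XLOAD of the
	plaintext into the chaining words, a block encryption, and a
	STORE of the result.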

	These should be enough for most purposes.  More can be added,
	but doing so involves a strong stomach and an ability to do
	things with C macros which most people wouldn't like to think
	about over dinner.


Other ciphers

	There's only one stream cipher implemented at the moment, and
	that's RC4.  It was designed by Ron Rivest.  It's the fastest
	cipher in Catacomb.  It looks fairly strong (although see the
	note about churning the context after keying below).  Note also
	that it works like output feedback mode -- you just XOR the
	output from RC4 with the plaintext.  Never reuse an RC4 key!

	RC4 includes an OFB-like interface which should be familiar.  It
	also includes a pair of strange macros RC4_OPEN and RC4_BYTE.
	These are used to actually get bytes out of the RC4 generator.

	RC4_OPEN is really a new syntactic form.  If `r' is a pointer to
	an RC4 context, then

		RC4_OPEN(r, <statements>);

	executes <statements> within the opened RC4 context.  The
	significance of this is that the expression RC4_BYTE(x) extracts
	the next byte from the innermost open context, and stores it in
	x.  The standard RC4 encrypt function is written in terms of
	RC4_OPEN and RC4_BYTE.
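
	One plausible way to build such a pair of macros is to have the
	`open' form copy the generator state into local variables, which
	the `byte' form then refers to by name.  The sketch below does
	this with the standard RC4 algorithm; the `demo_' and `DEMO_'
	names are hypothetical, and Catacomb's own macros may well be
	written differently.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_rc4 {
  unsigned i, j;                /* the two RC4 indices */
  unsigned char s[256];         /* the permutation */
} demo_rc4;

/* Standard RC4 key schedule. */
void demo_rc4_init(demo_rc4 *r, const void *k, size_t sz)
{
  const unsigned char *p = k;
  unsigned i, j = 0;
  unsigned char t;

  assert(r && p && sz > 0);
  for (i = 0; i < 256; i++) r->s[i] = (unsigned char)i;
  for (i = 0; i < 256; i++) {
    j = (j + r->s[i] + p[i % sz]) & 0xff;
    t = r->s[i]; r->s[i] = r->s[j]; r->s[j] = t;
  }
  r->i = r->j = 0;
}

/* Open a context: cache the indices in locals, run `stmts', and write
 * the indices back.  DEMO_RC4_BYTE uses the locals o_i, o_j and o_s,
 * which is why it only works inside an open context. */
#define DEMO_RC4_OPEN(r, stmts) do {                                    \
  demo_rc4 *o_r = (r);                                                  \
  unsigned o_i = o_r->i, o_j = o_r->j;                                  \
  unsigned char *o_s = o_r->s;                                          \
  stmts                                                                 \
  o_r->i = o_i; o_r->j = o_j;                                           \
} while (0)

/* Extract the next output byte from the innermost open context. */
#define DEMO_RC4_BYTE(x) do {                                           \
  unsigned char o_t;                                                    \
  o_i = (o_i + 1) & 0xff;                                               \
  o_j = (o_j + o_s[o_i]) & 0xff;                                        \
  o_t = o_s[o_i]; o_s[o_i] = o_s[o_j]; o_s[o_j] = o_t;                  \
  (x) = o_s[(o_s[o_i] + o_s[o_j]) & 0xff];                              \
} while (0)

/* Encrypt or decrypt `sz' bytes, written in terms of the two macros. */
void demo_rc4_encrypt(demo_rc4 *r, const void *src, void *dest, size_t sz)
{
  const unsigned char *s = src;
  unsigned char *d = dest;
  size_t n;

  DEMO_RC4_OPEN(r,
    for (n = 0; n < sz; n++) {
      unsigned char b;
      DEMO_RC4_BYTE(b);
      d[n] = s[n] ^ b;
    }
  );
}
```

	The point of the arrangement is that the inner loop works on
	local variables rather than going through the context pointer
	for every byte.  One limitation of this particular sketch: the
	statements are passed as a macro argument, so they must not
	contain commas outside parentheses.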

	RC4 makes an excellent and fast random-byte generator.

	RSA Data Security Inc. claim that RC4 is a trade secret of
	theirs.  It doesn't look very secret to me.


Generic cipher interfaces

	It can be convenient to implement routines where the cipher to
	use is a parameter.  Hence, Catacomb provides a generic
	interface to (symmetric) ciphers.  The generic interface is
	defined in <catacomb/gcipher.h>.

	The basic type in the interface is `gcipher', which represents
	an `instance' of a cipher.  You don't see lone cipher objects,
	only pointers to them, so really everything's in terms of
	`gcipher *'.

	A `gcipher' is a structured type with one member, called `ops',
	which points to a collection of functions and other useful
	information.  If `c' is a pointer to a cipher instance...

	c->ops->b->name			Name of the cipher being used

	c->ops->b->keysz		Key size in bytes (or zero for
					`don't care')

	c->ops->b->blksz		Block size in bytes (or zero for
					`not a block cipher')

	c->ops->encrypt(c, s, t, sz)	Encrypt the sz bytes stored in
					s, and store the ciphertext at
					t.

	c->ops->decrypt(c, s, t, sz)	Like encrypt, only it decrypts.

	c->ops->destroy(c)		Destroys the cipher object `c'.

	c->ops->setiv(c, iv)		Sets the IV to be `iv' -- must
					be blksz bytes long.

	c->ops->bdry(c)			Inserts a boundary.

	Note that `setiv' and `bdry' aren't implemented by all ciphers
	so these may be null pointers.  It's best to check first.

	Generic cipher instances are created from `generic cipher
	classes' (type `gccipher' -- note the extra `c').  A class
	contains two members:

	b		The `class base' -- this is the object pointed
			to by `c->ops->b', and contains `name', `keysz'
			and `blksz' members.

	init		The constructor.  You give it a pointer to some
			key data and the key size, and it returns a
			generic cipher instance.

	Note that new generic ciphers always have zero IVs (if they
	understand the concept), so you may need to call setiv if you
	want to reuse keys.

	Always remember to destroy gcipher instances when you're
	finished with them.

	The generic cipher class for CBC-mode Blowfish is called
	`blowfish_cbc' -- the others are named similarly.  The RC4
	generic cipher class is called simply `rc4'.
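
	The arrangement of instance, operations table and class can be
	sketched in miniature as follows.  Every `demo_' and `xor_' name
	here is hypothetical: these are not the declarations from
	<catacomb/gcipher.h>, only a guess at their shape from the
	description above, and the `cipher' is a repeating-key XOR with
	no security at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct demo_gcipher demo_gcipher;

typedef struct demo_base {      /* the `class base' */
  const char *name;
  size_t keysz, blksz;
} demo_base;

typedef struct demo_ops {
  const demo_base *b;
  void (*encrypt)(demo_gcipher *, const void *, void *, size_t);
  void (*decrypt)(demo_gcipher *, const void *, void *, size_t);
  void (*destroy)(demo_gcipher *);
  void (*setiv)(demo_gcipher *, const void *);  /* may be null */
  void (*bdry)(demo_gcipher *);                 /* may be null */
} demo_ops;

struct demo_gcipher { const demo_ops *ops; };

typedef struct demo_gccipher {  /* the class: base plus constructor */
  const demo_base *b;
  demo_gcipher *(*init)(const void *, size_t);
} demo_gccipher;

/* A toy instance.  The generic part comes first, so a pointer to it can
 * be converted to a pointer to the whole structure. */
typedef struct xor_inst {
  demo_gcipher c;
  unsigned char k[16];
  size_t sz, pos;
} xor_inst;

static void xor_crypt(demo_gcipher *c, const void *s, void *t, size_t sz)
{
  xor_inst *x = (xor_inst *)c;
  const unsigned char *p = s;
  unsigned char *q = t;
  size_t i;

  for (i = 0; i < sz; i++) {
    q[i] = p[i] ^ x->k[x->pos];
    x->pos = (x->pos + 1) % x->sz;
  }
}

static void xor_destroy(demo_gcipher *c) { free(c); }

static const demo_base xor_base = { "demo-xor", 0, 0 };
static const demo_ops xor_ops =
  { &xor_base, xor_crypt, xor_crypt, xor_destroy, 0, 0 };

static demo_gcipher *xor_init(const void *k, size_t sz)
{
  xor_inst *x = malloc(sizeof(*x));
  const unsigned char *p = k;
  size_t i;

  assert(x && sz > 0 && sz <= sizeof(x->k));
  x->c.ops = &xor_ops;
  for (i = 0; i < sz; i++) x->k[i] = p[i];
  x->sz = sz; x->pos = 0;
  return &x->c;
}

const demo_gccipher demo_xor = { &xor_base, xor_init };

/* A routine which takes the cipher class as a parameter.  It checks
 * `setiv' before calling it, and destroys the instance afterwards. */
void demo_encrypt_with(const demo_gccipher *cc, const void *k, size_t ksz,
                       const void *iv, const void *s, void *t, size_t sz)
{
  demo_gcipher *c = cc->init(k, ksz);

  if (iv && c->ops->setiv) c->ops->setiv(c, iv);
  c->ops->encrypt(c, s, t, sz);
  c->ops->destroy(c);
}
```

	demo_encrypt_with never mentions the toy cipher by name, so any
	other class with the same layout could be passed in its place.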


--
[mdw]

\f
Local variables:
mode: text
End: