X-Git-Url: https://git.distorted.org.uk/u/mdw/catacomb/blobdiff_plain/674cd11ec63b56561980249cb19a0db54bacfa86..759513d1c2f79d94abe726682f43a28f363229cc:/README.cipher

diff --git a/README.cipher b/README.cipher
new file mode 100644
index 0000000..f9d08b9
--- /dev/null
+++ b/README.cipher
@@ -0,0 +1,320 @@
+Symmetric ciphers
+
+
+	Catacomb provides a number of symmetric ciphers, together with a
+	generic cipher interface.  More ciphers may be added later.
+
+
+Block cipher interface
+
+	There are a number of block ciphers implemented, all with
+	extremely similar interfaces.  However, block ciphers aren't
+	actually at all pleasant to use directly.  They're really
+	intended to be used only by higher-level `modes'.  
+
+	Anyway, I'll take Bruce Schneier's Blowfish as an example.
+
+	Before doing any encryption or decryption, you need to
+	initialize a `context'.  The context data is stored in an object
+	of type `blowfish_ctx'.  You initialize the context by calling
+	`blowfish_init' with the address of the context, the address of
+	some key data, and the length of the key.
+
+	Data is encrypted using `blowfish_eblk' and decrypted by
+	`blowfish_dblk'.  Both functions are given data as an array of
+	`uint32' words.  (Since Blowfish uses 64-bit blocks, you give it
+	arrays of two words.)
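+
+	To make the calling pattern concrete, here is a minimal
+	self-contained sketch of a cipher with the same shape of
+	interface.  It is not Catacomb code: the names `toy_ctx',
+	`toy_init', `toy_eblk' and `toy_dblk' are hypothetical, it uses
+	`uint32_t' from <stdint.h> in place of `uint32', and the cipher
+	is a throwaway Feistel network with no security value at all.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BLKSZ 8                     /* 64-bit blocks, as for Blowfish */
#define TOY_ROUNDS 8

typedef struct toy_ctx { uint32_t k[TOY_ROUNDS]; } toy_ctx;

/* Hypothetical key setup: fold the key bytes into the round keys. */
void toy_init(toy_ctx *c, const void *key, size_t sz)
{
  const unsigned char *p = key;
  size_t i;

  assert(sz > 0);
  for (i = 0; i < TOY_ROUNDS; i++)
    c->k[i] = 0x9e3779b9u * (uint32_t)(i + 1);
  for (i = 0; i < sz; i++) {
    uint32_t *w = &c->k[i % TOY_ROUNDS];
    *w = ((*w << 8) | (*w >> 24)) ^ p[i];
  }
}

/* Arbitrary round function; a Feistel network never needs its inverse. */
static uint32_t toy_f(uint32_t x, uint32_t k)
{
  x += k;
  return ((x << 5) | (x >> 27)) ^ (x * 0x01000193u);
}

/* Encrypt one block: s and d are arrays of two 32-bit words. */
void toy_eblk(const toy_ctx *c, const uint32_t *s, uint32_t *d)
{
  uint32_t l = s[0], r = s[1], t;
  int i;

  for (i = 0; i < TOY_ROUNDS; i++) {
    t = l ^ toy_f(r, c->k[i]); l = r; r = t;
  }
  d[0] = r; d[1] = l;                   /* final swap */
}

/* Decrypt one block: the same network with the round keys reversed. */
void toy_dblk(const toy_ctx *c, const uint32_t *s, uint32_t *d)
{
  uint32_t l = s[0], r = s[1], t;
  int i;

  for (i = TOY_ROUNDS - 1; i >= 0; i--) {
    t = l ^ toy_f(r, c->k[i]); l = r; r = t;
  }
  d[0] = r; d[1] = l;
}
```
+
+	Typical use is to call `toy_init' once with the key, and then
+	pass two-word arrays to `toy_eblk' and `toy_dblk' as often as
+	needed.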
+
+	A number of constants are defined to describe further properties
+	of the cipher:
+
+	BLOWFISH_KEYSZ	Is zero, to indicate that Blowfish doesn't care
+			much about the size of key you give it.
+
+	BLOWFISH_BLKSZ	Is 8, because Blowfish works on 64-bit blocks,
+			which are therefore 8 bytes wide.
+
+	BLOWFISH_CLASS	Is the triple (N, B, 64).  This is explained
+			below.
+
+	The BLOWFISH_CLASS macro contains information useful to other
+	macros, rather than to direct users of the interface.  The three
+	components are:
+
+	The `type'	Simply `N' if specific macros for handling blocks
+			of the appropriate width have been written, or
+			`X' if the macros should use a loop instead.
+
+	The `endianness'
+			Either `B' for big-endian or `L' for
+			little-endian.
+
+	The `width'	The cipher's block size in bits.
+
+	This simple interface is thoroughly inconvenient for general
+	use, although it makes writing block ciphers very easy.
+
+
+	The peculiarities of the various ciphers are described below.
+
+	Blowfish	Fairly standard, really.  Accepts arbitrary-
+			sized keys up to 448 bits.  (The original
+			definition only specified keys with a multiple
+			of 32 bits -- the extension I use is due, I
+			think, to Eric Young.)  Blowfish is fast and
+			looks very secure.
+
+	IDEA		Requires a 128-bit key.  Not very fast.  No
+			known attacks on the full cipher.  Used in
+			PGP2.  Patented!
+
+	DES		Good old reliable.  Been around for donkey's
+			years and still going.  Single-DES (implemented
+			here) has a small key, but apart from that is
+			looking remarkably robust.  Uses a 56-bit key
+			which may be either 8 bytes with (ignored)
+			parity bits in the bottom bit of each byte, or 7
+			bytes with no parity.
+
+	DES3		Two- or three-key triple DES.  Slow, but strong
+			and almost universally trusted.  Accepts 56-,
+			112- and 168-bit keys.  (56 bits gives you
+			single DES at a third of the speed.)  Again,
+			parity may be included or not, so the full range
+			of key sizes in bytes is: 7, 8, 14, 16, 21 or
+			24.
+
+	RC5		Arbitrary-sized key.  Designed by Ron Rivest.
+			Not completely convincing in security.  About as
+			fast as Blowfish, but with a quicker key
+			schedule.  Patented, I think.
+
+
+Block cipher modes
+
+	There are four block cipher modes defined, all of which create a
+	useful cipher from a block cipher.  Modes are implemented
+	separately from ciphers, so it's easy to add either, and easy to
+	apply modes to ciphers.
+
+	A few definitions will be helpful to explain the modes.  Let E
+	denote the encryption function, P be the current plaintext
+	block, C be the current ciphertext block, and C' be the previous
+	ciphertext block.  Let `XOR' denote the bitwise exclusive-or
+	operation.
+
+	Then the modes Electronic Code Book (ECB), Cipher Block Chaining
+	(CBC) and Cipher Feedback (CFB) are defined as:
+
+	ECB		C = E(P)
+	CBC		C = E(P XOR C')
+	CFB		C = P XOR E(C')
+
+	Finally, Output Feedback is defined like this: let O be the
+	current output, and O' be the previous output.  Then
+
+	OFB		O = E(O'), C = P XOR O
+
+	The `previous ciphertext' or `previous output' for the first
+	block is provided by an `initialization vector' or IV.
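+
+	The definitions above can be sketched in C roughly as follows.
+	This is an illustration only, not Catacomb's implementation: it
+	handles whole blocks only, `E' and `D' are a made-up and
+	thoroughly insecure toy block cipher, and all of the function
+	names are hypothetical.  In each function `iv' holds the
+	previous ciphertext (or previous output) and is updated in
+	place.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLKSZ 8
typedef unsigned char octet;

/* Toy stand-ins for E and its inverse D: insecure, purely illustrative. */
static void E(const octet *k, const octet *in, octet *out)
{
  int i;
  for (i = 0; i < BLKSZ; i++) {
    octet x = in[i] ^ k[i];
    out[i] = (octet)(((x << 3) | (x >> 5)) + k[(i + 1) % BLKSZ]);
  }
}

static void D(const octet *k, const octet *in, octet *out)
{
  int i;
  for (i = 0; i < BLKSZ; i++) {
    octet x = (octet)(in[i] - k[(i + 1) % BLKSZ]);
    out[i] = (octet)(((x >> 3) | (x << 5)) ^ k[i]);
  }
}

static void xor_blk(octet *d, const octet *a, const octet *b)
{
  int i;
  for (i = 0; i < BLKSZ; i++) d[i] = a[i] ^ b[i];
}

/* CBC: C = E(P XOR C') */
void cbc_encrypt(const octet *k, octet *iv, const octet *p, octet *c, size_t sz)
{
  assert(sz % BLKSZ == 0);
  for (; sz; sz -= BLKSZ, p += BLKSZ, c += BLKSZ) {
    xor_blk(iv, iv, p); E(k, iv, iv); memcpy(c, iv, BLKSZ);
  }
}

/* CBC decryption: P = D(C) XOR C' */
void cbc_decrypt(const octet *k, octet *iv, const octet *c, octet *p, size_t sz)
{
  octet t[BLKSZ], next[BLKSZ];
  assert(sz % BLKSZ == 0);
  for (; sz; sz -= BLKSZ, p += BLKSZ, c += BLKSZ) {
    memcpy(next, c, BLKSZ);             /* keep C in case p == c */
    D(k, c, t); xor_blk(p, t, iv); memcpy(iv, next, BLKSZ);
  }
}

/* CFB: C = P XOR E(C') */
void cfb_encrypt(const octet *k, octet *iv, const octet *p, octet *c, size_t sz)
{
  assert(sz % BLKSZ == 0);
  for (; sz; sz -= BLKSZ, p += BLKSZ, c += BLKSZ) {
    E(k, iv, iv); xor_blk(iv, iv, p); memcpy(c, iv, BLKSZ);
  }
}

/* CFB decryption: P = C XOR E(C') */
void cfb_decrypt(const octet *k, octet *iv, const octet *c, octet *p, size_t sz)
{
  octet t[BLKSZ];
  assert(sz % BLKSZ == 0);
  for (; sz; sz -= BLKSZ, p += BLKSZ, c += BLKSZ) {
    E(k, iv, t); memcpy(iv, c, BLKSZ); xor_blk(p, c, t);
  }
}

/* OFB: O = E(O'), C = P XOR O.  Decryption is the same operation. */
void ofb_crypt(const octet *k, octet *iv, const octet *s, octet *d, size_t sz)
{
  assert(sz % BLKSZ == 0);
  for (; sz; sz -= BLKSZ, s += BLKSZ, d += BLKSZ) {
    E(k, iv, iv); xor_blk(d, s, iv);
  }
}
```
+
+	Notice that only CBC decryption ever calls `D': CFB and OFB use
+	the block cipher in the encryption direction for both
+	operations.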
+
+	The above definitions imply that only data which comes in
+	multiples of the block size can be encrypted.  Normally this is
+	the case.  However, Catacomb implements all four modes so that
+	almost arbitrary sizes of plaintext can be encrypted (without
+	having to pad out the ciphertext).  The details are complicated:
+	read the source, or look up `ciphertext stealing' in your copy
+	of Schneier's `Applied Cryptography'.
+
+	ECB must have *at least* one entire block to work with, but
+	apart from that can cope with odd-size inputs.  Both ECB and CBC
+	insert `boundaries' when you encrypt an odd-size input -- you
+	must decrypt in exactly the same-size chunks as you encrypted,
+	otherwise you'll only get rubbish out.
+
+	CFB and OFB have no restrictions on input sizes, and do not
+	normally insert boundaries, although it's possible to explicitly
+	request one.
+
+	Be especially careful with OFB mode.  Since it generates an
+	output stream independent of the plaintext, and then XORs the
+	two, if you ever reuse the same key and IV pair, both encrypted
+	messages are compromised.  (Take the two ciphertexts, and XOR
+	them together -- then the OFB stream cancels and you have the
+	plaintexts XORed.  This is fairly trivial to unravel.)
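+
+	A small sketch of why this matters, using hypothetical helper
+	names and no Catacomb code: if both messages were XORed with
+	the same output stream, anyone who knows (or guesses) one
+	plaintext can recover the other from the two ciphertexts alone,
+	without ever seeing the key.
+
```c
#include <assert.h>
#include <stddef.h>

/* d = a XOR b, byte by byte; d may be the same buffer as a or b. */
void xor_buf(unsigned char *d, const unsigned char *a,
             const unsigned char *b, size_t sz)
{
  size_t i;
  assert(d && a && b);
  for (i = 0; i < sz; i++) d[i] = a[i] ^ b[i];
}

/* Given C1 = P1 XOR O and C2 = P2 XOR O for the same stream O,
 * the stream cancels: P2 = C1 XOR C2 XOR P1. */
void recover_p2(unsigned char *p2, const unsigned char *c1,
                const unsigned char *c2, const unsigned char *p1, size_t sz)
{
  xor_buf(p2, c1, c2, sz);              /* P1 XOR P2 */
  xor_buf(p2, p2, p1, sz);              /* P2 */
}
```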
+
+	OFB mode makes a good random byte generator.  See README.random
+	for details about random number generators in Catacomb.
+
+
+	The modes all have similar interfaces.  CFB is probably the best
+	example, although CBC is more useful in practice.  I'll take
+	Blowfish as my example cipher again.
+
+	You need to initialize a context block.  For Blowfish in CFB
+	mode, this is called `blowfish_cfbctx'.  You initialize it by
+	passing the context address, a key, the key length, and a
+	pointer to an IV (which must be BLOWFISH_BLKSZ bytes long) to
+	blowfish_cfbinit.  If you pass a null pointer instead of an IV,
+	a zero IV is used.  This is usually OK for CBC, but bad for OFB
+	or CFB unless you make sure that the key itself is only used
+	once.
+
+	Data is encrypted using blowfish_cfbencrypt and decrypted using
+	blowfish_cfbdecrypt -- both are given: the address of the
+	context, a pointer to the source data, a pointer to the
+	destination (which may overlap the source) and the size of the
+	data to encrypt or decrypt.
+
+	The IV may be changed by calling blowfish_cfbsetiv.  The current
+	IV (really meaning the previous ciphertext) can be obtained with
+	blowfish_cfbgetiv.  The key may be changed without altering the
+	IV using blowfish_cfbsetkey.  A boundary may be inserted in the
+	ciphertext or plaintext using blowfish_cfbbdry.
+
+	ECB doesn't use IVs, so there aren't ecbsetiv or ecbgetiv
+	calls.  You can't insert boundaries in ECB or CBC mode.
+
+	OFB encryption and decryption are the same, so there's no
+	separate ofbdecrypt call.  However, ofbencrypt has some useful
+	tricks:
+
+	  * If the destination pointer is null, it just churns the output
+	    round for a while, without emitting any data.
+
+	  * If the source pointer is null, it simply spits out the
+	    output blocks from the feedback process.  This is equivalent
+	    to giving an input full of zero bytes.
+
+
+Implementing new modes: nasty macros
+
+	Block cipher modes are implemented as macros which define the
+	appropriate functions.  They're given the prefixes (upper- and
+	lowercase) and expected to get on with life.
+
+	Data can be shunted around fairly efficiently using the BLKC
+	macros.  These are fairly ugly, so don't try to work out how
+	they work.
+
+	In the following notation, `b' denotes a pointer to bytes, and
+	`w' and `wx' denote pointers to words.  `PRE' is the uppercase
+	cipher prefix.  I'll abuse this notation a little and use the
+	names to refer to the entire arrays, since their lengths are
+	known to be PRE_BLKSZ (in bytes) or PRE_BLKSZ / 4 (in words)
+	long.
+
+	BLKC_STORE(PRE, b, w)		Set b = w
+	BLKC_XSTORE(PRE, b, w, wx)	Set b = w XOR wx
+	BLKC_LOAD(PRE, w, b)		Set w = b
+	BLKC_XLOAD(PRE, w, b)		Set w = w XOR b
+	BLKC_MOVE(PRE, w, wx)		Set w = wx
+	BLKC_XMOVE(PRE, w, wx)		Set w = w XOR wx
+
+	These should be enough for most purposes.  More can be added,
+	but doing so requires a strong stomach and an ability to do
+	things with C macros which most people wouldn't like to think
+	about over dinner.
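+
+	For readers who would rather not look at the macros, the effect
+	of four of them can be sketched as plain functions.  This
+	assumes a big-endian cipher with 64-bit blocks; the function
+	names are hypothetical, `uint32_t' stands in for `uint32', and
+	the real macros are generated quite differently.
+
```c
#include <assert.h>
#include <stdint.h>

#define BLKSZ 8                         /* block size in bytes */
#define NWORDS (BLKSZ / 4)              /* block size in words */

/* Like BLKC_LOAD: set w = b, reading bytes in big-endian order. */
void blk_load(uint32_t *w, const unsigned char *b)
{
  int i;
  assert(w && b);
  for (i = 0; i < NWORDS; i++, b += 4)
    w[i] = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Like BLKC_STORE: set b = w, writing bytes in big-endian order. */
void blk_store(unsigned char *b, const uint32_t *w)
{
  int i;
  assert(w && b);
  for (i = 0; i < NWORDS; i++, b += 4) {
    b[0] = (unsigned char)((w[i] >> 24) & 0xff);
    b[1] = (unsigned char)((w[i] >> 16) & 0xff);
    b[2] = (unsigned char)((w[i] >> 8) & 0xff);
    b[3] = (unsigned char)(w[i] & 0xff);
  }
}

/* Like BLKC_XLOAD: set w = w XOR b. */
void blk_xload(uint32_t *w, const unsigned char *b)
{
  uint32_t t[NWORDS];
  int i;
  blk_load(t, b);
  for (i = 0; i < NWORDS; i++) w[i] ^= t[i];
}

/* Like BLKC_XSTORE: set b = w XOR wx. */
void blk_xstore(unsigned char *b, const uint32_t *w, const uint32_t *wx)
{
  uint32_t t[NWORDS];
  int i;
  for (i = 0; i < NWORDS; i++) t[i] = w[i] ^ wx[i];
  blk_store(b, t);
}
```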
+
+
+Other ciphers
+
+	There's only one stream cipher implemented at the moment, and
+	that's RC4.  It was designed by Ron Rivest.  It's the fastest
+	cipher in Catacomb.  It looks fairly strong (although see the
+	note about churning the context after keying below).  Also note
+	that it works like output feedback mode -- you just XOR the
+	output from RC4 with the plaintext.  Never reuse an RC4 key!
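+
+	For reference, the algorithm as it is commonly described can be
+	sketched as below.  This is not Catacomb's implementation, and
+	the names `arc4_ctx', `arc4_setup', `arc4_byte' and
+	`arc4_crypt' are hypothetical.  Passing a null source pointer
+	yields the raw output stream, on the assumption that this
+	mirrors the OFB-style interface.
+
```c
#include <assert.h>
#include <stddef.h>

typedef struct arc4_ctx {
  unsigned char s[256];                 /* the permutation */
  unsigned i, j;                        /* the two indices */
} arc4_ctx;

/* Key setup: start from the identity permutation and mix in the key. */
void arc4_setup(arc4_ctx *r, const void *key, size_t sz)
{
  const unsigned char *k = key;
  unsigned i, j = 0;
  unsigned char t;

  assert(sz > 0);
  for (i = 0; i < 256; i++) r->s[i] = (unsigned char)i;
  for (i = 0; i < 256; i++) {
    j = (j + r->s[i] + k[i % sz]) & 0xff;
    t = r->s[i]; r->s[i] = r->s[j]; r->s[j] = t;
  }
  r->i = r->j = 0;
}

/* Produce the next output byte. */
unsigned char arc4_byte(arc4_ctx *r)
{
  unsigned char t;

  r->i = (r->i + 1) & 0xff;
  r->j = (r->j + r->s[r->i]) & 0xff;
  t = r->s[r->i]; r->s[r->i] = r->s[r->j]; r->s[r->j] = t;
  return r->s[(r->s[r->i] + r->s[r->j]) & 0xff];
}

/* Encrypt or decrypt: XOR the source with the output stream.
 * If s is null, just emit the output stream itself. */
void arc4_crypt(arc4_ctx *r, const void *s, void *d, size_t sz)
{
  const unsigned char *p = s;
  unsigned char *q = d;
  size_t n;

  assert(d);
  for (n = 0; n < sz; n++)
    q[n] = (unsigned char)((p ? p[n] : 0) ^ arc4_byte(r));
}
```
+
+	Because encryption and decryption are the same XOR, setting up
+	a fresh context with the same key and running the ciphertext
+	through `arc4_crypt' again gives back the plaintext.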
+
+	RC4 includes an OFB-like interface which should be familiar.  It
+	also includes a pair of strange macros RC4_OPEN and RC4_BYTE.
+	These are used to actually get bytes out of the RC4 generator.
+
+	RC4_OPEN is really a new syntactic form.  If `r' is a pointer to
+	an RC4 context, then
+
+		RC4_OPEN(r, <statements>);
+
+	executes <statements> within the opened RC4 context.  The
+	significance of this is that the expression RC4_BYTE(x) extracts
+	the next byte from the innermost open context, and stores it in
+	x.  The standard RC4 encrypt function is written in terms of
+	RC4_OPEN and RC4_BYTE.
+
+	RC4 makes an excellent and fast random-byte generator.
+
+	RSA Data Security Inc. claim that RC4 is a trade secret of
+	theirs.  It doesn't look very secret to me.
+
+
+Generic cipher interfaces
+
+	It can be convenient to implement routines where the cipher to
+	use is a parameter.  Hence, Catacomb provides a generic
+	interface to (symmetric) ciphers.  The generic interface is
+	defined in <catacomb/gcipher.h>.
+
+	The basic type in the interface is `gcipher', which represents
+	an `instance' of a cipher.  You don't see lone cipher objects,
+	only pointers to them, so really everything's in terms of
+	`gcipher *'.
+
+	A `gcipher' is a structured type with one member, called `ops',
+	which points to a collection of functions and other useful
+	information.  If `c' is a cipher...
+
+	c->ops->b->name			Name of the cipher being used
+
+	c->ops->b->keysz		Key size in bytes (or zero for
+					`don't care')
+
+	c->ops->b->blksz		Block size in bytes (or zero for
+					`not a block cipher')
+
+	c->ops->encrypt(c, s, t, sz)	Encrypt the sz bytes stored in
+					s, and store the ciphertext at
+					t.
+
+	c->ops->decrypt(c, s, t, sz)	Like encrypt, only it decrypts.
+
+	c->ops->destroy(c)		Destroys the cipher object `c'.
+
+	c->ops->setiv(c, iv)		Sets the IV to be `iv' -- must
+					be blksz bytes long.
+
+	c->ops->bdry(c)			Inserts a boundary.
+
+	Note that `setiv' and `bdry' aren't implemented by all ciphers
+	so these may be null pointers.  It's best to check first.
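+
+	The shape of this interface, and the null check, can be
+	sketched with stand-in types.  Everything here is hypothetical:
+	the `x...' names merely mirror the members described above so
+	that the example compiles on its own, the exact signatures are
+	guesses, and the `cipher' is a single-byte XOR which protects
+	nothing.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct xcipher xcipher;

typedef struct xbase {                  /* stands in for the class base */
  const char *name;
  size_t keysz, blksz;
} xbase;

typedef struct xops {                   /* stands in for `ops' */
  const xbase *b;
  void (*encrypt)(xcipher *c, const void *s, void *t, size_t sz);
  void (*decrypt)(xcipher *c, const void *s, void *t, size_t sz);
  void (*destroy)(xcipher *c);
  void (*setiv)(xcipher *c, const void *iv);    /* may be null */
  void (*bdry)(xcipher *c);                     /* may be null */
} xops;

struct xcipher { const xops *ops; };

/* A trivial instance: XOR every byte with one key byte. */
typedef struct xorctx { xcipher c; unsigned char k; } xorctx;

static void xor_crypt(xcipher *c, const void *s, void *t, size_t sz)
{
  xorctx *x = (xorctx *)c;              /* c is the first member of xorctx */
  const unsigned char *p = s;
  unsigned char *q = t;
  while (sz--) *q++ = *p++ ^ x->k;
}

static void xor_destroy(xcipher *c) { free(c); }

static const xbase xor_base = { "toy-xor", 1, 0 };
static const xops xor_ops =
  { &xor_base, xor_crypt, xor_crypt, xor_destroy, 0, 0 };

/* Constructor: returns a new instance, or null if allocation fails. */
xcipher *xor_init(const void *k, size_t sz)
{
  xorctx *x;
  assert(sz == 1);
  x = malloc(sizeof(*x));
  if (!x) return 0;
  x->c.ops = &xor_ops;
  x->k = *(const unsigned char *)k;
  return &x->c;
}

/* Set the IV only if the cipher implements setiv; return 1 if it did. */
int try_setiv(xcipher *c, const void *iv)
{
  if (!c->ops->setiv) return 0;
  c->ops->setiv(c, iv);
  return 1;
}
```
+
+	Code written against the `ops' table never needs to know which
+	cipher it was handed; it only needs to check the optional
+	entries before calling them.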
+
+	Generic cipher instances are created from `generic cipher
+	classes' (type `gccipher' -- note the extra `c').  A class
+	contains two members:
+
+	b		The `class base' -- this is the object pointed
+			to by `c->ops->b', and contains `name', `keysz'
+			and `blksz' members.
+
+	init		The constructor.  You give it a pointer to some
+			key data and the key size, and it returns a
+			generic cipher instance.
+
+	Note that new generic ciphers always have zero IVs (if they
+	understand the concept), so you may need to call setiv if you
+	want to reuse keys.
+
+	Always remember to destroy gcipher instances when you're
+	finished with them.
+
+	The generic cipher class for CBC-mode Blowfish is called
+	`blowfish_cbc' -- the others are named similarly.  The RC4
+	generic cipher class is called simply `rc4'.
+
+
+--
+[mdw]
+
+
+Local variables:
+mode: text
+End: