--- /dev/null
+Symmetric ciphers
+
+
+ Catacomb provides a number of symmetric ciphers, together with a
+ generic cipher interface. More ciphers may be added later.
+
+
+Block cipher interface
+
+ There are a number of block ciphers implemented, all with
+ extremely similar interfaces. However, block ciphers aren't
+ actually at all pleasant to use directly. They're really
+ intended to be used only by higher-level `modes'.
+
+ Anyway, I'll take Bruce Schneier's Blowfish as an example.
+
+ Before doing any encryption or decryption, you need to
+ initialize a `context'. The context data is stored in an object
+ of type `blowfish_ctx'. You initialize the context by calling
+ `blowfish_init' with the address of the context, the address of
+ some key data, and the length of the key.
+
+ Data is encrypted using `blowfish_eblk' and decrypted by
+ `blowfish_dblk'. Both functions are given data as an array of
+ `uint32' words. (Since Blowfish uses 64-bit blocks, you give it
+ arrays of two words.)
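+
+ A minimal sketch of that calling sequence, in C. It doesn't use
+ Catacomb itself: `toy_ctx', `toy_init', `toy_eblk' and `toy_dblk'
+ are hypothetical stand-ins for the Blowfish names, the cipher is
+ a throwaway four-round Feistel network with no security value,
+ and the (context, source, destination) argument order is an
+ assumption rather than something stated above.
+
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy cipher shaped like the interface described above.
 * It is a 4-round Feistel network and offers no security at all. */
typedef struct toy_ctx { uint32_t k[4]; } toy_ctx;

static uint32_t toy_f(uint32_t x, uint32_t k)
{
    x += k;
    x ^= (x << 7) | (x >> 25);          /* mix in a rotated copy */
    return x * 0x9e3779b1u;
}

/* Fold `sz' bytes of key material into the four round keys. */
void toy_init(toy_ctx *ctx, const void *key, size_t sz)
{
    const unsigned char *p = key;
    size_t i;

    for (i = 0; i < 4; i++) ctx->k[i] = 0x243f6a88u + (uint32_t)i;
    for (i = 0; i < sz; i++) {
        uint32_t *k = &ctx->k[i % 4];
        *k = ((*k << 8) | (*k >> 24)) ^ p[i];
    }
}

/* Encrypt one 64-bit block: two words in `s', two words out in `d'. */
void toy_eblk(const toy_ctx *ctx, const uint32_t *s, uint32_t *d)
{
    uint32_t l = s[0], r = s[1], t;
    int i;

    for (i = 0; i < 4; i++) { t = l ^ toy_f(r, ctx->k[i]); l = r; r = t; }
    d[0] = l; d[1] = r;
}

/* Decrypt one block by running the rounds backwards. */
void toy_dblk(const toy_ctx *ctx, const uint32_t *s, uint32_t *d)
{
    uint32_t l = s[0], r = s[1], t;
    int i;

    for (i = 3; i >= 0; i--) { t = r ^ toy_f(l, ctx->k[i]); r = l; l = t; }
    d[0] = l; d[1] = r;
}
```
+
+ Initialize one context with `toy_init', then pass a two-word
+ array through `toy_eblk' and the result through `toy_dblk': the
+ original words come back.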
+
+ A number of constants are defined to describe further properties
+ of the cipher:
+
+   BLOWFISH_KEYSZ  Is zero, to indicate that Blowfish doesn't care
+                   much about the size of key you give it.
+
+   BLOWFISH_BLKSZ  Is 8, because Blowfish works on 64-bit blocks,
+                   which are therefore 8 bytes wide.
+
+   BLOWFISH_CLASS  Is the triple (N, B, 64). This is explained
+                   below.
+
+ The BLOWFISH_CLASS macro contains information useful to other
+ macros, rather than to direct users of the interface. The three
+ components are:
+
+   The `type'        Simply `N' if specific macros for handling
+                     blocks of the appropriate width have been
+                     written, or `X' if the macros should use a
+                     loop instead.
+
+   The `endianness'  Either `B' for big-endian, or `L' for
+                     little-endian.
+
+   The `width'       The cipher's block size in bits.
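+
+ As a rough illustration of how other macros might consume such
+ a triple, here is a sketch in C. Everything named `TOY_' or
+ `sk_'/`SK_' is hypothetical, and the dispatch technique (letting
+ the preprocessor unpack the parenthesized triple and paste the
+ endianness and width into a function name) is an assumption
+ about one workable approach, not a description of Catacomb's own
+ macros. The `type' component is accepted but ignored here.
+
```c
#include <stdint.h>

/* Hypothetical cipher description, mirroring the BLOWFISH_* constants. */
#define TOY_KEYSZ 0                     /* any key size is acceptable */
#define TOY_BLKSZ 8                     /* block size in bytes */
#define TOY_CLASS (N, B, 64)            /* type, endianness, width */

/* Read one 32-bit word from four bytes, in either byte order. */
static uint32_t sk_load32_B(const unsigned char *b)
{
    return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
           (uint32_t)b[2] << 8 | (uint32_t)b[3];
}

static uint32_t sk_load32_L(const unsigned char *b)
{
    return (uint32_t)b[3] << 24 | (uint32_t)b[2] << 16 |
           (uint32_t)b[1] << 8 | (uint32_t)b[0];
}

/* One loader per (endianness, width) pair this sketch knows about. */
void sk_load_B64(uint32_t *w, const unsigned char *b)
{
    w[0] = sk_load32_B(b); w[1] = sk_load32_B(b + 4);
}

void sk_load_L64(uint32_t *w, const unsigned char *b)
{
    w[0] = sk_load32_L(b); w[1] = sk_load32_L(b + 4);
}

/* Unpack a class triple.  The `cls' argument is expanded to
 * `(N, B, 64)' before the result is rescanned, so `f' ends up
 * invoked with three separate arguments. */
#define SK_APPLY(f, cls) f cls
#define SK_LOADFN(type, end, bits) sk_load_##end##bits

/* Load a block for the cipher whose uppercase prefix is PRE. */
#define SK_LOAD(PRE, w, b) SK_APPLY(SK_LOADFN, PRE##_CLASS)(w, b)
```
+
+ With these definitions, `SK_LOAD(TOY, w, b)' expands to
+ `sk_load_B64(w, b)'; a cipher whose class said `L' would pick up
+ the little-endian loader without any change to the caller.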
+
+ This simple interface is thoroughly inconvenient for general
+ use, although it makes writing block ciphers very easy.
+
+
+ The peculiarities of the various ciphers are described below.
+
+   Blowfish  Fairly standard, really. Accepts arbitrary-sized
+             keys up to 448 bits. (The original definition only
+             specified keys with a multiple of 32 bits -- the
+             extension I use is due, I think, to Eric Young.)
+             Blowfish is fast and looks very secure.
+
+   IDEA      Requires a 128-bit key. Not very fast. No known
+             attacks on the full cipher. Used in PGP2.
+             Patented!
+
+   DES       Good old reliable. Been around for donkey's years
+             and still going. Single-DES (implemented here) has
+             a small key, but apart from that is looking
+             remarkably robust. Uses a 56-bit key which may be
+             either 8 bytes with (ignored) parity bits in the
+             bottom bit of each byte, or 7 bytes with no parity.
+
+   DES3      Two- or three-key triple DES. Slow, but strong and
+             almost universally trusted. Accepts 56-, 112- and
+             168-bit keys. (56 bits gives you single DES at a
+             third of the speed.) Again, parity may be included
+             or not, so the full range of key sizes in bytes is:
+             7, 8, 14, 16, 21 or 24.
+
+   RC5       Arbitrary-sized key. Designed by Ron Rivest. Not
+             completely convincing in security. About as fast
+             as Blowfish, but with a quicker key schedule.
+             Patented, I think.
+
+
+Block cipher modes
+
+ There are four block cipher modes defined, all of which create a
+ useful cipher from a block cipher. Modes are implemented
+ separately from ciphers, so it's easy to add either, and easy to
+ apply modes to ciphers.
+
+ A few definitions will be helpful to explain the modes. Let E
+ denote the encryption function, P be the current plaintext
+ block, C be the current ciphertext, and C' be the previous
+ ciphertext block. Let `XOR' denote the bitwise exclusive or
+ operation.
+
+ Then the modes Electronic Code Book (ECB), Cipher Block Chaining
+ (CBC) and Ciphertext Feedback (CFB) are defined as:
+
+   ECB     C = E(P)
+   CBC     C = E(P XOR C')
+   CFB     C = P XOR E(C')
+
+ Finally, Output Feedback is defined like this: let O be the
+ current output, and O' be the previous output. Then
+
+   OFB     O = E(O'), C = P XOR O
+
+ The `previous ciphertext' or `previous output' for the first
+ block is provided by an `initialization vector' or IV.
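+
+ The four definitions translate almost directly into C. The
+ sketch below handles whole blocks only and represents each
+ 64-bit block as a `uint64_t'. The names are hypothetical, and
+ `toy_e' is a throwaway Feistel permutation standing in for E,
+ with no security value.
+
```c
#include <stddef.h>
#include <stdint.h>

static uint32_t toy_f(uint32_t x, uint32_t k)
{
    x += k;
    x ^= (x << 7) | (x >> 25);
    return x * 0x9e3779b1u;
}

/* Toy 64-bit block cipher (4-round Feistel); a stand-in for E. */
uint64_t toy_e(uint64_t key, uint64_t x)
{
    uint32_t l = (uint32_t)(x >> 32), r = (uint32_t)x, t;
    int i;

    for (i = 0; i < 4; i++) {
        t = l ^ toy_f(r, (uint32_t)(key >> (16 * i)));
        l = r; r = t;
    }
    return (uint64_t)l << 32 | r;
}

/* Each routine handles `n' whole blocks; `c' may be the same array
 * as `p'. */
void ecb_encrypt(uint64_t key, const uint64_t *p, uint64_t *c, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        c[i] = toy_e(key, p[i]);                /* C = E(P) */
}

void cbc_encrypt(uint64_t key, uint64_t iv,
                 const uint64_t *p, uint64_t *c, size_t n)
{
    uint64_t prev = iv;                         /* C' for the first block */
    size_t i;

    for (i = 0; i < n; i++)
        prev = c[i] = toy_e(key, p[i] ^ prev);  /* C = E(P XOR C') */
}

void cfb_encrypt(uint64_t key, uint64_t iv,
                 const uint64_t *p, uint64_t *c, size_t n)
{
    uint64_t prev = iv;
    size_t i;

    for (i = 0; i < n; i++)
        prev = c[i] = p[i] ^ toy_e(key, prev);  /* C = P XOR E(C') */
}

void ofb_encrypt(uint64_t key, uint64_t iv,
                 const uint64_t *p, uint64_t *c, size_t n)
{
    uint64_t o = iv;                            /* O' for the first block */
    size_t i;

    for (i = 0; i < n; i++) {
        o = toy_e(key, o);                      /* O = E(O') */
        c[i] = p[i] ^ o;                        /* C = P XOR O */
    }
}
```
+
+ Note that `ofb_encrypt' never looks at the plaintext when
+ updating its state, so running it again over the ciphertext with
+ the same key and IV recovers the plaintext.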
+
+ The above definitions imply that only data which comes in
+ multiples of the block size can be encrypted. That is the usual
+ restriction. However, Catacomb implements all four modes so that
+ almost arbitrary sizes of plaintext can be encrypted (without
+ having to pad out the ciphertext). The details are complicated:
+ read the source, or look up `ciphertext stealing' in your copy
+ of Schneier's `Applied Cryptography'.
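+
+ For a flavour of the technique, here is a sketch of the textbook
+ ciphertext-stealing construction for ECB, in C. It is not taken
+ from Catacomb's source, and Catacomb's output layout may well
+ differ. The names are hypothetical and the block cipher is a toy
+ Feistel network with no security value. The last whole block is
+ encrypted, its tail is `stolen' to pad the short final block,
+ and the two are emitted in swapped order; decryption is the same
+ sequence of steps with the block operation reversed.
+
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLK = 8 };                       /* toy block size in bytes */

static uint32_t toy_f(uint32_t x, uint32_t k)
{
    x += k;
    x ^= (x << 7) | (x >> 25);
    return x * 0x9e3779b1u;
}

/* Toy Feistel cipher on one block, in place; NOT secure. */
static void toy_blk(const uint32_t k[4], unsigned char *b, int encrypt)
{
    uint32_t l, r, t;
    int i;

    memcpy(&l, b, 4);
    memcpy(&r, b + 4, 4);
    if (encrypt) {
        for (i = 0; i < 4; i++) { t = l ^ toy_f(r, k[i]); l = r; r = t; }
    } else {
        for (i = 3; i >= 0; i--) { t = r ^ toy_f(l, k[i]); r = l; l = t; }
    }
    memcpy(b, &l, 4);
    memcpy(b + 4, &r, 4);
}

/* ECB with ciphertext stealing.  The same steps serve for both
 * directions; only the block operation changes.  Needs sz >= BLK
 * and returns -1 otherwise. */
int ecb_cts(const uint32_t k[4], const unsigned char *s, unsigned char *d,
            size_t sz, int encrypt)
{
    size_t full = sz / BLK, tail = sz % BLK, i;
    unsigned char x[BLK], y[BLK];

    if (sz < BLK) return -1;
    if (tail) full--;                   /* last whole block is special */
    for (i = 0; i < full; i++, s += BLK, d += BLK) {
        memcpy(x, s, BLK);
        toy_blk(k, x, encrypt);
        memcpy(d, x, BLK);
    }
    if (tail) {
        memcpy(x, s, BLK);
        toy_blk(k, x, encrypt);         /* x = short output || stolen tail */
        memcpy(y, s + BLK, tail);       /* the short final input block... */
        memcpy(y + tail, x + tail, BLK - tail); /* ...padded from x */
        toy_blk(k, y, encrypt);
        memcpy(d, y, BLK);              /* the whole block goes out first */
        memcpy(d + BLK, x, tail);       /* then the truncated one */
    }
    return 0;
}
```
+
+ The output is exactly as long as the input, and an input shorter
+ than one block is rejected, which matches the `at least one
+ entire block' rule for ECB.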
+
+ ECB must have *at least* one entire block to work with, but
+ apart from that can cope with odd-size inputs. Both ECB and CBC
+ insert `boundaries' when you encrypt an odd-size input -- you
+ must decrypt in exactly the same-size chunks as you encrypted,
+ otherwise you'll only get rubbish out.
+
+ CFB and OFB have no restrictions on input sizes, and do not
+ normally insert boundaries, although it's possible to explicitly
+ request one.
+
+ Be especially careful with OFB mode. Since it generates an
+ output stream independent of the plaintext, and then XORs the
+ two, if you ever reuse the same key and IV pair, both encrypted
+ messages are compromised. (Take the two ciphertexts, and XOR
+ them together -- then the OFB stream cancels and you have the
+ plaintexts XORed. This is fairly trivial to unravel.)
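+
+ The cancellation is easy to see in code. In this C sketch the
+ names are hypothetical and `toy_e' is a throwaway stand-in for a
+ real block cipher; the point is only that `ofb_crypt' derives
+ its stream from the key and IV alone.
+
```c
#include <stddef.h>
#include <stdint.h>

static uint32_t toy_f(uint32_t x, uint32_t k)
{
    x += k;
    x ^= (x << 7) | (x >> 25);
    return x * 0x9e3779b1u;
}

/* Toy 64-bit block cipher (4-round Feistel); NOT secure. */
static uint64_t toy_e(uint64_t key, uint64_t x)
{
    uint32_t l = (uint32_t)(x >> 32), r = (uint32_t)x, t;
    int i;

    for (i = 0; i < 4; i++) {
        t = l ^ toy_f(r, (uint32_t)(key >> (16 * i)));
        l = r; r = t;
    }
    return (uint64_t)l << 32 | r;
}

/* OFB over bytes: O = E(O'), C = P XOR O.  The stream depends only
 * on the key and IV, so the same call also decrypts. */
void ofb_crypt(uint64_t key, uint64_t iv,
               const unsigned char *s, unsigned char *d, size_t sz)
{
    uint64_t o = iv;
    size_t i;

    for (i = 0; i < sz; i++) {
        if (i % 8 == 0) o = toy_e(key, o);      /* next output block */
        d[i] = s[i] ^ (unsigned char)(o >> (8 * (i % 8)));
    }
}

/* What an eavesdropper can compute from two ciphertexts alone. */
void xor_bytes(const unsigned char *a, const unsigned char *b,
               unsigned char *out, size_t sz)
{
    size_t i;

    for (i = 0; i < sz; i++) out[i] = a[i] ^ b[i];
}
```
+
+ Encrypt two messages with the same key and IV, pass the two
+ ciphertexts to `xor_bytes', and the result is identical to the
+ XOR of the two plaintexts: the key never enters into it.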
+
+ OFB mode makes a good random byte generator. See README.random
+ for details about random number generators in Catacomb.
+
+
+ The modes all have similar interfaces. CFB is probably the best
+ example, although CBC is more useful in practice. I'll take
+ Blowfish as my example cipher again.
+
+ You need to initialize a context block. For Blowfish in CFB
+ mode, this is called `blowfish_cfbctx'. You initialize it by
+ passing the context address, a key, the key length, and a
+ pointer to an IV (which must be BLOWFISH_BLKSZ bytes long) to
+ blowfish_cfbinit. If you pass a null pointer instead of an IV,
+ a zero IV is used. This is usually OK for CBC, but bad for OFB
+ or CFB unless you make sure that the key itself is only used
+ once.
+
+ Data is encrypted using blowfish_cfbencrypt and decrypted using
+ blowfish_cfbdecrypt -- both are given: the address of the
+ context, a pointer to the source data, a pointer to the
+ destination (which may overlap the source) and the size of the
+ data to encrypt or decrypt.
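+
+ To show the shape of such an interface, here is a sketch of a
+ CFB context in C. It is not Catacomb's implementation: the
+ `toy_' names are hypothetical, the cipher is a toy Feistel
+ network with no security value, and the way partial blocks are
+ tracked (an offset into the current output block) is an
+ assumption about one reasonable design. It does follow the
+ description above in taking a null IV pointer to mean a zero IV
+ and in accepting data in pieces of any size.
+
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TOY_BLKSZ = 8 };

typedef struct toy_cfbctx {
    uint32_t k[4];                      /* toy round keys */
    unsigned char iv[TOY_BLKSZ];        /* ciphertext feedback register */
    unsigned char o[TOY_BLKSZ];         /* E(previous ciphertext block) */
    size_t off;                         /* bytes of `o' already used */
} toy_cfbctx;

static uint32_t toy_f(uint32_t x, uint32_t k)
{
    x += k;
    x ^= (x << 7) | (x >> 25);
    return x * 0x9e3779b1u;
}

/* Toy block encryption; CFB never needs the decryption direction. */
static void toy_eblk(const uint32_t k[4],
                     const unsigned char *s, unsigned char *d)
{
    uint32_t l, r, t;
    int i;

    memcpy(&l, s, 4);
    memcpy(&r, s + 4, 4);
    for (i = 0; i < 4; i++) { t = l ^ toy_f(r, k[i]); l = r; r = t; }
    memcpy(d, &l, 4);
    memcpy(d + 4, &r, 4);
}

void toy_cfbinit(toy_cfbctx *ctx, const void *key, size_t ksz,
                 const void *iv)
{
    const unsigned char *p = key;
    size_t i;

    for (i = 0; i < 4; i++) ctx->k[i] = 0x243f6a88u + (uint32_t)i;
    for (i = 0; i < ksz; i++) {
        uint32_t *k = &ctx->k[i % 4];
        *k = ((*k << 8) | (*k >> 24)) ^ p[i];
    }
    if (iv) memcpy(ctx->iv, iv, TOY_BLKSZ);
    else memset(ctx->iv, 0, TOY_BLKSZ); /* null pointer means zero IV */
    ctx->off = TOY_BLKSZ;               /* force a fresh output block */
}

static void cfb_crypt(toy_cfbctx *ctx, const unsigned char *s,
                      unsigned char *d, size_t sz, int decrypt)
{
    size_t i;

    for (i = 0; i < sz; i++) {
        unsigned char c;

        if (ctx->off == TOY_BLKSZ) {    /* block finished: O = E(C') */
            toy_eblk(ctx->k, ctx->iv, ctx->o);
            ctx->off = 0;
        }
        if (decrypt) { c = s[i]; d[i] = c ^ ctx->o[ctx->off]; }
        else { c = s[i] ^ ctx->o[ctx->off]; d[i] = c; }
        ctx->iv[ctx->off++] = c;        /* feed the ciphertext back */
    }
}

void toy_cfbencrypt(toy_cfbctx *ctx, const void *s, void *d, size_t sz)
{
    cfb_crypt(ctx, s, d, sz, 0);
}

void toy_cfbdecrypt(toy_cfbctx *ctx, const void *s, void *d, size_t sz)
{
    cfb_crypt(ctx, s, d, sz, 1);
}
```
+
+ Because the context remembers how far through the current block
+ it is, encrypting 13 bytes in one call gives the same ciphertext
+ as encrypting 3 bytes and then 10. In this sketch the source and
+ destination may also be the same buffer.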
+
+ The IV may be changed by calling blowfish_cfbsetiv. The current
+ IV (really meaning the previous ciphertext) can be obtained with
+ blowfish_cfbgetiv. The key may be changed without altering the
+ IV using blowfish_cfbsetkey. A boundary may be inserted in the
+ ciphertext or plaintext using blowfish_cfbbdry.
+
+ ECB doesn't use IVs, so there aren't ecbsetiv or ecbgetiv
+ calls. You can't explicitly insert boundaries in ECB or CBC mode.
+
+ OFB encryption and decryption are the same, so there's no
+ separate ofbdecrypt call. However, ofbencrypt has some useful
+ tricks:
+
+ * If the destination pointer is null, it just churns the output
+ round for a while, without emitting any data.
+
+ * If the source pointer is null, it simply spits out the
+ output blocks from the feedback process. This is equivalent
+ to giving an input full of zero bytes.
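+
+ A sketch of how an OFB routine can support both tricks, in C.
+ The `toy_' names, the 64-bit integer key and the toy cipher are
+ all hypothetical; only the handling of the two null pointers is
+ meant to mirror the behaviour described above.
+
```c
#include <stddef.h>
#include <stdint.h>

typedef struct toy_ofbctx {
    uint64_t key;
    uint64_t o;                         /* current output block */
    unsigned off;                       /* bytes of `o' already used */
} toy_ofbctx;

static uint32_t toy_f(uint32_t x, uint32_t k)
{
    x += k;
    x ^= (x << 7) | (x >> 25);
    return x * 0x9e3779b1u;
}

/* Toy 64-bit block cipher (4-round Feistel); NOT secure. */
static uint64_t toy_e(uint64_t key, uint64_t x)
{
    uint32_t l = (uint32_t)(x >> 32), r = (uint32_t)x, t;
    int i;

    for (i = 0; i < 4; i++) {
        t = l ^ toy_f(r, (uint32_t)(key >> (16 * i)));
        l = r; r = t;
    }
    return (uint64_t)l << 32 | r;
}

void toy_ofbinit(toy_ofbctx *ctx, uint64_t key, uint64_t iv)
{
    ctx->key = key;
    ctx->o = iv;
    ctx->off = 8;                       /* force a fresh output block */
}

void toy_ofbencrypt(toy_ofbctx *ctx, const void *src, void *dest,
                    size_t sz)
{
    const unsigned char *s = src;
    unsigned char *d = dest;
    size_t i;

    for (i = 0; i < sz; i++) {
        unsigned char o;

        if (ctx->off == 8) {            /* O = E(O') */
            ctx->o = toy_e(ctx->key, ctx->o);
            ctx->off = 0;
        }
        o = (unsigned char)(ctx->o >> (8 * ctx->off++));
        if (!d) continue;               /* no destination: just churn */
        d[i] = s ? (unsigned char)(s[i] ^ o) : o; /* no source: raw output */
    }
}
```
+
+ Calling `toy_ofbencrypt(&ctx, NULL, NULL, n)' discards n bytes
+ of output, and `toy_ofbencrypt(&ctx, NULL, buf, n)' fills `buf'
+ with the next n bytes of it.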
+
+
+Implementing new modes: nasty macros
+
+ Block cipher modes are implemented as macros which define the
+ appropriate functions. They're given the prefixes (upper- and
+ lowercase) and expected to get on with life.
+
+ Data can be shunted around fairly efficiently using the BLKC
+ macros. These are fairly ugly, so don't try to work out how
+ they work.
+
+ In the following notation, `b' denotes a pointer to bytes, and
+ `w' and `wx' denote pointers to words. `PRE' is the uppercase
+ cipher prefix. I'll abuse this notation a little and use the
+ names to refer to the entire arrays, since their lengths are
+ known to be PRE_BLKSZ (in bytes) or PRE_BLKSZ / 4 (in words)
+ long.
+
+   BLKC_STORE(PRE, b, w)         Set b = w
+   BLKC_XSTORE(PRE, b, w, wx)    Set b = w XOR wx
+   BLKC_LOAD(PRE, w, b)          Set w = b
+   BLKC_XLOAD(PRE, w, b)         Set w = w XOR b
+   BLKC_MOVE(PRE, w, wx)         Set w = wx
+   BLKC_XMOVE(PRE, w, wx)        Set w = w XOR wx
+
+ These should be enough for most purposes. More can be added,
+ but doing so takes a strong stomach and an ability to do things
+ with C macros which most people wouldn't like to think about
+ over dinner.
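+
+ For readers who would rather see the semantics than the macros,
+ here are plain-function equivalents of the six operations in C,
+ fixed to one case: a 64-bit, big-endian block. The `blk_'
+ function names are hypothetical, and this says nothing about how
+ the real macros are written.
+
```c
#include <stdint.h>

enum { TOY_BLKSZ = 8, TOY_NW = TOY_BLKSZ / 4 };

static uint32_t load32(const unsigned char *b)
{
    return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
           (uint32_t)b[2] << 8 | (uint32_t)b[3];
}

static void store32(unsigned char *b, uint32_t w)
{
    b[0] = (unsigned char)(w >> 24); b[1] = (unsigned char)(w >> 16);
    b[2] = (unsigned char)(w >> 8);  b[3] = (unsigned char)w;
}

/* b = w */
void blk_store(unsigned char *b, const uint32_t *w)
{
    int i;
    for (i = 0; i < TOY_NW; i++) store32(b + 4 * i, w[i]);
}

/* b = w XOR wx */
void blk_xstore(unsigned char *b, const uint32_t *w, const uint32_t *wx)
{
    int i;
    for (i = 0; i < TOY_NW; i++) store32(b + 4 * i, w[i] ^ wx[i]);
}

/* w = b */
void blk_load(uint32_t *w, const unsigned char *b)
{
    int i;
    for (i = 0; i < TOY_NW; i++) w[i] = load32(b + 4 * i);
}

/* w = w XOR b */
void blk_xload(uint32_t *w, const unsigned char *b)
{
    int i;
    for (i = 0; i < TOY_NW; i++) w[i] ^= load32(b + 4 * i);
}

/* w = wx */
void blk_move(uint32_t *w, const uint32_t *wx)
{
    int i;
    for (i = 0; i < TOY_NW; i++) w[i] = wx[i];
}

/* w = w XOR wx */
void blk_xmove(uint32_t *w, const uint32_t *wx)
{
    int i;
    for (i = 0; i < TOY_NW; i++) w[i] ^= wx[i];
}
```
+
+ A CBC encryption step, for instance, is an XLOAD of the
+ plaintext into the word array holding the previous ciphertext, a
+ block encryption, and a STORE of the result.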
+
+
+Other ciphers
+
+ There's only one stream cipher implemented at the moment, and
+ that's RC4. It was designed by Ron Rivest. It's the fastest
+ cipher in Catacomb. It looks fairly strong (although see the
+ note about churning the context after keying below). Note also
+ that it works like output feedback -- you just XOR the output
+ from RC4 with the plaintext. Never reuse an RC4 key!
+
+ RC4 includes an OFB-like interface which should be familiar. It
+ also includes a pair of strange macros RC4_OPEN and RC4_BYTE.
+ These are used to actually get bytes out of the RC4 generator.
+
+ RC4_OPEN is really a new syntactic form. If `r' is a pointer to
+ an RC4 context, then
+
+     RC4_OPEN(r, <statements>);
+
+ executes <statements> within the opened RC4 context. The
+ significance of this is that the expression RC4_BYTE(x) extracts
+ the next byte from the innermost open context, and stores it in
+ x. The standard RC4 encrypt function is written in terms of
+ RC4_OPEN and RC4_BYTE.
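+
+ One way such a pair of macros could be built is sketched below
+ in C. The `TOY_RC4_' names and the context layout are
+ hypothetical, and Catacomb's real macros may be written quite
+ differently; the RC4 key setup and byte generation themselves
+ are the standard published algorithm. The `open' macro copies
+ the indices into local variables, and the `byte' macro refers to
+ those locals by name, which is why it always acts on the
+ innermost open context. The key must be at least one byte long,
+ and the statements passed in must not contain a comma outside
+ parentheses.
+
```c
#include <stddef.h>

typedef struct toy_rc4ctx {
    unsigned char s[256];               /* the permutation */
    unsigned i, j;                      /* the two indices */
} toy_rc4ctx;

/* Standard RC4 key setup; `sz' must be nonzero. */
void toy_rc4init(toy_rc4ctx *ctx, const void *key, size_t sz)
{
    const unsigned char *k = key;
    unsigned i, j = 0;

    for (i = 0; i < 256; i++) ctx->s[i] = (unsigned char)i;
    for (i = 0; i < 256; i++) {
        unsigned char t = ctx->s[i];
        j = (j + t + k[i % sz]) & 0xff;
        ctx->s[i] = ctx->s[j]; ctx->s[j] = t;
    }
    ctx->i = ctx->j = 0;
}

/* Open a context: cache the state in locals, run `stmts', then
 * write the indices back. */
#define TOY_RC4_OPEN(r, stmts) do {                                 \
    toy_rc4ctx *rc4_r_ = (r);                                       \
    unsigned char *rc4_s_ = rc4_r_->s;                              \
    unsigned rc4_i_ = rc4_r_->i, rc4_j_ = rc4_r_->j;                \
    stmts                                                           \
    rc4_r_->i = rc4_i_; rc4_r_->j = rc4_j_;                         \
} while (0)

/* Store the next output byte in `x'.  Only valid inside the
 * statements of a TOY_RC4_OPEN. */
#define TOY_RC4_BYTE(x) do {                                        \
    unsigned char rc4_t_;                                           \
    rc4_i_ = (rc4_i_ + 1) & 0xff;                                   \
    rc4_t_ = rc4_s_[rc4_i_];                                        \
    rc4_j_ = (rc4_j_ + rc4_t_) & 0xff;                              \
    rc4_s_[rc4_i_] = rc4_s_[rc4_j_]; rc4_s_[rc4_j_] = rc4_t_;       \
    (x) = rc4_s_[(rc4_s_[rc4_i_] + rc4_t_) & 0xff];                 \
} while (0)

/* Encryption written in terms of the two macros. */
void toy_rc4encrypt(toy_rc4ctx *ctx, const void *src, void *dest,
                    size_t sz)
{
    const unsigned char *s = src;
    unsigned char *d = dest;

    TOY_RC4_OPEN(ctx,
        while (sz--) {
            unsigned char o;
            TOY_RC4_BYTE(o);
            *d++ = *s++ ^ o;
        }
    );
}
```
+
+ Keeping the indices in locals for the duration of the loop gives
+ the compiler a chance to hold them in registers, which is the
+ likely reason for wanting an `open' form at all.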
+
+ RC4 makes an excellent and fast random-byte generator.
+
+ RSA Data Security Inc. claim that RC4 is a trade secret of
+ theirs. It doesn't look very secret to me.
+
+
+Generic cipher interfaces
+
+ It can be convenient to implement routines where the cipher to
+ use is a parameter. Hence, Catacomb provides a generic
+ interface to (symmetric) ciphers. The generic interface is
+ defined in <catacomb/gcipher.h>.
+
+ The basic type in the interface is `gcipher', which represents
+ an `instance' of a cipher. You don't see lone cipher objects,
+ only pointers to them, so really everything's in terms of
+ `gcipher *'.
+
+ A `gcipher' is a structured type with one member, called `ops',
+ which points to a collection of functions and other useful
+ information. If `c' is a cipher...
+
+   c->ops->b->name               Name of the cipher being used
+
+   c->ops->b->keysz              Key size in bytes (or zero for
+                                 `don't care')
+
+   c->ops->b->blksz              Block size in bytes (or zero for
+                                 `not a block cipher')
+
+   c->ops->encrypt(c, s, t, sz)  Encrypt the sz bytes stored in
+                                 s, and store the ciphertext at
+                                 t.
+
+   c->ops->decrypt(c, s, t, sz)  Like encrypt, only it decrypts.
+
+   c->ops->destroy(c)            Destroys the cipher object `c'.
+
+   c->ops->setiv(c, iv)          Sets the IV to be `iv' -- must
+                                 be blksz bytes long.
+
+   c->ops->bdry(c)               Inserts a boundary.
+
+ Note that `setiv' and `bdry' aren't implemented by all ciphers
+ so these may be null pointers. It's best to check first.
+
+ Generic cipher instances are created from `generic cipher
+ classes' (type `gccipher' -- note the extra `c'). A class
+ contains two members:
+
+   b     The `class base' -- this is the object pointed to by
+         `c->ops->b', and contains `name', `keysz' and `blksz'
+         members.
+
+   init  The constructor. You give it a pointer to some key
+         data and the key size, and it returns a generic cipher
+         instance.
+
+ Note that new generic ciphers always have zero IVs (if they
+ understand the concept), so you may need to call setiv if you
+ want to reuse keys.
+
+ Always remember to destroy gcipher instances when you're
+ finished with them.
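+
+ The whole life cycle can be sketched in C as follows. The types
+ here are hypothetical reconstructions from the description
+ above, prefixed `toy_' to keep them apart from the real
+ declarations in <catacomb/gcipher.h>, which may differ in member
+ order and exact signatures. The `cipher' is a single-byte XOR,
+ chosen only because it is short; it supports neither IVs nor
+ boundaries, so both of those pointers are null.
+
```c
#include <stddef.h>
#include <stdlib.h>

typedef struct toy_gcipher toy_gcipher;

typedef struct toy_gcbase {
    const char *name;
    size_t keysz, blksz;                /* zero: `don't care' / `none' */
} toy_gcbase;

typedef struct toy_gcops {
    const toy_gcbase *b;
    void (*encrypt)(toy_gcipher *c, const void *s, void *t, size_t sz);
    void (*decrypt)(toy_gcipher *c, const void *s, void *t, size_t sz);
    void (*destroy)(toy_gcipher *c);
    void (*setiv)(toy_gcipher *c, const void *iv);  /* may be null */
    void (*bdry)(toy_gcipher *c);                   /* may be null */
} toy_gcops;

struct toy_gcipher { const toy_gcops *ops; };

typedef struct toy_gccipher {
    const toy_gcbase *b;
    toy_gcipher *(*init)(const void *k, size_t sz);
} toy_gccipher;

/* A trivial instance type: the generic part comes first, so a
 * pointer to it is also a pointer to the whole object. */
typedef struct xor_cipher { toy_gcipher c; unsigned char k; } xor_cipher;

static void xor_crypt(toy_gcipher *c, const void *s, void *t, size_t sz)
{
    xor_cipher *x = (xor_cipher *)c;
    const unsigned char *p = s;
    unsigned char *q = t;

    while (sz--) *q++ = *p++ ^ x->k;
}

static void xor_destroy(toy_gcipher *c) { free(c); }

static const toy_gcbase xor_base = { "toy-xor", 0, 0 };
static const toy_gcops xor_ops =
    { &xor_base, xor_crypt, xor_crypt, xor_destroy, 0, 0 };

static toy_gcipher *xor_init(const void *k, size_t sz)
{
    xor_cipher *x = malloc(sizeof(*x));

    if (!x) return 0;
    x->c.ops = &xor_ops;
    x->k = sz ? *(const unsigned char *)k : 0;
    return &x->c;
}

const toy_gccipher toy_xor = { &xor_base, xor_init };

/* Code written against the generic interface only. */
int encrypt_with(const toy_gccipher *cc, const void *k, size_t ksz,
                 const void *iv, const void *s, void *t, size_t sz)
{
    toy_gcipher *c = cc->init(k, ksz);

    if (!c) return -1;
    if (iv && c->ops->setiv) c->ops->setiv(c, iv);  /* optional op */
    c->ops->encrypt(c, s, t, sz);
    c->ops->destroy(c);                 /* always clean up */
    return 0;
}
```
+
+ `encrypt_with' never mentions the XOR cipher by name, so any
+ other class with the same layout could be passed in its place.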
+
+ The generic cipher class for CBC-mode Blowfish is called
+ `blowfish_cbc' -- the others are named similarly. The RC4
+ generic cipher class is called simply `rc4'.
+
+
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End: