--- /dev/null
+Catacomb's random number generators
+
+
+ Catacomb contains a large number of pseudorandom number
+ generators. Some are dedicated to generating pseudorandom
+ numbers; others are ciphers in output-feedback mode. Some are
+ cryptographically strong, others are fast or simple. All have
+ been statistically tested and come out OK.
+
+
+The `rand' generator
+
+ Cryptographic applications need a source of true random data.
+ The `rand' generator is Catacomb's true-random-data source.
+
+ The design is my own. I wasn't happy with any existing designs,
+ and I'm still not -- even Counterpane's Yarrow gives me an
+ uneasy feeling[1]. I'm not aware of any cryptanalytic attack
+ against my generator, and I'd very much like to hear if anyone
+ else knows differently.
+
+ The generator's state is held in an object of type `rand_pool'.
+ Data can be added into the pool using the `rand_add' function.
+ Random numbers can be extracted by calling `rand_get'.
+
+ It's possible to attach a `noise acquisition' function to a
+ rand_pool. The function `rand_getgood' extracts a number of
+ bytes from the generator, calling the noise acquisition function
+ if it thinks it's low on randomness. This shouldn't actually be
+ necessary once the generator has been topped up once. It makes
+ the generator very slow.
+
+ It's also possible to key the generator. The output
+ transformation then depends on the key, making it even harder
+ for an adversary to work out anything useful about the
+ generator.
+
+ The underlying transformation used by the generator is
+
+ x' = E_{H(x)}(x)
+
+ i.e., hash, use the hash as a key, and encrypt.
+
+
+ [1] The Yarrow paper states that it offers 160 bits-worth of
+ security, and that an adversary can't distinguish its output
+ from real random bits. I have an attack which takes about
+ 2^64 64-bit outputs and tells the difference between
+ Yarrow-160 and real random data, by analysing the frequency
+ of adjacent 64-bit blocks being equal.
+
+
+Noncryptographic generators
+
+ Two pseudorandom-number generators are supplied which have no
+ cryptographic strength whatever. They are, respectively, a
+ traditional linear-congruential generator (lcrand) and a
+ Fibonacci generator (fibrand).
+
+ The linear-congruential generator has three (fixed) parameters,
+ a, c and p, and calculates
+
+ x[i] = (a x[i - 1] + c) mod p
+
+ The prime p is fairly large (2^32 - 5, in fact), and a and c are
+ carefully chosen. The generator has a period of p - 1. There
+ is one fixed point (i.e., it transforms to itself rather than
+ being in the cycle) which must be avoided.
+
+ The Fibonacci generator works by adding together old results:
+
+ x[i] = (x[i - 24] + x[i - 55]) mod 2^32
+
+ It has a very long period, and is the fastest pseudorandom
+ number generator in Catacomb.
+
+ These two generators are very useful when the cryptographic
+ strength of the output isn't important, for example when
+ choosing random numbers for primality testing.
+
+
+The Blum-Blum-Shub generator
+
+ This is by far the strongest and by even further the slowest
+ pseudorandom-number generator in Catacomb. Its output has been
+ proven to be indistinguishable from random data by *any*
+ polynomial-time test, assuming that factoring large numbers is
+ difficult.
+
+ Use the BBS generator when you're feeling really paranoid, and
+ you've got a few hours with nothing to do.
+
+
+Output-feedback ciphers
+
+ Ciphers in output-feedback mode simply emit a pseudorandom
+ stream. To construct a cipher, you XOR the stream with the
+ plaintext. It's even easier just to use the output as random
+ numbers.
+
+
+Generic interface
+
+ There's a fairly comprehensive generic interface for Catacomb's
+ various generators.
+
+ Each generator provides a function which returns a pointer to a
+ `generic pseudorandom-number generator', of type `grand'.
+
+ A `grand' is a structured type containing one member, `ops',
+ which is a pointer to various useful things. Let `r' be a
+ pointer to a generic pseudorandom-number generator; then,
+
+ r->ops->name The name of the generator.
+
+ r->ops->max One greater than the maximum output of
+ the `raw' function, or zero if it just
+ emits words.
+
+ r->ops->misc(r, op, ...)
+ Performs a miscellaneous operation. The
+ various standard `op' values are
+ described below.
+
+ r->ops->destroy(r) Destroy the generic generator.
+
+ r->ops->raw(r) Return a raw output from the generator.
+ For `lcrand', this is an integer in the
+ range [0, p); for other generators, it's
+ the same as `word'.
+
+ r->ops->byte(r) Return an octet. The result is
+ uniformly distributed over the integers
+ in the interval [0, 256).
+
+ r->ops->word(r) Returns a word. The result is uniformly
+ distributed over the integers in the
+ interval [0, 2^32).
+
+ r->ops->range(r, l) Returns an integer unformly distributed
+ in the interval [0, l), l < 2^32.
+
+ r->ops->fill(r, p, sz) Fill the sz bytes at address p with
+ uniformly distributed bytes (i.e.,
+ integers in the interval [0, 256).
+
+ All of these operations are supported by all generators. Some
+ may be inefficient, however, since they're synthesized using
+ operations more suited to the particular generator. For
+ example, `word' is not efficient on the linear-congruential
+ generator, but is for the Fibonacci generator.
+
+ The misc-ops are as follows:
+
+ misc(r, GRAND_CHECK, op) Return nonzero if `op' is a
+ recognized misc-op for this
+ generator.
+
+ misc(r, GRAND_SEEDINT, s) Seed the generator using the
+ integer s.
+
+ misc(r, GRAND_SEEDUINT32, s) Seed the generator using the
+ 32-bit word s.
+
+ misc(r, GRAND_SEEDBLOCK, p, sz) Seed the generator using the
+ block of sz bytes starting at
+ address p.
+
+ misc(r, GRAND_SEEDMP, m) Seed the generator using the
+ multiprecision integer m.
+
+ misc(r, GRAND_SEEDRAND, r') Seed the generator using some
+ output from the generic pseudo-
+ random number generator r'.
+
+
+Statistical testing
+
+ I generated 10M of data with each of Catacomb's random number
+ generators, and tested them with George Marsaglia's DIEHARD
+ suite. While the random data is not included here, the
+ following trivial Bourne shell script will generate the exact
+ output streams I used, assuming that the supplied `rspit'
+ program is in the current PATH:
+
+ for i in des 3des idea blowfish rc5 rc4; do
+ rspit $i -kseed -z10m -o$i.rand
+ done
+ for i in lcrand fibrand; do
+ rspit $i -p -s0 -z10m -o$i.rand
+ done
+ rspit rand -p -z10m -orand.rand
+ nice rspit bbs -p -s2 -z10m -obbs.rand
+
+ (Note that generating the BBS output data can take over an
+ hour. Make sure you have a good book to read.)
+
+ Looking at the results, I think that they all passed the tests
+ relatively convincingly.
+
+ I've not actually included the DIEHARD output files in the
+ distribution, because they're quite large, and anyone can
+ reproduce my results exactly using the publically available
+ DIEHARD distribution and the code supplied. If you do actually
+ want them, send me some mail and I'll send you a 60K tar.gz
+ file by (approximate) return.
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End: