Catacomb's random number generators
Catacomb contains a large number of pseudorandom number
generators. Some are dedicated to generating pseudorandom
numbers; others are ciphers in output-feedback mode. Some are
cryptographically strong, others are fast or simple. All have
been statistically tested and come out OK.
The `rand' generator
Cryptographic applications need a source of true random data.
The `rand' generator is Catacomb's true-random-data source.
The design is my own. I wasn't happy with any existing designs,
and I'm still not -- even Counterpane's Yarrow gives me an
uneasy feeling[1]. I'm not aware of any cryptanalytic attack
against my generator, and I'd very much like to hear if anyone
else knows differently.
The generator's state is held in an object of type `rand_pool'.
Data can be added into the pool using the `rand_add' function.
Random numbers can be extracted by calling `rand_get'.
It's possible to attach a `noise acquisition' function to a
rand_pool. The function `rand_getgood' extracts a number of
bytes from the generator, calling the noise acquisition function
if it thinks it's low on randomness. This shouldn't actually be
necessary once the generator has been topped up once. It makes
the generator very slow.
It's also possible to key the generator. The output
transformation then depends on the key, making it even harder
for an adversary to work out anything useful about the
generator.
The underlying transformation used by the generator is
x' = E_{H(x)}(x)
i.e., hash, use the hash as a key, and encrypt.
[1] The Yarrow paper states that it offers 160 bits-worth of
security, and that an adversary can't distinguish its output
from real random bits. I have an attack which takes about
2^64 64-bit outputs and tells the difference between
Yarrow-160 and real random data, by analysing the frequency
of adjacent 64-bit blocks being equal.
Non-cryptographic generators
Two pseudorandom-number generators are supplied which have no
cryptographic strength whatever. They are, respectively, a
traditional linear-congruential generator (lcrand) and a
Fibonacci generator (fibrand).
The linear-congruential generator has three (fixed) parameters,
a, c and p, and calculates
x[i] = (a x[i - 1] + c) mod p
The prime p is fairly large (2^32 - 5, in fact), and a and c are
carefully chosen. The generator has a period of p - 1. There
is one fixed point (i.e., it transforms to itself rather than
being in the cycle) which must be avoided.
The Fibonacci generator works by adding together old results:
x[i] = (x[i - 24] + x[i - 55]) mod 2^32
It has a very long period, and is the fastest pseudorandom
number generator in Catacomb.
These two generators are very useful when the cryptographic
strength of the output isn't important, for example when
choosing random numbers for primality testing.
The Blum-Blum-Shub generator
This is by far the strongest and by even further the slowest
pseudorandom-number generator in Catacomb. Its output has been
proven to be indistinguishable from random data by *any*
polynomial-time test, assuming that factoring large numbers is
difficult.
Use the BBS generator when you're feeling really paranoid, and
you've got a few hours with nothing to do.
Output-feedback ciphers
Ciphers in output-feedback mode simply emit a pseudorandom
stream. To construct a cipher, you XOR the stream with the
plaintext. It's even easier just to use the output as random
numbers.
Generic interface
There's a fairly comprehensive generic interface for Catacomb's
various generators.
Each generator provides a function which returns a pointer to a
`generic pseudorandom-number generator', of type `grand'.
A `grand' is a structured type containing one member, `ops',
which is a pointer to various useful things. Let `r' be a
pointer to a generic pseudorandom-number generator; then,
r->ops->name The name of the generator.
r->ops->max One greater than the maximum output of
the `raw' function, or zero if it just
emits words.
r->ops->misc(r, op, ...)
Performs a miscellaneous operation. The
various standard `op' values are
described below.
r->ops->destroy(r) Destroy the generic generator.
r->ops->raw(r) Return a raw output from the generator.
For `lcrand', this is an integer in the
range [0, p); for other generators, it's
the same as `word'.
r->ops->byte(r) Return an octet. The result is
uniformly distributed over the integers
in the interval [0, 256).
r->ops->word(r) Returns a word. The result is uniformly
distributed over the integers in the
interval [0, 2^32).
r->ops->range(r, l) Returns an integer uniformly distributed
in the interval [0, l), l < 2^32.
r->ops->fill(r, p, sz) Fill the sz bytes at address p with
uniformly distributed bytes (i.e.,
integers in the interval [0, 256).
All of these operations are supported by all generators. Some
may be inefficient, however, since they're synthesized using
operations more suited to the particular generator. For
example, `word' is not efficient on the linear-congruential
generator, but is for the Fibonacci generator.
The misc-ops are as follows:
misc(r, GRAND_CHECK, op) Return nonzero if `op' is a
recognized misc-op for this
generator.
misc(r, GRAND_SEEDINT, s) Seed the generator using the
integer s.
misc(r, GRAND_SEEDUINT32, s) Seed the generator using the
32-bit word s.
misc(r, GRAND_SEEDBLOCK, p, sz) Seed the generator using the
block of sz bytes starting at
address p.
misc(r, GRAND_SEEDMP, m) Seed the generator using the
multiprecision integer m.
misc(r, GRAND_SEEDRAND, r') Seed the generator using some
output from the generic pseudo-
random number generator r'.
Statistical testing
I generated 10M of data with each of Catacomb's random number
generators, and tested them with George Marsaglia's DIEHARD
suite. While the random data is not included here, the
following trivial Bourne shell script will generate the exact
output streams I used, assuming that the supplied `rspit'
program is in the current PATH:
for i in des 3des idea blowfish rc5 rc4; do
rspit $i -kseed -z10m -o$i.rand
done
for i in lcrand fibrand; do
rspit $i -p -s0 -z10m -o$i.rand
done
rspit rand -p -z10m -orand.rand
nice rspit bbs -p -s2 -z10m -obbs.rand
(Note that generating the BBS output data can take over an
hour. Make sure you have a good book to read.)
Looking at the results, I think that they all passed the tests
relatively convincingly.
I've not actually included the DIEHARD output files in the
distribution, because they're quite large, and anyone can
reproduce my results exactly using the publicly available
DIEHARD distribution and the code supplied. If you do actually
want them, send me some mail and I'll send you a 60K tar.gz
file by (approximate) return.
-- [mdw]
Local variables:
mode: text
End: