%%% -*-latex-*-
%%%
-%%% $Id: rand.tex,v 1.1 1999/09/03 08:41:13 mdw Exp $
+%%% $Id: rand.tex,v 1.3 1999/10/15 21:05:56 mdw Exp $
%%%
%%% Description of Catacomb's random number generator
%%%
%%%----- Revision history ---------------------------------------------------
%%%
%%% $Log: rand.tex,v $
+%%% Revision 1.3 1999/10/15 21:05:56 mdw
+%%% Add a little more explanatory text for the pool and buffer sizes.
+%%%
+%%% Revision 1.2 1999/10/12 21:00:34 mdw
+%%% Updated. Almost finished, in fact. ;-)
+%%%
%%% Revision 1.1 1999/09/03 08:41:13 mdw
%%% Initial import.
%%%
\let\assign\leftarrow
\let\xor\oplus
\let\bigxor\bigoplus
+\def\cat{\mathbin{\|}}
\title{The Catacomb random number generator}
\author{Mark Wooding, \myemail}
\section{The one-way transformation}
+\label{sec:oneway}
The most novel part of the generator\footnote{I believe this construction to
be novel. If I'm wrong, let me know.} is the one-way transformation which is
knowledge of $B$, or performing one of the obvious exhaustive searches.
-\section{Description of the generator}
+\section{Abstract description of the generator}
The generator is divided into two parts: an \emph{input pool} which
accumulates random input data from the environment, and an \emph{output
buffer} which contains data to be passed to clients of the generator on
request.
+\subsection{The input pool and mixing function}
+
New information is contributed to the generator by mixing it with the input
pool, using a mixing function derived from the Linux random number generator
\cite{linux:devrandom}. The mixing function views the input pool as eight
parallel shift registers. Input data is added one octet at a time. Each bit
of an input octet is mixed with a different shift register.
-Formally, let $I$ be the input pool, with size $n_I$ bytes; let $P(x) = a_0 +
-a_1 x + a_2 x^2 + \cdots + a_{n_I} x^{n_I}$ be a primitive polynomial in
-$\mathrm{GF}(2^{n_I})$ with degree $n_I$; let $i$ be an integer such that $0
-\le i < n_I$, and $r$ be an integer such that $0 \le r < 8$; and let $x$ be
+Formally, let $I$ be the input pool, with size $N_I$ bytes; let $P(x) = a_0 +
+a_1 x + a_2 x^2 + \cdots + a_{N_I} x^{N_I}$ be a primitive polynomial in
+$\mathrm{GF}(2^{N_I})$ with degree $N_I$; let $i$ be an integer such that $0
+\le i < N_I$, and $r$ be an integer such that $0 \le r < 8$; and let $x$ be
an input byte. The result of mixing $x$ with the pool $I$ is calculated as
follows:
\begin{eqlines*}
I'[8j + b] =
\begin{cases}
x\bigl[(r + b) \bmod 8\bigr] \xor
- \bigxor_{0 \le k < n_I}
- a_k I\bigl[8\bigl((j + k) \bmod n_I\bigr) + b\bigr] & if $i = j$ \\
+ \bigxor_{0 \le k < N_I}
+ a_k I\bigl[8\bigl((j + k) \bmod N_I\bigr) + b\bigr] & if $i = j$ \\
I[j + b] & otherwise
\end{cases} \\
- \textrm{for all integers $j$ and $b$ where $0 \le j < n_I$ and
+ \textrm{for all integers $j$ and $b$ where $0 \le j < N_I$ and
$0 \le b < 8$}
\end{spliteqn*}
\\
I \assign I' \qquad
- i \assign (i + 1) \bmod n_I \qquad
+ i \assign (i + 1) \bmod N_I \qquad
r \assign (r + 5) \bmod 8
\end{eqlines*}
Initially, $i$ and $r$ are both zero. The use of 8-bit bytes above is
-arbitrary.
+arbitrary but convenient for modern computers.
+
+The mixing function isn't intended to be cryptographically strong. Its
+purpose is just to hold data without letting too much of the randomness get
+away.
+
+\subsection{The output buffer}
Newly added data doesn't affect the output buffer until a `gating' operation
is performed. This uses the one-way transformation described earlier over
the entire generator state.
Data requested by clients of the generator is read from the output buffer
-$O$. Initially the buffer contains zeroes.
+$O$. Initially the buffer contains zeroes. The output buffer is large
+enough for $N_O$ bits.
+
+When the generator estimates that there's enough entropy in the input pool, a
+\emph{gating} operation is performed, using the one-way function described in
+section~\ref{sec:oneway}. Hash both the input pool and output buffer, and
+then encrypt both using the hash as the key:
+\begin{eqlines*}
+ h = H(I \cat O) \\
+ I \assign E_h(I) \qquad O \assign E_h(O)
+\end{eqlines*}
+If the output buffer is exhausted before the next gating operation, it is
+\emph{stretched} using the one-way function: $O \assign E_{H(O)}(O)$.
+
+\subsection{Other tweaks}
+
+The first $N_S$ bits of this buffer take part in the output transformation
+but are never actually output. They're there to make predicting further
+output from the generator difficult.
+
+Also, optionally, the one-way functions can be keyed. This does, of course,
+beg the question as to where the key comes from. This might be one of those
+things best done the old-fashioned way with a bunch of coins or dice or
+something.
+
+
+\section{The actual implementation}
+
+The Catacomb implementation of the generator uses the following parameters:
+\begin{itemize}
+\item The hash function used in the one-way transformation is RIPEMD-160
+ \cite{rmd160}; the block cipher is Blowfish, using a 160-bit key.
+\item The input pool size $N_I$ is 128 bytes. The output buffer size $N_O$
+ is 512 bytes. The size $N_S$ of the secret part of the output buffer
+ is 160 bits (20 bytes).
+\item The polynomial $P(x)$ used for mixing in new input is
+ $1 + x + x^2 + x^7 + x^{128}$.
+\end{itemize}
+The hash and block cipher are well-known and respected cryptographic
+primitives.
+
+The input pool is rater larger than it strictly needs to be to contain
+`enough' entropy to bring the generator up to the strength of its
+cryptographic primitives. The pool is large to reduce the effect of
+asymptotic behaviour in the amount of entropy in the pool.
+
+The output buffer is large simply to improve performance: Blowfish has a
+heavy key schedule, so it pays to perform fewer rekeyings per byte of data.
+The precise size of 512 bytes was chosen empirically as being about where the
+performance improvement stops being linear with the buffer size on my
+machine.
\begin{thebibliography}{99}